reanthony-usgs / US_Noise_PSDs

3 stars 1 forks source link

Verify compatibility with LANL #14

Open aringler-usgs opened 5 years ago

aringler-usgs commented 5 years ago

Once we verify the phase calculation we will need @jkmacc-LANL to verify he is happy with the structure of compute_psd.py (currently awaiting a pull-request).

We can let @jkmacc-LANL when we are ready.

jkmacc-LANL commented 5 years ago

Hi all,

I'm here. Should I be watching this repo for a PR, but @aringler-usgs 's fork for development of any modification from the meeting?

Thanks!

-J

aringler-usgs commented 5 years ago

@jkmacc-LANL we will go off of @reanthony-usgs repository. He is going to check my changes before we send it your way. I included the phase calculations that Keith wanted. I will also add a re-request mechanism.

Hopefully, we can pass it your way next week.

reanthony-usgs commented 5 years ago

@jkmacc-LANL - I've verified that the PSDs from Adam's modified version of the code are what we would expect. Feel free to clone the repository now and check out compute_PSD.py to see if the structure will work with Helm and your other AWS tools.

jkmacc-LANL commented 5 years ago

Sorry for the delay; I was on travel this week.

I had a moment to look at the code. Thank you for sharing it. It makes sense to me, but I have a few questions:

  1. Is everything Keith would need for computing polarizations in there? He just needs the mean cross spectral densities for the hour? I thought he was taking means of the spectral matrix for each subwindow.
  2. Similar to the question above, is storing the PSDs as 10.*np.log10(np.abs(cxy)) also ok for his application?
  3. Current outputs are pickled arrays for each hour(?). Are you open to alternate file formats, as long as I can provide an environment.yaml and code snippets to retrieve the arrays?

In regards to cloud deployment, I see where all the main processing functions and parameters are, so those can be easily re-used, but the I/O won't look the same. My main question is about whether spectral means across subwindows will work for everyone. I've been using scipy.signal.spectrogram, with comparable parameters to yours, to get the subwindow spectra so we can calculate and store mean, median, min, max, etc. across them before they're freed. If Keith needs these subwindows, we'll need to get a python function from him I think.

Thanks again!

aringler-usgs commented 5 years ago

Not a problem.

1) Yes, we have updated the code to do cross-spectra.

https://github.com/reanthony-usgs/US_Noise_PSDs/blob/master/compute_psd.py#L137

You switch the channels. So chan1=BHZ chan2=BH1

2) I think this is the most natural unit. It puts the calculations between -180 and -60. If you want to go back to raw power you can do 10**(p/10). However, we could also keep them linear.

3) We are open to an alternate format. You could either make a pr or if you explain what you want we could make the changes.

We can save the raw spectra as well as the various statistics. We will want the raw values. Does that work? These are what we were saving as a pickle.

Let us know what you want us to change. Or how we can move forward.

jkmacc-LANL commented 5 years ago

Thanks @aringler-usgs . If hourly mean PSDs in dB for all channel pairs works for everyone, it works for me, too. I can start on the conversion, and I may pop up for questions. Just TA in Alaska for 2018?

Also, what was decided about how the data were ultimately to be distributed? Is it too early to worry about that?

Thanks!

aringler-usgs commented 5 years ago

I think we are a go. Why don't we see how the comparison/method works for TA 2018 and then we can figure out how we will address the distributing issue.

Let us know what you need on our end for getting it on amazon. Or if you want us to do anything particular to help you out with that. We can break functions up into modules or similar if that is helpful.

jkmacc-LANL commented 4 years ago

@aringler-usgs and @reanthony-usgs I just wanted to let you know that, after a short break due to some travel, I'm actively working on this again; sorry for the delay. I'm working through a number of implementation steps that deviate from the procedure here, simply due to the details of distributing the data access and computation. The parameters and approach is the same as yours, with one exception. mlab.csd takes 1D arrays, but the nearly identical scipy.spectral.csd operates down an axis of an N-D array. The latter is valuable, as it allows us to formulate a single FFT plan for all three channels of data, since they're identically treated. I've confirmed that both algorithms correct for the windowing function. Do you have any concern about using SciPy's function?

aringler-usgs commented 4 years ago

Hello @jkmacc-LANL not a problem. I think the next month is a bit hectic for everyone.

I don't see a problem with using scipy.spectra.csd

Is there anything we can help with to make the necessary changes easier for you?

aringler-usgs commented 4 years ago

BTW: @jkmacc-LANL I will be at AGU if you want to discuss anything further.

jkmacc-LANL commented 4 years ago

Thanks, @aringler-usgs . I'll not be at AGU, but I don't foresee any further issues. Your repo is all the blueprint I need. Thanks again.