orcasound / ambient-sound-analysis

This repository aims to hold code for a UW MSDS capstone project analyzing ambient sound in Orcasound hydrophone data.
MIT License

Setup a test case for ground truth validation of the underlying math #39

Closed vaibhavmehrotraml closed 5 months ago

ben-hendricks commented 7 months ago

Hi @scottveirs, could you identify a time period (probably a calendar day) and a node that would serve us well for a ground-truthing experiment? Ideally low system noise and few/no data gaps. Thx!

scottveirs commented 7 months ago

Hey @ben-hendricks, here are some possible answers:

Any strong preferences from the team? @vaibhavmehrotraml @zprice12 @ttan06 @veirs @valentina-s

scottveirs commented 7 months ago

More specific targets from searching the new Orcasound human reports page:

  1. port-townsend 03/22/23, 08:43:39 PM -- J pod calls audible under ferry noise.
  2. orcasound-lab 03/25/23, 09:57:40 AM -- I can really empathize with those whales when the boats are picked up by the hydrophones, wow is that loud!, Overlapping slow clicks, Orcas
  3. sunset-bay 12/27/23, 01:28:33 AM -- calls in ship noise, likely from NB bulker 11.8kt (MMSI 311000736), Repeated call., S01 calls, almost overlapping

If folks want the HLS segments converted to e.g. mp3 in 6-hour chunks, I have a script ready to generate them for a calendar day... Or @veirs or @valentina-s may have better scripts handy if shorter clips are desired...

ben-hendricks commented 7 months ago

@scottveirs : great. One last question ... @veirs provided me with a script to compile .wav files from your database (downloadWavs.py). The script requires a "hydro" parameter, specifying the data source. For orcasound-lab, this was "ORCASOUND_LAB". Are the identifiers for the other potential sources (port-townsend & sunset-bay) using the same rule (i.e. "PORT_TOWNSEND" and "SUNSET_BAY")? Will have a look tomorrow. Thx.

veirs commented 7 months ago

Hi Ben, check the top of the file I sent you. There you see the defs:

```python
"""Enum for orcasound hydrophones, including AWS S3 Bucket info"""

HPhoneTup = namedtuple("Hydrophone", "name bucket ref_folder save_bucket save_folder")

BUSH_POINT = HPhoneTup("bush_point", "streaming-orcasound-net", "rpi_bush_point", "acoustic-sandbox", "ambient-sound-analysis/bush_point")
ORCASOUND_LAB = HPhoneTup("orcasound_lab", "streaming-orcasound-net", "rpi_orcasound_lab", "acoustic-sandbox", "ambient-sound-analysis/orcasound_lab")
PORT_TOWNSEND = HPhoneTup("port_townsend", "streaming-orcasound-net", "rpi_port_townsend", "acoustic-sandbox", "ambient-sound-analysis/port_townsend")
SUNSET_BAY = HPhoneTup("sunset_bay", "streaming-orcasound-net", "rpi_sunset_bay", "acoustic-sandbox", "ambient-sound-analysis/sunset_bay")
SANDBOX = HPhoneTup("sandbox", "acoustic-sandbox", "ambient-sound-analysis", "acoustic-sandbox", "ambient-sound-analysis")
```


valentina-s commented 7 months ago

A few snapshots I saved during our discussion on different psd generation methods:

[Two screenshots, 2024-01-19: comparison of PSD generation methods]
ben-hendricks commented 7 months ago

Hi team,

I found the problem in wav_to_array: in essence, there is a misunderstanding of the "hop_length" parameter for librosa.stft(). The function uses hop_length to assign delta_t (the number of seconds per sample). In fact, however, hop_length is the number of samples between adjacent FFT frames, i.e. it defines the overlap between adjacent NFFT windows.

In our example, you would set NFFT to produce 1 Hz resolution, which requires an NFFT equivalent to 1 s of samples. Since the hop length is set to 60 s, the code literally jumps from one 1 s window to the next one 60 s later. hop_length should instead be set as a constant to NFFT/2 (50% overlap between windows).

The code then needs to implement an averaging scheme. This can be done either by a) computing a spectrogram matrix with NFFT set to produce 1 Hz frequency resolution, then averaging the matrix over a minute of time, or b) computing a spectrogram matrix with NFFT set to produce 60 s time resolution, then averaging the matrix over 1 Hz frequencies. My code does a) by cutting the wav file into 60 s increments, computing the spectrogram matrix of each snippet, then averaging the whole thing.
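A minimal sketch of option a), using hypothetical names (`wav`, `fs`) rather than the repo's actual wav_to_array:

```python
import numpy as np
import librosa

def psd_60s_average(wav: np.ndarray, fs: int) -> np.ndarray:
    """Sketch of option a): 1 Hz resolution PSD, averaged over one 60 s snippet."""
    nfft = int(fs)        # 1 s of samples -> 1 Hz frequency resolution
    hop = nfft // 2       # 50% overlap between adjacent windows
    stft = librosa.stft(wav[: 60 * fs], n_fft=nfft, hop_length=hop, window="hann")
    power = np.abs(stft) ** 2          # shape: (nfft // 2 + 1 freqs, ~119 frames)
    mean_power = power.mean(axis=1)    # average in linear space over the minute
    return 10 * np.log10(mean_power)   # relative dB; no calibration applied
```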

ben-hendricks commented 7 months ago

The above comment will likely fix the difference in smoothing. My suspicion for the offset is that you apply the function librosa.amplitude_to_db() to convert linear to dB units. This uses ref=np.max, and I am not sure what that does to the absolute levels. Moreover, librosa may be designed for sound in air, not water, which uses a different reference pressure to convert to dB.

I believe if you have a close look at these two aspects of the code, we may be able to converge on a solution quickly.

That being said, if the data is not calibrated, we can ignore constant offsets in our results, as the absolute value is meaningless anyway.
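A minimal sketch of the difference (hypothetical values, not the repo's code; power_to_db is used here since these are power values):

```python
import numpy as np
import librosa

mean_power = np.array([1e-4, 1e-6, 1e-8])  # hypothetical linear power values

# ref=np.max pins 0 dB to the loudest bin, so every file gets
# its own floating reference, which shows up as a constant offset.
db_rel_max = librosa.power_to_db(mean_power, ref=np.max)

# A fixed reference keeps files comparable with each other
# (still uncalibrated, i.e. offset from true dB re 1 uPa^2/Hz).
db_fixed = 10 * np.log10(mean_power / 1.0)
```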

Good luck!

scottveirs commented 7 months ago

Looking with Val at the scikit-maad repository, we found that they use numpy fft methods when computing Sound Pressure Levels, rather than librosa. Look around line 634 here -- https://github.com/scikit-maad/scikit-maad/blob/production/maad/spl/conversion_SPL.py
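For comparison, a minimal NumPy-only periodogram sketch with density scaling (assumed variable names; not scikit-maad's actual code):

```python
import numpy as np

def periodogram_db(segment: np.ndarray, fs: int) -> np.ndarray:
    """One Hann-windowed PSD estimate in dB (uncalibrated, one-sided)."""
    win = np.hanning(len(segment))
    spectrum = np.fft.rfft(segment * win)
    # Density scaling (per Hz), matching scipy.signal.periodogram conventions
    psd = (np.abs(spectrum) ** 2) / (fs * np.sum(win ** 2))
    psd[1:-1] *= 2  # fold negative frequencies into the one-sided spectrum
    return 10 * np.log10(psd)
```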

scottveirs commented 7 months ago

Also all -- see Val's latest query on the other test thread, but maybe best to combine issues and continue our (longer) discussion here?

scottveirs commented 7 months ago

Adding Ben's proposed validation scheme (from email thread), which I like a lot:

I suggest taking some more fundamental steps first within a ground-truthing scheme:

First, I would create PSDs with 1 Hz resolution. Binning them into 1/3 octave bands can be done later from this basic file (and is practically a conversion to SPLs anyway).

Then, we can decide on a suitable averaging interval and compare a single PSD for which no calibration has been applied. This will confirm the basic code that translates waveform audio into PSD. Finally, we should confirm that doing this for some extended stretch of time will produce the desired spectral pattern. Whether or not vessels or whales are in the data does not matter much at this point.

After that, I would feel confident to start computing metrics etc. from those, including 1/3 octave bands, ancient ambients …
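A minimal sketch of that first step with SciPy, whose welch() averages overlapping windows in linear space internally (hypothetical `wav` and `fs`):

```python
import numpy as np
from scipy.signal import welch

def minute_psd(wav: np.ndarray, fs: int):
    """60 s averaged PSD at 1 Hz resolution (Hann window, 50% overlap)."""
    freqs, psd = welch(
        wav[: 60 * fs],    # one minute of audio
        fs=fs,
        window="hann",
        nperseg=fs,        # 1 s segments -> 1 Hz bins
        noverlap=fs // 2,  # 50% overlap
        scaling="density",
    )
    return freqs, 10 * np.log10(psd)  # uncalibrated dB
```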

As an example, I downloaded the full day for Port Townsend and computed

a) a PSD (no calibration applied) using a 60s average at 1 Hz resolution for 2023-03-22 11:05:25

[Plot a): OLAB_PTTD_CSTM-01_sample]

b) a long term spectral average for which I computed 60s averages at 1Hz resolution and color-coded the intensity for the entire 2023-03-22

[Plot b): LTSA_OSND_PTTD-01_CSTM0001_2023-03-22]

c) a Spectral Probability Plot which counts the occurrences of sound levels per frequency bin for all 60s average PSDs for the entire 2023-03-22

[Plot c): SPD_OSND_PTTD-01_CSTM0001_2023-03]

b) and c) are nice ways to understand your data, but first we should all be able to compute an identical plot to a). All plots are attached.

The PSDs are computed with a Hanning Window and 50% NFFT overlap.

We can discuss playing around with integration time, but I suggest we at least start off with a 1 Hz resolution to gain the best understanding of the data.

scottveirs commented 7 months ago

@vaibhavmehrotraml @ttan06 @zprice12 -- Can you let us know in this thread when you've got some test files organized here (https://github.com/orcasound/ambient-sound-analysis/tree/test_psd/test_files)?

I'd like to see test WAV file(s) built from .ts segments using this repo's code base, rather than the code that Val shared with Ben. Later it might be a good exercise to see if that code also generates the same audio file.

Maybe the minute Ben chose is a good start (Port Townsend, HLS data, 1 minute starting at 2023-03-22 11:05:25), but I'm curious to know if you'd recommend other test candidates, too.

ben-hendricks commented 7 months ago

Just adding to the thread here: I will work today on uploading the basic SSA (SoundSpace Analytics) code that produces a), b), and c) above. Besides that, I am on standby until you decide how to fix your wav_to_array function (or replace it with one of @veirs' older scripts). Let me know if you need more guidance for that process.

zprice12 commented 7 months ago

a)

[Screenshot: plot a), 2024-01-24]

b)

[Screenshot: plot b), 2024-01-24]

@scottveirs @ben-hendricks We tried implementing your suggestions in the pipeline. Both graphs above are the same comparison we did in our meeting between Ben's code and this repo's code: a) uses the original code, and b) uses the original code except that hop_length is set to nfft/2. We're attempting to implement Ben's solution of computing a spectrogram matrix with NFFT set to produce 1 Hz frequency resolution, then averaging the matrix over a minute of time. However, we're a bit confused about how best to incorporate this into the existing code. We looked into some of the parameters of librosa's STFT, such as window length, but that doesn't seem to be what we need. Could you elaborate further on how we could implement an averaging scheme in the existing code?

ben-hendricks commented 7 months ago

Hi @zprice12, OK, the first step is fixed. The result should not look much different (yet) because you are still looking at only a short-time PSD snippet of length NFFT, which is currently set to 1 second (= int(samplerate)) to produce 1 Hz resolution. Except in b) your resulting matrix should be much bigger, because you are hopping 2x per second, not 1x per minute. I assume the result you have at hand is a pandas DataFrame with the timestamp as either a column or the index. You should be able to use the pandas resample() method on that DataFrame to create averages of any desired length: https://towardsdatascience.com/resample-function-of-pandas-79b17ec82a78

Beware that the averaging should be done in linear space, before the code applies a log transformation. So you either move the amplitude_to_db() function further down the pipe, or you reverse the operation, average, then re-apply it.
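A minimal sketch of that resampling step, assuming a hypothetical DataFrame layout (one row per STFT frame on a DatetimeIndex, one column per 1 Hz frequency bin, values in uncalibrated dB):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the pipeline's output: 2 minutes of
# frames at 2 frames/s, three example frequency bins.
idx = pd.date_range("2023-03-22 11:05:25", periods=240, freq="500ms")
psd_db = pd.DataFrame(np.random.uniform(-80, -40, size=(240, 3)),
                      index=idx, columns=[100, 101, 102])

linear = 10 ** (psd_db / 10)                # undo the log transform
minute_avg = linear.resample("60s").mean()  # average in linear space
minute_db = 10 * np.log10(minute_avg)       # re-apply the log transform
```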

@scottveirs I am not 100% sure how you would like to have the responsibilities split here. Is this something the students should code, or do you want me to rewrite wav_to_array() so the students can work with a physically correct product?

scottveirs commented 7 months ago

> @scottveirs I am not 100% sure how you would like to have the responsibilities split here. Is this something the students should code, or do you want me to rewrite wav_to_array() so the students can work with a physically correct product?

Hey @ben-hendricks -- My preference is for you and Val (and other mentors) to comment here and let the student team understand and adjust the open code in their repository accordingly. I hope soon we will all come to believe it is generating physically-correct products -- i.e. generating equivalent PSD arrays to yours and Val's from the same input data...

In addition to comments and code snippets shared here, you and Val could both optionally share parts of your own larger code bases (e.g. via personal GitHub repos), especially if they are licensed in ways that the students can leverage (i.e. add to their open MIT-licensed code here in this repo). For example, Kaitlin Palmer is open sourcing some of her efforts to generate Merchant-style (originally MATLAB) underwater SPL metrics and visualizations using R here -- https://github.com/JPalmerK/YAWN_functions -- and I hope we can invite her soon to give the students a tour of her progress to date!

@JPalmerK -- feel free to comment on your open source experience so far, or otherwise provide additional context!

scottveirs commented 7 months ago

> Both graphs above are the same comparison we did in our meeting between Ben's code and this repo's code.

@zprice12 Can you point to, or share a link to, the "test" file generated by your methods in this directory, so @ben-hendricks and @veirs can be sure they are starting from the same audio data?

I think @valentina-s intended for such test audio data to be shared in this branch and location? https://github.com/orcasound/ambient-sound-analysis/tree/test_psd/test_files

zprice12 commented 7 months ago

> Hi @zprice12, OK, the first step is fixed. […] You should be able to use the pandas resample() method on that DataFrame to create averages of any desired length.

Thanks for the help @ben-hendricks. We will do our best to incorporate this method successfully into the existing code prior to the meeting on Friday so we can continue with this validation.

zprice12 commented 7 months ago

> @zprice12 Can you point to, or share a link to, the "test" file generated by your methods in this directory […]

Hey @scottveirs, we don't currently have test wav files to upload to the repo because of how the code is set up right now. Our understanding is that the generate_psds function in the pipeline file calls the get_next_clip function from https://github.com/orcasound/orca-hls-utils/blob/master/src/orca_hls_utils/HLSStream.py. That function takes the .ts files for the relevant time period and converts them into temporary wav files in 10-minute batches, which are then processed into PSDs and concatenated together. We're not 100% sure that this is what is occurring, so please let us know if we're misunderstanding. Overall, this makes it difficult for us to produce and upload wav files of a chosen length. We could probably adjust the code rather easily to concatenate the wav files into longer clips, however.
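If that adjustment is useful, a minimal sketch of concatenating the temporary batches into one shared test file (assuming the soundfile package; hypothetical paths):

```python
import numpy as np
import soundfile as sf

# Hypothetical paths to the 10-minute temporary wav batches
batch_paths = ["batch_00.wav", "batch_01.wav", "batch_02.wav"]

chunks, samplerate = [], None
for path in batch_paths:
    data, sr = sf.read(path)
    assert samplerate in (None, sr), "sample rates must match"
    samplerate = sr
    chunks.append(data)

# One continuous wav for the shared test_files directory
sf.write("test_files/port_townsend_2023-03-22.wav",
         np.concatenate(chunks), samplerate)
```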

Also, to produce the data used for this test, I ran:

```python
pipeline = NoiseAnalysisPipeline(Hydrophone.PORT_TOWNSEND, pqt_folder='pqt', delta_f=1, bands=None, delta_t=60)
psd_path, broadband_path = pipeline.generate_parquet_file(dt.datetime(2023, 3, 22, 11), dt.datetime(2023, 3, 22, 12), upload_to_s3=False)
```

zprice12 commented 6 months ago
[Screenshot: broadband plot, 2024-02-06]

@veirs @scottveirs Broadband levels for 2023-03-22, 10am-4pm, 1-minute averages. How's this looking? We're not sure the integration/averaging is completely correct, but we believe we're close. Seeing those routine ~20 dB changes due to ship passages.

zprice12 commented 5 months ago

Validation complete. Bands could still use a more thorough check, but they should work.