Pacific Sound 16kHz broke on 2024-08-24

pmhalvor commented 1 month ago

Since the 16kHz data are no longer being updated, we need to adjust the audio step to download from a different source. The best way to go about this is to build a new PTransform class that expects the same inputs and produces the same outputs, but reads the 256kHz data instead.

These new data are going to be a lot denser (16 times as many samples per second as the 16kHz data), so resampling may need to be done as well, to reduce storage space. The model only allows 10kHz anyway, so resampling earlier rather than later would be smart.
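To make the resampling cost concrete, here is a minimal sketch of the arithmetic, assuming nominal sample rates of exactly 256,000 Hz and 10,000 Hz (the real files may differ slightly). The helper names are hypothetical and not part of the pipeline.

```python
from fractions import Fraction


def resample_ratio(source_hz: int, target_hz: int) -> Fraction:
    """Return target/source as a reduced fraction: numerator = up, denominator = down."""
    return Fraction(target_hz, source_hz)


def size_reduction(source_hz: int, target_hz: int) -> float:
    """How many times fewer samples the resampled audio holds."""
    return source_hz / target_hz


# Assumed nominal rates: 256 kHz source, 10 kHz model input.
ratio = resample_ratio(256_000, 10_000)
up, down = ratio.numerator, ratio.denominator
print(up, down)                          # 5 128
print(size_reduction(256_000, 10_000))   # 25.6
```

Because 256kHz to 10kHz is not an integer factor, plain decimation cannot hit 10kHz exactly. A rational resampler such as `scipy.signal.resample_poly(x, up, down)` could be one option, given these factors.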

EOL of 16kHz

aws s3 ls --no-sign-request s3://pacific-sound-16khz/2024/08/

New 256kHz format

aws s3 ls --no-sign-request s3://pacific-sound-256khz-2024/09/

The latest data here are from the 19th of September. I want to reach out to the engineers working on this project to see if these data are going to be changed again (or potentially removed), or if there is a backfill currently taking place.
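The two listings above differ in where the year sits: in the path for the 16kHz bucket, and in the bucket name for the 256kHz bucket. A small sketch of a prefix builder that covers both, assuming other years and months follow the same pattern as the two commands shown (this has not been verified against the buckets). `monthly_prefix` is a hypothetical helper.

```python
def monthly_prefix(sample_rate_khz: int, year: int, month: int) -> str:
    """Build the S3 prefix for one month of audio.

    Layouts are inferred from the two `aws s3 ls` commands in this issue only.
    """
    if sample_rate_khz == 16:
        # Year is a path component: s3://pacific-sound-16khz/YYYY/MM/
        return f"s3://pacific-sound-16khz/{year:04d}/{month:02d}/"
    if sample_rate_khz == 256:
        # Year is part of the bucket name: s3://pacific-sound-256khz-YYYY/MM/
        return f"s3://pacific-sound-256khz-{year:04d}/{month:02d}/"
    raise ValueError(f"unsupported sample rate: {sample_rate_khz} kHz")


print(monthly_prefix(16, 2024, 8))
print(monthly_prefix(256, 2024, 9))
```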

This issue will be marked as blocked until next steps have been confirmed.

pmhalvor commented 1 month ago

This MBARI notebook PacificSound256kHzTo2kHzDecimate shows how to decimate a signal recorded w/ a high sample rate to lower sample rates, following techniques proposed in R. Crochiere and L. Rabiner, "Optimum FIR Digital Filter Implementations for Decimation, Interpolation, and Narrow-Band Filtering", IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 23, no. 5, pp. 444-456, October 1975, doi: 10.1109/TASSP.1975.1162719.

"For large changes in sampling rate, however, it is generally more efficient to reduce the sampling rate with a series of decimation stages rather than making the entire rate reduction with one stage. In this way the sampling rate is reduced gradually resulting in much less severe filtering requirements on the low-pass filters at each stage. Bellanger et al. [5] and Nelson et al. [8] also have implemented sampling rate reductions using several decimation stages; however they restricted their results by only using factors of 2 at each stage."

pmhalvor commented 1 month ago

The hydrophone has been offline since it failed earlier this fall, but will be fixed by the end of next month. So no data will be available between September and November.

pmhalvor / whale-speech

Pacific Sound 16kHz broke on 2024-08-24 #27
