The implementation here focuses on binary detection of orca calls (which are in the audible range, hence fun to listen to and annotate :) ). We change the audio-preprocessing front-end to better match this task and fine-tune the fully connected layers and classification head of the AudioSet model, specifically a PyTorch port of the model/weights. The model generates local predictions on a fixed window of ~2.45 s. Sampling and aggregation strategies for more global detection at minute/hourly/daily time scales would be a welcome contribution (helpful for a real-time detection pipeline, or for processing 2-3 months of historical data from different hydrophone nodes).
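As a hedged illustration of what such an aggregation strategy could look like (this is not part of the repo), one simple approach is to smooth the per-window scores and report per-minute detection rates. The ~2.45 s window length is taken from the description above; every other name and value below is made up for the sketch:

```python
# Sketch only: aggregate per-window orca-call scores into per-minute detection rates.
# WINDOW_SEC comes from the ~2.45s model window described above; all other values
# are illustrative defaults, not settings from this repository.
import numpy as np

WINDOW_SEC = 2.45

def per_minute_detection_rates(window_scores, smooth=3, threshold=0.5, bucket_sec=60.0):
    """Moving-average smooth the scores, threshold them, and return the
    fraction of positive windows in each ~1 minute bucket."""
    scores = np.asarray(window_scores, dtype=float)
    kernel = np.ones(smooth) / smooth
    smoothed = np.convolve(scores, kernel, mode="same")  # suppress isolated spikes
    detections = smoothed >= threshold
    per_bucket = max(1, int(round(bucket_sec / WINDOW_SEC)))
    n_buckets = int(np.ceil(len(detections) / per_bucket))
    return [float(detections[i * per_bucket:(i + 1) * per_bucket].mean())
            for i in range(n_buckets)]
```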
- The model was bootstrapped with scraped open data from the WHOI Marine Mammal Database (see `src.scraper` and `notebooks/DataPreparation` for details)
- Labelled data in live conditions from Orcasound hydrophones has subsequently been added using the Pod.Cast tool, prioritizing labelling in an active-learning-like fashion after the initial bootstrap (see DataArchives for details on all datasets)
- The mel spectrogram generation is changed to better suit this task (for details on the choice of filterbank see `notebooks/DataPreparation`; the implementation is in `data_ml/src.params` and `data_ml/src.dataloader`)
- Given limited domain data, and the need for robustness to different acoustic conditions (hydrophone nodes, SNR, noise/disturbances) in live conditions, the baseline uses transfer learning
- Data augmentation in the style of SpecAug is also implemented, which acts as a helpful form of regularization (see the sketch after this list)
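For intuition, here is a minimal, self-contained sketch of the two front-end ideas above: a log-mel spectrogram and SpecAugment-style time/frequency masking. It is not the repository implementation (that lives in `data_ml/src.params` and `data_ml/src.dataloader`), and all parameter values are illustrative:

```python
# Illustrative only: parameters here are placeholders, not the project's settings.
import numpy as np
import librosa

def log_mel_spectrogram(wav_path, sr=16000, n_fft=1024, hop_length=256, n_mels=64):
    """Load audio and return a log-mel spectrogram of shape (n_mels, n_frames)."""
    y, _ = librosa.load(wav_path, sr=sr)
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)

def spec_augment(spec, n_freq_masks=2, n_time_masks=2, max_freq_width=8, max_time_width=20):
    """SpecAugment-style regularization: blank out random frequency bands and time spans."""
    spec = spec.copy()
    n_mels, n_frames = spec.shape
    for _ in range(n_freq_masks):
        w = np.random.randint(0, max_freq_width + 1)
        f0 = np.random.randint(0, max(1, n_mels - w))
        spec[f0:f0 + w, :] = spec.min()   # mask with the floor value of the log-spectrogram
    for _ in range(n_time_masks):
        w = np.random.randint(0, max_time_width + 1)
        t0 = np.random.randint(0, max(1, n_frames - w))
        spec[:, t0:t0 + w] = spec.min()
    return spec
```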
- `data_ml` (current directory)
  - `train.py`
  - `test.py`
  - `src` (module library)
  - `notebooks` (for evaluation, data preparation)
  - `tools`
  - `models`
  - `runs`
- `live_inference` (deploy trained model)
See documentation at DataArchives for details on how to access and read datasets in a standard form.
This is a convenience script to download and uncompress the latest combined training (Round 1, 2, 3, etc.) and test datasets.

```
python data_ml/tools/download_datasets.py <LOCATION> (--only_train/--only_test)
```
Pardon the brevity here, this is just a rough starting point that will evolve significantly! Some of the code is still pretty rough; however, `src.model` and `src.dataloader` are useful places to start.
Training converges quite fast (~5 minutes on a GPU). Train/validation TensorBoard logs and model checkpoints are saved to a directory under `runRootPath`.

```
python train.py -dataPath ../train_data -runRootPath ../runs/test --preTrainedModelPath ../models/pytorch_vggish.pth -model AudioSet_fc_all -lr 0.0005
```
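To monitor a run, TensorBoard can be pointed at the log directory, e.g. `tensorboard --logdir ../runs/test` for the example command above (the path here is assumed from that command; adjust it to your `runRootPath`).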
See the notebook `Evaluation.ipynb` (it might be pretty rough, but should give a general idea).
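For a rough idea of window-level evaluation (hypothetical arrays, not the notebook's actual code), standard scikit-learn metrics apply directly to the binary detection task:

```python
# Hypothetical example: y_true are 0/1 labels per ~2.45s window,
# y_score are the model's predicted probabilities for those windows.
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

y_true = np.array([0, 1, 1, 0, 1, 0, 0, 1])                     # placeholder labels
y_score = np.array([0.1, 0.8, 0.65, 0.3, 0.9, 0.2, 0.4, 0.7])   # placeholder scores

print("Average precision:", average_precision_score(y_true, y_score))
precision, recall, thresholds = precision_recall_curve(y_true, y_score)
```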
Download the test data to `<test-data-download-location>` and the trained model to `<model-download-location>`, then run:

```
python tools/prepare_test_and_model_data.py --test_path <test-data-download-dir> --model_path <model-download-dir>
python data_ml/test.py --test_path <test-data-download-location> --model_path <model-download-location>
```
1. [Windows] Get pyenv-win to manage Python versions: `git clone https://github.com/pyenv-win/pyenv-win.git %USERPROFILE%/.pyenv` and add `%USERPROFILE%\.pyenv\pyenv-win\bin` and `%USERPROFILE%\.pyenv\pyenv-win\shims` to your PATH.
2. [Mac] Get pyenv to manage Python versions: `brew update && brew install pyenv` and add the `pyenv init` command to your shell on startup.
3. [Common] Install and maintain the right Python version (3.6.8): run `pyenv --version` to check the installation, then `pyenv rehash`. From your home directory, install Python 3.6.8 with `pyenv install 3.6.8` (use `3.6.8-amd64` on Windows if relevant) and run `pyenv rehash` again. `cd` into `/PodCast` and set a local Python version with `pyenv local 3.6.8` (or `3.6.8-amd64`); this saves a `.python-version` file that tells pyenv what to use in this directory. Run `python --version` and check you're using the right one. (Feel free to skip steps 1, 2, 3 if you prefer to use your own Python setup and are familiar with all of this.)
4. Create a virtual environment with `python -m venv podcast-venv`; this creates a directory `podcast-venv` with the relevant files/scripts. Activate it with `source podcast-venv/bin/activate` and, when you're done, `deactivate`. On Windows, activate with `.\podcast-venv\Scripts\activate.bat` and run `.\podcast-venv\Scripts\deactivate.bat` when done.
5. `cd` into `/data_ml` and run `python -m pip install --upgrade pip && pip install -r requirements.txt`