orcasound / aifororcas-orcaml

Code for data preparation, training and evaluation of AI underlying Pod.Cast and OrcaHello projects.
MIT License

Add rationale for 2.45 second window to README #4

Open scottveirs opened 3 years ago

scottveirs commented 3 years ago

A good question was raised on a call today with Canadian open source collaborators (HALLO project, #ai4orcas-hallo in Orcasound Slack). Some of them have been experimenting with different window durations while developing a binary classifier for the SRKW, Bigg's, NRKW, and offshore ecotypes of killer whales in the NE Pacific (with habitat in the coastal environments of BC, Canada):

Why did Pod.Cast and OrcaHello elect to use a 2.45 second window?

It would be ideal to recall the rationale and add it to the README.MD file.

On the call, I said I thought it was due to the statistics of SRKW call duration, but I'm not seeing the 2.45 second (or 2450 millisecond) value in Orcasound's shared spreadsheet of SRKW.

scottveirs commented 3 years ago

Do you recall the (2019, Pod.Cast?) rationale @akashmjn ?

bnestor commented 2 months ago

Perhaps I can weigh in. In the call data I have annotated, the 66th percentile of call duration is about 1.7 seconds, so a window of roughly 2.5 seconds is probably enough to capture most calls in their entirety. Also, the model used here is a ResNet, which requires squishing the spectrogram down into a square image (usually 224 by 224). With the default NFFT window in the dataloader, 2.45 seconds is about the amount of audio that fits in that image at ~200 frequency bands.
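To make the frame arithmetic concrete, here is a rough sketch of how the number of spectrogram frames relates to window duration. The sample rate, FFT size, and hop length below are illustrative assumptions, not values taken from this repo's dataloader, and the helper names are hypothetical; substitute the real settings to check the actual numbers.

```python
def window_seconds(n_frames: int, hop_length: int, n_fft: int, sample_rate: int) -> float:
    """Audio duration covered by n_frames STFT frames, assuming no padding."""
    n_samples = (n_frames - 1) * hop_length + n_fft
    return n_samples / sample_rate


def hop_for_window(duration_s: float, n_frames: int, n_fft: int, sample_rate: int) -> float:
    """Hop length (in samples) needed so n_frames frames span duration_s seconds."""
    return (duration_s * sample_rate - n_fft) / (n_frames - 1)


# ASSUMED values for illustration only (not read from the orcaml dataloader).
SAMPLE_RATE = 20_000  # Hz
N_FFT = 512           # samples per FFT window
N_FRAMES = 224        # time axis of a 224 x 224 ResNet input

# Duration covered by 224 frames at an assumed hop of 218 samples.
print(window_seconds(N_FRAMES, 218, N_FFT, SAMPLE_RATE))

# Hop length that would make 224 frames span a 2.45 second window.
print(hop_for_window(2.45, N_FRAMES, N_FFT, SAMPLE_RATE))
```

Under these assumed settings, 224 frames cover a window close to 2.45 seconds, which is consistent with the idea that the window length follows from the image size and the spectrogram parameters rather than from call statistics alone.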