DrCoffey / DeepSqueak

DeepSqueak v3: Using Machine Vision to Accelerate Bioacoustics Research
BSD 3-Clause "New" or "Revised" License

Bird vocalizations #76

Closed: ayamarck closed this issue 4 years ago

ayamarck commented 4 years ago

Hi! I very much enjoyed reading the article about DeepSqueak. I'm studying the vocal communication of the White-spectacled Bulbul and thought it would be interesting to use the program. Are there clustering models suitable for bird vocalizations? And do you know whether the system can work on a 0-10 kHz frequency range instead of 0-100 kHz? Perhaps you know of someone who has tried it and has some tips or insights? Thank you for your time, Aya

DrCoffey commented 4 years ago

Hey, if you could send an audio clip (.wav) with a few calls I can give you better advice.

ayamarck commented 4 years ago

Hey, thank you so much for getting back to me. Here are some audio clips. They're not the best, but this is the kind of data that we have for now.

Thanks! Best, Aya

On Thu, 27 Feb 2020 at 0:25, DrCoffey <notifications@github.com> wrote:

Hey, if you could send an audio clip (.wav) with a few calls I can give you better advice.


ayamarck commented 4 years ago

Hey, did you receive the files? I also sent them via email.

Thanks!

ayamarck commented 4 years ago

Link to the files: https://drive.google.com/open?id=1woLkHyA00GVhcmr8JyA-wlpftqvyEpKZ

DrCoffey commented 4 years ago

Hey Aya,

I got the files, but things are a little crazy in Seattle at the moment. We are running on a skeleton crew amid the Covid-19 outbreak. I'll get to it when I can.

-Kevin

DrCoffey commented 4 years ago

Hey Aya,

Finally got around to looking at your files. I think it would be possible to train a network, but it was hard for me to pick out the beginning and end of individual calls, and I think the network might suffer from the same issue. In order to train a network you are going to need to do a few things. First, you need to import your example calls into the DeepSqueak format. I would use Raven to hand-box the calls and save a selection table. Different versions of Raven don't have consistent headers, so you can just modify your selection tables to match the headers in the attached example file. Just make sure you change the call times from mm:ss to plain seconds. Then use the Import from Raven function to load the calls from the selection table and the corresponding full audio file (you will need to change the display range in the tools to see things well).
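If editing the times by hand gets tedious, the mm:ss-to-seconds step could be scripted. Below is a minimal Python sketch, not part of DeepSqueak or Raven. The helper names `to_seconds` and `convert_table` are made up for illustration, and the column names in the usage note are assumptions based on common Raven headers, so check them against the attached example file.

```python
import csv
import io


def to_seconds(stamp: str) -> float:
    """Convert 'mm:ss(.fff)' or 'hh:mm:ss(.fff)' to seconds.

    A plain number such as '83.5' is returned unchanged as a float.
    """
    seconds = 0.0
    for part in stamp.strip().split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def convert_table(text: str, time_columns) -> str:
    """Rewrite the given time columns of a tab-delimited table as seconds.

    `text` is the full content of a selection table; `time_columns` lists
    the header names of the columns holding times (assumed, not verified
    against any particular Raven version).
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        for col in time_columns:
            row[col] = f"{to_seconds(row[col]):.3f}"
        writer.writerow(row)
    return out.getvalue()
```

To use it, read the selection table into a string, call something like `convert_table(text, ["Begin Time (s)", "End Time (s)"])`, and write the result back out. Headers that still differ from the example file would need to be renamed separately.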

You should use the full audio file and not the audio clips to generate the DeepSqueak file. During training DeepSqueak will want a lot of "no-call" background to improve negative rejection.

Then you can save the DeepSqueak formatted .mat file and move on to training: https://github.com/DrCoffey/DeepSqueak/wiki/training-detection-networks

I can help you with training once you modify the table and import the calls.

-Kevin

HumanVoice.Table.1.selections.txt