orcasound / aifororcas-orcaml

Code for data preparation, training and evaluation of AI underlying Pod.Cast and OrcaHello projects.
MIT License

annotations do not match audio #5

Open nkundiushuti opened 1 year ago

nkundiushuti commented 1 year ago

I listened to the audio files in TrainDataLatest_PodCastAllRounds_123567910.tar.gz (wav subfolder), visualized the annotations, and realized that they do not match. Take, for instance, 60012.wav and its first annotations:

podcast_round1 60012.wav 34.126 2.918 Dabob Bay, Seattle, Washington 1960-10-28 60012
podcast_round1 60012.wav 36.816 2.588 Dabob Bay, Seattle, Washington 1960-10-28 60012
podcast_round1 60012.wav 42.55 2.055 Dabob Bay, Seattle, Washington 1960-10-28 60012
podcast_round1 60012.wav 44.606 2.41 Dabob Bay, Seattle, Washington 1960-10-28 60012
podcast_round1 60012.wav 46.636 3.425 Dabob Bay, Seattle, Washington 1960-10-28 60012
podcast_round1 60012.wav 51.381 3.248 Dabob Bay, Seattle, Washington 1960-10-28 60012

You will see that the onsets and offsets do not exactly match the start of the vocalizations, and there are also vocalizations outside these time intervals. It looks a bit random tbh.
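For anyone who wants to reproduce the check, here is a minimal Python sketch that turns the rows quoted above into onset/offset intervals. It assumes the first four fields are the dataset name, the wav file name, the start time in seconds, and the duration in seconds (the second number is smaller than the first, so it is read as a duration rather than an end time). `Annotation` and `parse_row` are illustrative names, not part of the repository.

```python
import re
from typing import NamedTuple


class Annotation(NamedTuple):
    dataset: str
    wav_filename: str
    onset_s: float
    offset_s: float


# Assumed layout: dataset, wav file, start (s), duration (s), then free text
# (location, date, identifier) that this sketch ignores.
ROW = re.compile(r"^(\S+)\s+(\S+\.wav)\s+([\d.]+)\s+([\d.]+)\s")


def parse_row(line: str) -> Annotation:
    """Parse one annotation row into an (onset, offset) interval in seconds."""
    match = ROW.match(line)
    if match is None:
        raise ValueError(f"unrecognized annotation row: {line!r}")
    dataset, wav, start, duration = match.groups()
    onset = float(start)
    # Offset is derived under the assumption that the 4th field is a duration.
    return Annotation(dataset, wav, onset, onset + float(duration))


rows = [
    "podcast_round1 60012.wav 34.126 2.918 Dabob Bay, Seattle, Washington 1960-10-28 60012",
    "podcast_round1 60012.wav 36.816 2.588 Dabob Bay, Seattle, Washington 1960-10-28 60012",
    "podcast_round1 60012.wav 42.55 2.055 Dabob Bay, Seattle, Washington 1960-10-28 60012",
    "podcast_round1 60012.wav 44.606 2.41 Dabob Bay, Seattle, Washington 1960-10-28 60012",
    "podcast_round1 60012.wav 46.636 3.425 Dabob Bay, Seattle, Washington 1960-10-28 60012",
    "podcast_round1 60012.wav 51.381 3.248 Dabob Bay, Seattle, Washington 1960-10-28 60012",
]

annotations = [parse_row(row) for row in rows]
for a in annotations:
    print(f"{a.wav_filename}: {a.onset_s:.3f} s to {a.offset_s:.3f} s")
```

The printed intervals can then be overlaid on a spectrogram of 60012.wav to compare them against the audible vocalizations.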

bnestor commented 3 weeks ago

Hi, I have found that the annotations for other rounds are more accurate. I have discarded podcast round 1 altogether.
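Roughly, dropping round 1 can be sketched like this in Python. It assumes the first whitespace-separated field of each annotation row names the round, as in the `podcast_round1` rows quoted earlier in the thread. `drop_dataset` is an illustrative helper, and the `podcast_round2` row is a made-up placeholder, not real data.

```python
def drop_dataset(rows, dataset="podcast_round1"):
    """Return the rows whose first field is not the given dataset name."""
    # Slicing with [:1] keeps blank lines from raising an IndexError.
    return [row for row in rows if row.split(maxsplit=1)[:1] != [dataset]]


rows = [
    # Row quoted in the issue above.
    "podcast_round1 60012.wav 34.126 2.918 Dabob Bay, Seattle, Washington 1960-10-28 60012",
    # Hypothetical placeholder row for another round (not real data).
    "podcast_round2 example.wav 1.0 2.0 placeholder",
]

kept = drop_dataset(rows)
print(len(kept), "rows kept")
```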