audio-captioning / dcase-2020-baseline

Audio captioning baseline system for DCASE 2020 challenge.
http://dcase.community/challenge2020/task-automatic-audio-captioning

Question regarding pretrained weights #12

Closed by gretatuckute 3 years ago

gretatuckute commented 3 years ago

Hi! Thank you for this great repository. I have a quick question regarding the pretrained weights for the baseline system (https://zenodo.org/record/3697687#.YKJp7-spCuo). I do not see this mentioned explicitly (sorry if I missed it), but were these weights obtained from training on the training split of Clotho v1 or v2? The results here are reported as v1?

thanks so much! Greta

dr-costas commented 3 years ago

Hi Greta,

Thank you for the message! Indeed, the current results are for Clotho v1. I hope we will release the results for v2 during this week.

I tried to mention this explicitly on the baseline page for DCASE:

"The results of the baseline system for the development dataset are (with Clotho V1, will be updated for Clotho v2):"

But I totally understand that it is not always easy to find info in big walls of text. I have had the same experience too many times myself :D

Let me know if the above is OK for you or something more is needed. :)

gretatuckute commented 3 years ago

Thanks so much for the quick reply, appreciate it! Great, so the results are from v1, which also means that the pretrained weights were trained on the development split of Clotho v1, which consists of 2893 audio clips with 14465 captions, right?
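For what it's worth, the two numbers can be sanity-checked against each other. This is only a minimal sketch: it assumes five captions per audio clip, which is how the Clotho dataset is described but is not stated in this thread, and the function name `captions_match` is made up for illustration.

```python
# Figures quoted in the question above (Clotho v1 development split).
NUM_CLIPS = 2893
NUM_CAPTIONS = 14465

# Assumption: five captions per clip, per the Clotho dataset description;
# this value does not appear anywhere in the thread.
CAPTIONS_PER_CLIP = 5


def captions_match(num_clips, num_captions, captions_per_clip=CAPTIONS_PER_CLIP):
    """Return True if the caption count equals clips times captions per clip."""
    return num_clips * captions_per_clip == num_captions


print(captions_match(NUM_CLIPS, NUM_CAPTIONS))
```

If the check came back false, it would suggest that the clip count and the caption count were taken from different versions or splits of the dataset.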

dr-costas commented 3 years ago

Yes.

gretatuckute commented 3 years ago

Beautiful, thank you!