microsoft / Pengi

An Audio Language Model for Audio Tasks
https://arxiv.org/abs/2305.11834
MIT License
269 stars, 15 forks

About the audio-text pairs of the AudioSet dataset #16

Open blue-blue272 opened 3 weeks ago

blue-blue272 commented 3 weeks ago

AudioSet only contains audio clips and event labels. How do you obtain caption descriptions for the audio clips in the AudioSet dataset?

soham97 commented 2 weeks ago

Hi @blue-blue272, we use two ways to get captions for AudioSet: