Great work! I'd like to inquire where I can find the address for Audio-Language Alignment data. I noticed in scripts/audio_language/train.sh that there is a mention of 4,800,000 instances of audio-language data, which seems to be significantly more than the 1 million mentioned in the paper. Could you please provide information on where to download this data for easier replication of the paper's results?
Hi Dear Author,
Great work! I'd like to inquire where I can find the address for Audio-Language Alignment data. I noticed in
scripts/audio_language/train.sh
that there is a mention of 4,800,000 instances of audio-language data, which seems to be significantly more than the 1 million mentioned in the paper. Could you please provide information on where to download this data for easier replication of the paper's results?Thank you!