microsoft / Pengi

An Audio Language model for Audio Tasks
https://arxiv.org/abs/2305.11834
MIT License

Evaluation code #5

Closed zhifengkong closed 11 months ago

zhifengkong commented 11 months ago

Hi,

I'd like to ask whether you plan to release the evaluation code for standard benchmarks (such as Clotho and AudioCaps). Also, what temperature and num_beams did you use to obtain the results? Thanks!

soham97 commented 11 months ago

Hi @FengNiMa, the repo hosts only inference code, and a subsequent release of the evaluation dataloaders is not planned. The downstream task evaluation uses a beam size of 5 and a temperature of 1.0. These settings are kept the same for all downstream tasks.

Hope this helps and feel free to email me with any questions.
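
For readers who want to see what the two settings above control, here is a minimal, self-contained sketch of beam search with temperature scaling. It is not the Pengi implementation: `beam_search`, `log_softmax_with_temperature`, and the `toy_next_logits` model are hypothetical names made up for illustration, and the sketch assumes plain cumulative log-probability scoring with no length normalization.

```python
import math


def log_softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature (must be > 0), then take a log-softmax."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_z for s in scaled]


def beam_search(next_logits, beam_size=5, temperature=1.0, max_len=10, eos=0):
    """Return the (tokens, log_prob) hypothesis with the highest cumulative log-prob.

    next_logits(prefix) is a hypothetical model interface: it maps a tuple of
    token ids to a list of logits over the vocabulary.
    """
    beams = [((), 0.0)]  # live hypotheses: (token tuple, cumulative log-prob)
    finished = []        # hypotheses that have emitted the end-of-sequence token
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            log_probs = log_softmax_with_temperature(next_logits(tokens), temperature)
            for tok, lp in enumerate(log_probs):
                candidates.append((tokens + (tok,), score + lp))
        # Keep only the beam_size best expansions at this step.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            if tokens[-1] == eos:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # fall back to unfinished hypotheses at max_len
    return max(finished, key=lambda c: c[1])


def toy_next_logits(prefix):
    """Toy 3-token model (0 = EOS): prefers token 1 twice, then EOS."""
    if len(prefix) < 2:
        return [-10.0, 3.0, 0.0]
    return [10.0, 0.0, 0.0]


# Settings quoted in this thread: beam size 5, temperature 1.0.
best_tokens, best_score = beam_search(toy_next_logits, beam_size=5, temperature=1.0)
print(best_tokens)  # (1, 1, 0)
```

With a temperature of 1.0 the logits are left unscaled, so the ranking of hypotheses depends only on the model's own distribution and the beam width.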

zhifengkong commented 11 months ago

Thanks!