LAION-AI / CLAP

Contrastive Language-Audio Pretraining
https://arxiv.org/abs/2211.06687
Creative Commons Zero v1.0 Universal

Introduction to the use of datasets for the provided pre-trained models #107

Open XinMing0411 opened 1 year ago

XinMing0411 commented 1 year ago

Hi all! Which datasets were used to train the pre-trained models provided in the Google link? Were 630k-audioset-best.pt and 630k-audioset-fusion-best.pt trained only on the AudioSet and LAION-Audio-630k datasets, or does their training data also include the Clotho and AudioCaps datasets?

lukewys commented 1 year ago

Yes, in addition to AudioSet and LAION-Audio-630k, Clotho and AudioCaps were also used to train the model. Please refer to the appendix of our paper for details.

dilipupf commented 1 year ago

I have my own custom evaluation dataset, similar to the Clotho evaluation dataset. I just want to run inference on it with the pretrained checkpoint and compute the recall metric. How can I do that? I tried reading through the documentation, but it is a bit confusing to me. I'd appreciate your help!
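For reference, text-to-audio recall@k can be computed directly from embeddings, independently of the repository's evaluation scripts. The following is a minimal sketch, not the authors' evaluation code. It assumes you have already extracted one embedding per caption and one per audio clip (for example with the `get_text_embedding` and `get_audio_embedding_from_filelist` methods shown in the repository README), and that you know which audio clip each caption belongs to. The function names `l2_normalize` and `recall_at_k` and the toy data are hypothetical, chosen only for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def recall_at_k(text_embeds, audio_embeds, gt_audio_index, ks=(1, 5, 10)):
    """Text-to-audio recall@k.

    text_embeds:    one embedding per caption (lists or array rows).
    audio_embeds:   one embedding per audio clip.
    gt_audio_index: for each caption, the index of its ground-truth clip.
                    Clotho-style data has several captions per clip, so the
                    same index may appear more than once.
    """
    audio_n = [l2_normalize(a) for a in audio_embeds]
    hits = {k: 0 for k in ks}
    for text, gt in zip(text_embeds, gt_audio_index):
        text_n = l2_normalize(text)
        # Cosine similarity between this caption and every audio clip.
        sims = [sum(x * y for x, y in zip(text_n, a)) for a in audio_n]
        # Audio indices sorted from most to least similar.
        ranking = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)
        rank = ranking.index(gt)  # 0-based position of the correct clip
        for k in ks:
            if rank < k:
                hits[k] += 1
    n = len(gt_audio_index)
    return {k: hits[k] / n for k in ks}


# Toy data (made up): 3 audio clips, 4 captions, 2-dimensional embeddings.
audio = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text = [[0.9, 0.1], [0.1, 0.9], [1.0, 0.9], [0.0, 1.0]]
gt = [0, 1, 2, 0]  # the last caption is deliberately a poor match for clip 0

print(recall_at_k(text, audio, gt, ks=(1, 2, 3)))
# {1: 0.75, 2: 0.75, 3: 1.0}
```

Audio-to-text recall follows the same pattern with the roles of the two embedding sets swapped, counting a hit when any of a clip's captions appears in the top k. For a large evaluation set, the same similarity matrix is usually computed in one step with a matrix product in NumPy or PyTorch rather than in Python loops.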