LAION-AI / CLAP

Contrastive Language-Audio Pretraining
https://arxiv.org/abs/2211.06687
Creative Commons Zero v1.0 Universal
1.24k stars 122 forks

caption text at training #138

Open TATEXH opened 5 months ago

TATEXH commented 5 months ago

Hi, I am using music_audioset_epoch_15_esc_90.14.pt as a music classifier to classify the mood and genre of our music files. I compute the cosine similarity between the audio and a text prompt such as "The mood of this song is (romantic, energetic, etc.)", but I only get about 0.4. I expect the value would be higher if my prompt resembled the caption text used in training, so could you please tell me what kind of text you used?

lukewys commented 3 months ago

Hi, for music we used "This audio is a <genre> song." I think the task you are dealing with is also somewhat out of distribution relative to the training data. I don't think we included much music with mood labels in the music version of CLAP.
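
For anyone trying this, here is a minimal sketch of how the template above could be used for zero-shot genre ranking. It assumes you have already obtained one audio embedding and one text embedding per prompt from the model; the genre list, the helper names (build_prompts, cosine_similarity, rank_labels), and the toy 3-dimensional vectors are hypothetical and only there for illustration. Zero-shot classification usually compares the scores across candidate prompts rather than reading a single score in isolation, so the sketch sorts the labels by similarity.

```python
import math

# Prompt template quoted in this thread; "{}" stands in for <genre>.
TEMPLATE = "This audio is a {} song."

# Hypothetical label set, not taken from the CLAP training data.
GENRES = ["rock", "jazz", "classical"]


def build_prompts(labels, template=TEMPLATE):
    """Fill the template once per candidate label."""
    return [template.format(label) for label in labels]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_labels(audio_emb, text_embs, labels):
    """Return (label, similarity) pairs sorted from best to worst match."""
    scores = [cosine_similarity(audio_emb, t) for t in text_embs]
    return sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    prompts = build_prompts(GENRES)
    # Toy stand-ins: real embeddings would come from the CLAP text and
    # audio encoders, one text embedding per prompt in `prompts`.
    text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    audio_emb = [0.2, 0.9, 0.1]
    for label, score in rank_labels(audio_emb, text_embs, GENRES):
        print(f"{label}: {score:.3f}")
```

The predicted genre is the first entry of the ranked list. The same ranking could be tried with mood prompts, with the caveat above that mood labels are likely out of distribution for this checkpoint.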

Best,

TATEXH commented 3 months ago

Thanks for the reply. I will try it with your text.