Spijkervet / CLMR

Official PyTorch implementation of Contrastive Learning of Musical Representations
https://arxiv.org/abs/2103.09410
Apache License 2.0

Weights for million song dataset? #2

Open NotNANtoN opened 3 years ago

NotNANtoN commented 3 years ago

Hi,

Thanks for releasing this nice repository!

Do you plan on releasing the weights for the linear classifier trained on the Million Song Dataset too? I would be very happy to use it in my work, because I care about these more abstract "happy" or "sad" classes.

On another note: I may just be missing it, but I could not find an easy way to map the predictions of the linear classifier trained on the MagnaTagATune dataset to their corresponding labels. In the paper, you say that you chose the top 50 most common labels. Is there a list somewhere in the repository, or can I look it up on the dataset website? I do not want to mess up the order of the labels, for obvious reasons...

Thanks again and kind regards, Anton

NotNANtoN commented 3 years ago

I found the tags for MagnaTagATune myself; here they are: ['guitar', 'classical', 'slow', 'techno', 'strings', 'drums', 'electronic', 'rock', 'fast', 'piano', 'ambient', 'beat', 'violin', 'vocal', 'synth', 'female', 'indian', 'opera', 'male', 'singing', 'vocals', 'no vocals', 'harpsichord', 'loud', 'quiet', 'flute', 'woman', 'male vocal', 'no vocal', 'pop', 'soft', 'sitar', 'solo', 'man', 'classic', 'choir', 'voice', 'new age', 'dance', 'male voice', 'female vocal', 'beats', 'harp', 'cello', 'no voice', 'weird', 'country', 'metal', 'female voice', 'choral']

I got them from https://github.com/minzwon/sota-music-tagging-models/raw/master/split/mtat/tags.npy, which is referenced in the MagnaTagATune dataset class.
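
In case it helps anyone else, here is a minimal sketch of how the outputs could be mapped back to these tag names (my own code, not from the repo; it assumes the classifier's output order matches tags.npy, and `logits` is just a stand-in for the actual model output):

```python
import numpy as np
import torch

# The 50 MagnaTagATune tags, in the order used by the classifier (assumed).
tags = np.load("tags.npy", allow_pickle=True)

# Stand-in for the linear classifier's output for one audio clip.
logits = torch.randn(50)
probs = torch.sigmoid(logits)  # multi-label tagging: one sigmoid per tag

# Print the top-5 predicted tags with their probabilities.
top = torch.topk(probs, k=5)
for idx, p in zip(top.indices.tolist(), top.values.tolist()):
    print(f"{tags[idx]}: {p:.3f}")
```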

I'd still be happy to hear your thoughts on releasing the MSD trained classifier!

Spijkervet commented 3 years ago

Hi there!

I have to re-train the MSD classifier as I lost the weights on the lab computer. I will let you know once I have them!

NotNANtoN commented 3 years ago

That would be so great, thank you! :)

By the way, I have experimented with the MagnaTagATune tagger and got somewhat unstable results. Below you can see the predictions for the tag "piano" for the song "Gnossienne No. 1". I basically just split the song into overlapping 2.7 s chunks and fed them into the model:

[Plot: per-chunk prediction for the tag "piano" over the course of "Gnossienne No. 1"]

This instability holds for any label. Have you had similar experiences, by any chance? I was hoping to use the predicted tags to generate images according to the current "mood" in the song, but unfortunately this is not stable enough for my purpose. Maybe I messed up some part of the processing?
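
For reference, this is roughly what my chunking looks like (a simplified sketch; `predict_tags` is only a placeholder for the actual CLMR encoder plus linear head, and the sample rate, hop size and file name are made up):

```python
import torch
import torchaudio

SAMPLE_RATE = 22050
CHUNK_SECONDS = 2.7
HOP_SECONDS = 1.0  # how far the window moves, so consecutive chunks overlap

def predict_tags(chunk: torch.Tensor) -> torch.Tensor:
    """Placeholder for the CLMR encoder + linear tagger.
    Replace with the real model call; should return per-tag probabilities of shape (50,)."""
    return torch.rand(50)

# Load the song, resample and mix down to mono.
waveform, sr = torchaudio.load("gnossienne_no1.mp3")
waveform = torchaudio.functional.resample(waveform, sr, SAMPLE_RATE).mean(dim=0)

chunk_len = int(CHUNK_SECONDS * SAMPLE_RATE)
hop_len = int(HOP_SECONDS * SAMPLE_RATE)

# Overlapping windows: (n_chunks, chunk_len)
chunks = waveform.unfold(0, chunk_len, hop_len)

with torch.no_grad():
    predictions = torch.stack([predict_tags(c) for c in chunks])  # (n_chunks, 50)

# predictions[:, tag_index] is what I plot per tag over the course of the song.
```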