jordipons / musicnn

Pronounced as "musician", musicnn is a set of pre-trained deep convolutional neural networks for music audio tagging.
ISC License

train/valid/test split for MagnaTagATune #6

expectopatronum closed this issue 4 years ago

expectopatronum commented 4 years ago

Hi, thanks for sharing the pretrained models! I am trying to figure out which train/valid/test split you used. You linked to this repo, but I can't find any information about splits there. They only mention a 13:1:3 split, which I first assumed referred to the folders, but there are 16 folders, not 17. Did you process the tags as proposed there?

I found this split, but it has only 15244 train samples, whereas you mentioned that your version of the dataset contains ~19000 train samples.

I assume you downloaded the audio and the CSV with tags from here.

I hope you can shed some light on my confusion :) Thanks! Best regards, Verena

jordipons commented 4 years ago

The musicnn-training repository contains the code that creates the MTT partition: https://github.com/jordipons/musicnn-training/blob/master/aux/mtt/partition_gt.py (line 24).

Note that the MTT partition we use differs from the one Jongpil Lee used: https://github.com/jongpillee/music_dataset_split/tree/master/MTAT_split. I clarify this in our ISMIR 2018 paper: http://ismir2018.ircam.fr/doc/pdfs/191_Paper.pdf (section 5.2).

And yes! I downloaded the audio from here: http://mirg.city.ac.uk/codeapps/the-magnatagatune-dataset
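
For readers who want to see what a folder-based partition of the kind discussed in this thread might look like, here is a minimal sketch. It is not a transcription of partition_gt.py. The folder names (single hexadecimal characters `0` to `f`), the folder-to-subset mapping, and the helper names `SPLIT_BY_FOLDER`, `assign_split`, and `partition` are all assumptions made for illustration; check line 24 of the linked script for the rule actually used.

```python
from collections import defaultdict

# ASSUMPTION: the 16 dataset folders are named 0-9 and a-f, and clips are
# assigned to a subset by folder. This mapping is hypothetical; the real rule
# lives in partition_gt.py of the musicnn-training repository.
SPLIT_BY_FOLDER = {
    **{folder: "train" for folder in "0123456789ab"},
    "c": "valid",
    **{folder: "test" for folder in "def"},
}


def assign_split(mp3_path):
    """Return the subset name for a clip, based on its top-level folder."""
    folder = mp3_path.split("/", 1)[0]
    return SPLIT_BY_FOLDER[folder]


def partition(mp3_paths):
    """Group clip paths into train/valid/test lists."""
    subsets = defaultdict(list)
    for path in mp3_paths:
        subsets[assign_split(path)].append(path)
    return dict(subsets)


# Hypothetical clip paths, used only to exercise the functions.
example_paths = ["0/clip_a.mp3", "c/clip_b.mp3", "f/clip_c.mp3"]
print(partition(example_paths))
```

Counting the clips that land in each subset under a given mapping is a quick way to see whether a candidate split matches a reported size, such as the 15244 versus ~19000 train samples mentioned above.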