My roc_auc is higher than reported

pengbo-learn commented 3 years ago

I trained harmonicnn on MTAT. The dataset was downloaded with https://github.com/Spijkervet/CLMR/blob/master/clmr/datasets/magnatagatune.py

I trained 4 times. The evaluation outputs are:

loss: 0.1400 roc_auc: 0.9151 pr_auc: 0.4636

loss: 0.1399 roc_auc: 0.9157 pr_auc: 0.4666

loss: 0.1405 roc_auc: 0.9155 pr_auc: 0.4617

loss: 0.1403 roc_auc: 0.9153 pr_auc: 0.4643

loss: 0.1402 roc_auc: 0.9148 pr_auc: 0.4641

The roc_auc is consistently higher than the reported 0.9126. Did I miss something?
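To put a number on the gap, here is a minimal sketch that averages the roc_auc values listed above and compares them with the reported 0.9126. The variable names are only illustrative; the values are copied from the evaluation outputs in this comment.

```python
from statistics import mean, stdev

# roc_auc values copied from the evaluation outputs listed above
roc_auc_runs = [0.9151, 0.9157, 0.9155, 0.9153, 0.9148]
reported = 0.9126  # the reported score cited in this issue

avg = mean(roc_auc_runs)
spread = stdev(roc_auc_runs)  # sample standard deviation across runs
gap = avg - reported

print(f"mean roc_auc:    {avg:.4f}")
print(f"std across runs: {spread:.4f}")
print(f"gap to reported: {gap:+.4f}")
```

The gap is a few thousandths, which is several times the run-to-run spread, so random seed noise alone seems an unlikely explanation.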

minzwon commented 3 years ago

Hi,

That's interesting. There can be many reasons. Sometimes it's because of a different PyTorch version (for example, I once experienced a similar thing after they updated batch normalization in PyTorch). Or it could be a different data split. Can you double-check that your data split is identical to mine? I included lists of the train/valid/test sets in this repo.
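One way to double-check a split is to compare the two lists of clip paths as sets. This is a minimal sketch, not code from this repo: `compare_splits` is a hypothetical helper, and the entries below are made-up placeholders. Real lists would be loaded from the split files.

```python
def compare_splits(mine, reference):
    """Return (only in mine, only in reference, number of shared entries)."""
    mine_set, ref_set = set(mine), set(reference)
    only_mine = sorted(mine_set - ref_set)
    only_ref = sorted(ref_set - mine_set)
    return only_mine, only_ref, len(mine_set & ref_set)


# Placeholder entries for illustration only
my_test = ["a/clip-1.mp3", "a/clip-2.mp3", "b/clip-3.mp3"]
ref_test = ["a/clip-1.mp3", "a/clip-2.mp3", "b/clip-4.mp3"]

only_mine, only_ref, shared = compare_splits(my_test, ref_test)
print("only in my split:       ", only_mine)
print("only in reference split:", only_ref)
print("shared entries:         ", shared)
```

Two identical splits give two empty lists. Running the same comparison between the train and test lists would also reveal any overlap between them.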

pengbo-learn commented 3 years ago

I did use your data split provided in split/mtat.

My environment: torch==1.2.0, torchaudio==0.3.0
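For anyone comparing environments, a small sketch like the following prints the installed versions so they can be pasted into a report. It assumes Python 3.8 or newer (for `importlib.metadata`); `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print(installed_versions(["torch", "torchaudio"]))
```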

Some of my MTAT mp3 files have a size of 0, which may be the root of the problem. Could you describe how you downloaded the MTAT mp3s, so I can make sure the data source is identical?

minzwon commented 3 years ago

To make sure, the scores that you mentioned are the "test set" score, not the "validation set", right?

I used the dataset that already existed in my university's cluster. I believe it's collected from the original web page (https://mirg.city.ac.uk/codeapps/the-magnatagatune-dataset).

pengbo-learn commented 3 years ago

> To make sure, the scores that you mentioned are the "test set" score, not the "validation set", right?
>
> I used the dataset that already existed in my university's cluster. I believe it's collected from the original web page (https://mirg.city.ac.uk/codeapps/the-magnatagatune-dataset).

Yeah, the results were generated by running python -u eval.py --data_path YOUR_DATA_PATH

Thanks a lot. I will report the scores when the experiments are finished.

pengbo-learn commented 3 years ago

The md5 values of MP3 files downloaded from https://mirg.city.ac.uk/codeapps/the-magnatagatune-dataset and https://github.com/Spijkervet/CLMR/blob/master/clmr/datasets/magnatagatune.py are the same.
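A comparison like this can be scripted roughly as follows. This is only a sketch using the standard library; the function names are hypothetical, and it assumes both downloads were unpacked into directories with the same relative layout.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_by_relative_path(root):
    """Map each mp3's path (relative to root) to its md5 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): md5_of_file(path)
        for path in sorted(root.rglob("*.mp3"))
    }


def differing_files(root_a, root_b):
    """List relative paths that are missing on one side or differ in content."""
    a, b = md5_by_relative_path(root_a), md5_by_relative_path(root_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

Calling `differing_files` on the two download directories returns an empty list when every mp3 matches.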

To make sure I did not modify your original implementation, I git cloned the repo and tried again.

I did modify the preprocessing script mtat_read.py. 6/norine_braun-now_and_zen-08-gently-117-146.mp3 has a size of 0, which makes the program crash with an EOFError. Therefore, I catch the error and skip the file.

Is this file in your MTAT data 0 size?
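An alternative to catching the EOFError is to list the zero-byte files before preprocessing, which also makes it easy to compare the list between machines. This is a minimal sketch and not the repo's mtat_read.py; `split_by_size` is a hypothetical helper.

```python
from pathlib import Path


def split_by_size(root, pattern="*.mp3"):
    """Return (usable, empty): files under root with non-zero size and zero size."""
    usable, empty = [], []
    for path in sorted(Path(root).rglob(pattern)):
        if path.stat().st_size == 0:
            empty.append(path)
        else:
            usable.append(path)
    return usable, empty
```

Printing the `empty` list for the MTAT directory shows exactly which clips would be skipped, so both sides can confirm that they drop the same files.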

minzwon commented 3 years ago

@pengbo-learn The file size is 0 for me as well. Could you run the experiment with another model, please? If you see the performance gain with another model too, then the improvement surely comes from the different PyTorch version.

pengbo-learn commented 3 years ago

You are right, other models do better as well.

harmonicnn loss: 0.1405 roc_auc: 0.9142 pr_auc: 0.4658

fcn loss: 0.1405 roc_auc: 0.9006 pr_auc: 0.4347

musicnn loss: 0.1461 roc_auc: 0.9112 pr_auc: 0.4520

minzwon commented 3 years ago

Possibly it's a version issue then. Thank you for reporting this and I will close the issue.

minzwon / sota-music-tagging-models

My roc_auc is higher than reported #5