minzwon / sota-music-tagging-models

MIT License
397 stars 64 forks

Training with a small dataset #12

Closed. Usernamezhx closed this issue 2 years ago

Usernamezhx commented 2 years ago

Thanks for your work. Following your suggestion, I trained Musicnn because I only have 1K training examples. I fine-tuned the last ten layers of Musicnn, starting from the pretrained model you provided, which was trained on the MSD dataset. The results look like this:

[Image: Screenshot 2021-11-02 4.33.14 PM] [Image: Screenshot 2021-11-02 4.33.18 PM]

Does that look right? Thanks in advance.

minzwon commented 2 years ago

It looks like your model is overfitting your training data. I recommend fine-tuning only the last 1-2 dense layers and applying strong data augmentation to your inputs. You can find a useful data augmentation tool here:

https://github.com/Spijkervet/torchaudio-augmentations.git
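The suggestion above can be sketched roughly as follows. This is a minimal illustration in plain PyTorch, not code from this repository: `TinyTagger` is a hypothetical stand-in for the pretrained model, the layer names `dense1` and `dense2` are assumptions, and `augment` is a hand-rolled substitute (random gain, polarity inversion, additive noise) rather than the API of the linked augmentation library. The learning rate and augmentation strengths are placeholders.

```python
import torch
import torch.nn as nn


class TinyTagger(nn.Module):
    """Hypothetical stand-in for a pretrained tagger: conv front end + two dense layers."""

    def __init__(self, n_tags: int = 50):
        super().__init__()
        self.frontend = nn.Sequential(
            nn.Conv1d(1, 8, kernel_size=9, stride=4),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
        )
        self.dense1 = nn.Linear(8, 16)
        self.dense2 = nn.Linear(16, n_tags)

    def forward(self, x):
        h = torch.relu(self.dense1(self.frontend(x)))
        return torch.sigmoid(self.dense2(h))


def freeze_all_but(model, trainable_prefixes):
    """Turn off gradients for every parameter whose name lacks one of the prefixes."""
    prefixes = tuple(trainable_prefixes)
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(prefixes)


def augment(wave, max_gain_db=6.0, noise_std=0.005):
    """Per-example random gain, random polarity inversion, and Gaussian noise.

    Expects wave of shape (batch, 1, samples).
    """
    batch = wave.shape[0]
    gain_db = (torch.rand(batch, 1, 1) * 2 - 1) * max_gain_db
    sign = torch.randint(0, 2, (batch, 1, 1)).float() * 2 - 1
    gain = 10.0 ** (gain_db / 20.0)
    return wave * sign * gain + noise_std * torch.randn_like(wave)


model = TinyTagger()  # in practice, load the pretrained weights here
freeze_all_but(model, ["dense2"])  # or ["dense1", "dense2"] for the last two layers

# Only pass the parameters that are still trainable to the optimizer.
optimizer = torch.optim.Adam(
    [p for p in model.parameters() if p.requires_grad], lr=1e-4
)

# One training step on random placeholder data.
wave = torch.randn(4, 1, 16000)
tags = torch.randint(0, 2, (4, 50)).float()
loss = nn.functional.binary_cross_entropy(model(augment(wave)), tags)
optimizer.zero_grad()
loss.backward()
optimizer.step()

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
```

If the frozen part of a real model contains batch normalization or dropout, it is usually worth putting those submodules in `eval()` mode during fine-tuning so their statistics stay fixed; the stand-in above has neither.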

Usernamezhx commented 2 years ago

Thanks for your reply. I will try it.

Usernamezhx commented 2 years ago

I tried fine-tuning only the last dense layer. It may be working, but the PR and ROC scores are lower than those of the model with all layers trained.

[Image: Screenshot 2021-11-03 4.48.15 PM]