Evaluation scores do not seem to improve during training

sophia1488 commented 2 years ago

Hi, I modified this config so that there are 2 targets (vocals and accompaniment): https://github.com/bytedance/music_source_separation/blob/master/scripts/4_train/musdb18/configs/accompaniment-vocals%2Cresunet_subbandtime.yaml. I also changed the batch size from 16 to 12, and I use the HQ version of MUSDB (.wav). Those are the only modifications I made.

But given the evaluation scores during training, I'm not sure it will reach 16.x and 8.x for accompaniment and vocals at step 500001:

Step: 0, accompaniment: -0.606, vocals: -2.908
Step: 10000, accompaniment: 2.662, vocals: 0.399
Step: 20000, accompaniment: 2.680, vocals: 0.451
Step: 30000, accompaniment: 2.702, vocals: 0.498
Step: 40000, accompaniment: 2.726, vocals: 0.518
Step: 50000, accompaniment: 2.719, vocals: 0.539
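
To make the plateau concrete, here is a minimal sketch that parses the log lines above and prints the change in each score between consecutive evaluations. The helper names (`parse_log`, `deltas`) are made up for illustration and are not part of the repository.

```python
import re

# Evaluation lines copied verbatim from the log above.
LOG = """\
Step: 0, accompaniment: -0.606, vocals: -2.908
Step: 10000, accompaniment: 2.662, vocals: 0.399
Step: 20000, accompaniment: 2.680, vocals: 0.451
Step: 30000, accompaniment: 2.702, vocals: 0.498
Step: 40000, accompaniment: 2.726, vocals: 0.518
Step: 50000, accompaniment: 2.719, vocals: 0.539
"""

PATTERN = re.compile(
    r"Step: (\d+), accompaniment: (-?\d+\.\d+), vocals: (-?\d+\.\d+)"
)


def parse_log(text):
    """Return a list of (step, accompaniment, vocals) tuples."""
    rows = []
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            rows.append(
                (int(match.group(1)), float(match.group(2)), float(match.group(3)))
            )
    return rows


def deltas(rows):
    """Return (step, accompaniment gain, vocals gain) versus the previous evaluation."""
    out = []
    for (_, a0, v0), (s1, a1, v1) in zip(rows, rows[1:]):
        out.append((s1, round(a1 - a0, 3), round(v1 - v0, 3)))
    return out


rows = parse_log(LOG)
for step, d_acc, d_voc in deltas(rows):
    print(f"step {step}: accompaniment {d_acc:+.3f}, vocals {d_voc:+.3f}")
```

Almost all of the gain happens in the first 10000 steps. After that, each 10000-step interval changes either score by at most about 0.05, and the accompaniment score even drops slightly at step 50000.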

Thanks in advance!

sophia1488 commented 2 years ago

[Update] If the target is set to only one stem, a significant improvement can be observed at step 10000 (in my case, from -1.6 to 1x). So it seems that different models are needed for different stems. I was originally hoping that vocal and accompaniment separation could be done with only one model, though.
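
One common workaround with a single-stem model is to take the second stem as the residual of the mixture. This is a minimal sketch, assuming the mixture is the sample-wise sum of vocals and accompaniment; `residual_stem` is a hypothetical helper and not a function from the repository, and the sample values are toy numbers.

```python
def residual_stem(mixture, estimate):
    """Subtract an estimated stem from the mixture, sample by sample.

    Assumes mixture = vocals + accompaniment in the time domain.
    """
    if len(mixture) != len(estimate):
        raise ValueError("mixture and estimate must have the same length")
    return [m - e for m, e in zip(mixture, estimate)]


# Toy waveform samples (hypothetical values, not real audio).
mixture = [0.5, -0.25, 0.75]
vocals_estimate = [0.25, 0.25, 0.5]

accompaniment_estimate = residual_stem(mixture, vocals_estimate)
print(accompaniment_estimate)
```

Any error in the vocals estimate ends up in the residual, so this does not necessarily match the quality of a model trained directly on accompaniment.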

playdasegunda commented 2 years ago

Hello, how are you? Sorry to bother you. Did you manage to complete the training of the ByteDance MSS models? If so, would it be possible to release the checkpoints of the models you trained to the community? We would be immensely grateful.

Thank you in advance for your reply.

Yours sincerely, Lucas Rodrigues.
