Open hollandloprabbit opened 2 weeks ago
I haven't tried training the large model without AMP. May I ask what batch size you used? Perhaps you could try lowering the learning rate further.
The batch size I used is 4. Your paper states that the learning rate is 5e-4 when training on the MUSDB18 dataset alone, but in the config file you provided the learning rate is 3e-4. What learning rate is appropriate when training the large version of the model?
A learning rate of 5e-4 corresponds to the standard version of SCNet; the 3e-4 in the configuration file I provided corresponds to the large version. However, I trained with AMP, and if you are not using it, the learning rate may need to be reduced further.
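Since the reply above says the large model was trained with AMP, here is a minimal sketch of what an AMP training step looks like in PyTorch. This is a generic illustration, not SCNet's actual training code: the model, loss, and batch below are placeholders, and only the 3e-4 learning rate comes from the config discussed here.

```python
# Hedged sketch of a mixed-precision (AMP) training step in PyTorch.
# The model/loss/batch are placeholders; only lr=3e-4 reflects the config above.
import torch
import torch.nn as nn

model = nn.Linear(16, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
# GradScaler rescales the loss so fp16 gradients do not underflow;
# enabled=False makes it a pass-through on CPU-only machines.
scaler = torch.cuda.amp.GradScaler(enabled=torch.cuda.is_available())

def train_step(batch, target):
    optimizer.zero_grad()
    # autocast runs the forward pass in float16 where it is numerically safe
    with torch.cuda.amp.autocast(enabled=torch.cuda.is_available()):
        loss = nn.functional.mse_loss(model(batch), target)
    scaler.scale(loss).backward()  # backward on the scaled loss
    scaler.step(optimizer)         # unscales gradients, then optimizer.step()
    scaler.update()                # adjusts the scale factor for the next step
    return loss.item()

loss = train_step(torch.randn(4, 16), torch.randn(4, 1))
```

Without AMP, the effective gradient noise and dynamic range change, which is presumably why a lower learning rate is suggested for full-precision runs.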
I was training SCNet-large without mixed-precision training, and the loss and metrics suddenly dropped partway through training. How should I handle this situation?