starrytong / SCNet

MIT License

Model increase #15

Open lucellent opened 2 months ago

lucellent commented 2 months ago

Is it possible to increase the model size beyond "large", for example by adding a new 512 band? If not, what other strategies are there to maximise the model size?

starrytong commented 2 months ago

You can increase the number of frequency bands from 3 to 4, but the hyperparameters might become more complex. Alternatively, you can directly increase the channel dimension.
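One reason the hyperparameters get more complex with a fourth band is that every per-band setting needs a fourth entry, and the entries have to stay consistent with each other. The sketch below is a minimal sanity check under stated assumptions: the key names `band_ratio`, `band_stride`, and `band_kernel` and all the numeric values are hypothetical placeholders (only `conv_depths` is named in this thread), and it assumes the band ratios partition the frequency axis and therefore sum to 1. It is not a tuned or tested 4-band configuration.

```python
import math

# Hypothetical per-band keys; only "conv_depths" is named in this thread.
PER_BAND_KEYS = ("band_ratio", "band_stride", "band_kernel", "conv_depths")


def check_band_config(cfg):
    """Return the number of bands if all per-band lists agree in length."""
    lengths = {key: len(cfg[key]) for key in PER_BAND_KEYS}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"per-band lists differ in length: {lengths}")
    # Assumption: the ratios split the frequency axis, so they sum to 1.
    if not math.isclose(sum(cfg["band_ratio"]), 1.0, abs_tol=1e-6):
        raise ValueError("band ratios should sum to 1")
    return lengths[PER_BAND_KEYS[0]]


# Placeholder values for illustration only, not a recommended setup.
four_band = {
    "band_ratio": [0.1, 0.2, 0.3, 0.4],
    "band_stride": [1, 2, 4, 8],
    "band_kernel": [3, 3, 4, 8],
    "conv_depths": [3, 2, 1, 1],
}

print(check_band_config(four_band))
```

Running a check like this before training catches the most common mistake when adding a band, which is extending one list but not the others.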

lucellent commented 2 months ago

> You can increase the number of frequency bands from 3 to 4, but the hyperparameters might become more complex. Alternatively, you can directly increase the channel dimension.

I'm using ZFTurbo's script and config. Does that mean:

conv_depths: -4

Or something else? I already tried

compress: 2
conv_kernel: 5
num_dplayer: 8
expand: 1

but I think it was too much for a 3090 GPU

starrytong commented 2 months ago

num_dplayer: 8 is acceptable, but you need to reduce the audio length or batch size during training.
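As a rough way to reason about that trade-off, the sketch below treats the number of audio samples per training step (batch size times chunk length) as a proxy for activation memory. That proportionality is only an approximation, and the function names, the 44100 Hz sample rate, and the 11-second chunk are hypothetical values chosen for illustration. The budget would come from a setting already known to fit on the GPU.

```python
def samples_per_step(batch_size, chunk_seconds, sample_rate=44100):
    """Audio samples processed in one training step, per channel."""
    return batch_size * round(chunk_seconds * sample_rate)


def fit_to_budget(batch_size, chunk_seconds, budget, sample_rate=44100):
    """Shrink batch size first, then chunk length, to stay within budget."""
    # Halve the batch size first, which keeps the temporal context intact.
    while (batch_size > 1
           and samples_per_step(batch_size, chunk_seconds, sample_rate) > budget):
        batch_size //= 2
    # If a batch of one is still too large, shorten the chunk to fit.
    if samples_per_step(batch_size, chunk_seconds, sample_rate) > budget:
        chunk_seconds = budget / (batch_size * sample_rate)
    return batch_size, chunk_seconds


# Hypothetical budget: a batch of 4 chunks of 11 s is assumed to fit.
budget = samples_per_step(4, 11.0)
print(fit_to_budget(8, 11.0, budget))
```

A deeper model (larger `num_dplayer`) uses more memory per sample, so the budget itself has to be found again by trial for each config.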

lucellent commented 2 months ago

Okay, got it, thank you. I tried training from the existing large checkpoint with the larger config, but it seems it might be better to train a whole new model with the new config and then fine-tune it on a better dataset.

ZFTurbo mentioned this is what worked for him. Also, let me know if there are other parameters I can adjust to improve the model, or whether increasing num_dplayer alone is enough (I will try doubling it, from 6 to 12).