I've been playing with this for a few days and I'm still wondering:
1) How much data is needed to fine-tune on a particular genre? Is 10 hours of music enough for an easy genre, or is more needed?
2) Should we fine-tune for the whole 250 epochs? How do we know when to stop fine-tuning?
It really depends on the genre, but in my experience, if the style is consistent enough (metal, for example), even 10 hours can work. The metal model available in the Hugging Face demo was fine-tuned on about 100 hours of metal music.
No, you can usually stop fine-tuning much sooner (after 5-10 epochs). When fine-tuning on especially small datasets, I found that generation quality gets dramatically worse after just a couple of epochs. You can listen to generated samples while fine-tuning and stop the process when you are satisfied with the results!
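If you want a rough automatic signal to go along with listening, here is a minimal sketch of patience-based early stopping. It assumes you log one validation loss per epoch on a held-out part of your dataset; that is an assumption on my part, not something the training script is confirmed to do. The name `pick_stop_epoch` and the `patience` and `min_delta` parameters are hypothetical and only for illustration. Validation loss is just a proxy for audio quality, so treat the result as a hint about which checkpoints to listen to first.

```python
from typing import Iterable, Optional


def pick_stop_epoch(
    val_losses: Iterable[float],
    patience: int = 2,
    min_delta: float = 0.0,
) -> Optional[int]:
    """Return the 1-based epoch with the best validation loss.

    Scanning stops once the loss has failed to improve by more than
    `min_delta` for `patience` consecutive epochs.
    Hypothetical helper: not part of any fine-tuning codebase.
    """
    best_loss = float("inf")
    best_epoch = None
    stale_epochs = 0

    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss - min_delta:
            # New best: remember it and reset the patience counter.
            best_loss = loss
            best_epoch = epoch
            stale_epochs = 0
        else:
            # No improvement this epoch.
            stale_epochs += 1
            if stale_epochs >= patience:
                break

    return best_epoch


if __name__ == "__main__":
    # Made-up losses for illustration: they improve for three epochs, then get worse.
    example_losses = [2.1, 1.8, 1.7, 1.9, 2.3, 2.6]
    print(pick_stop_epoch(example_losses))
```

On a small dataset, a low `patience` (1 or 2) fits the behavior described above, where quality drops after only a couple of epochs. Keeping a checkpoint per epoch lets you go back to the selected epoch and compare its samples by ear.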