Open chavinlo opened 1 year ago
Works great for unconditional generation. The sound is not as crisp as the original model (as far as I've got in training), and it does not overfit the way your training script does; it keeps the variety of the initial dataset. The resulting finetuned model does not work well for prompt-conditioned generation, as far as I've played with it.
Tried on an A100 at batch size 12 with 40 minutes of stereo audio -> 160 30 s chunks, and used the same dataset for train and eval (?)
Hi, the part I copied from your repo works great; I was also surprised that it does not overfit. Either it's a bug in your dataloader, since that's the only thing I've really replaced, or it's just the complexity of conditional training.
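For anyone checking the chunk count: 40 minutes cut into 30 s pieces gives 80 chunks, so 160 only works out if each stereo channel is chunked separately as its own mono stream. That reading is an assumption, since the post does not say how the stereo audio was handled. A minimal sketch of the arithmetic, with `count_chunks` as a hypothetical helper name:

```python
def count_chunks(total_seconds: float, chunk_seconds: float, channels: int = 1) -> int:
    """Count full-length chunks, assuming each channel is chunked separately.

    Assumption (not confirmed by the post): stereo audio is split into
    one mono stream per channel before chunking.
    """
    # Only full-length chunks are counted; a trailing partial chunk is dropped.
    chunks_per_channel = int(total_seconds // chunk_seconds)
    return chunks_per_channel * channels


# 40 minutes of audio, 30 s chunks, 2 channels
print(count_chunks(40 * 60, 30, channels=2))  # 160
```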
The best way to test it would be to take my code and replace the static prompt with your input prompts.
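A rough sketch of what swapping the static prompt for per-sample prompts could look like. Everything here is hypothetical and not taken from either repo: the sidecar `.txt` convention (a prompt file next to each audio file), the `STATIC_PROMPT` value, and the helper names `prompt_for` and `build_pairs`. It only builds the (audio path, prompt) pairs; feeding them into the training loop depends on the actual dataloader.

```python
from pathlib import Path

# Hypothetical placeholder for the fixed prompt currently used for every sample.
STATIC_PROMPT = "music"


def prompt_for(audio_path: Path, default: str = STATIC_PROMPT) -> str:
    """Return the prompt stored next to the audio file, or the static default.

    Assumes a sidecar convention: "clip.wav" is described by "clip.txt".
    """
    sidecar = audio_path.with_suffix(".txt")
    if sidecar.is_file():
        text = sidecar.read_text(encoding="utf-8").strip()
        if text:
            return text
    return default


def build_pairs(data_dir, exts=(".wav", ".mp3", ".flac")):
    """List (audio_path, prompt) pairs for every audio file in data_dir."""
    pairs = []
    for path in sorted(Path(data_dir).iterdir()):
        if path.suffix.lower() in exts:
            pairs.append((path, prompt_for(path)))
    return pairs
```

Falling back to the static prompt keeps clips without a description usable, so the conditional and unconditional runs can be compared on the same chunks.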
Hello, cool to see that other people are also interested in training MusicGen.
I see that you used part of my repo; however, my repo doesn't really work (imo) and breaks the model, as the outputs sound choppy and glitchy. Does yours have the same issue?