Hey friends, we recently released some text-to-audio models that are very similar to the great work you all are doing here. Maybe some of them can be used as a starting point for fine-tuning within this repo (especially the semantic and coarse/fine models).
https://github.com/suno-ai/bark