suno-ai / bark

🔊 Text-Prompted Generative Audio Model
MIT License

Fine tuned prompting. #536

Open arthurwolf opened 6 months ago

arthurwolf commented 6 months ago

I've searched a lot for information about this. I expected to find guides, users sharing tips, or some official documentation, but found only the minimal material in the docs.

How do you modify the tone of voice / give more precise instructions on how to read?

Is there no information about this because it's impossible? Was there no such data/instructions in the training data? Is there some place one can look at the training data (even a sample of it)?

What I'm looking to do is things like "[whispering] are you certain?" or "[loud][surprised] what ??". But doing it this way results in Bark not following the tags (or following them only very rarely), and often just producing random noises.

Am I missing something? Any help would be greatly appreciated! Cheers.

JonathanFly commented 3 months ago

If you are very lucky, metatags like that can work with a voice. But the reliable way to do this is to make voice variants for each emotion you want. Basically, try a bunch of emotionally written prompts until one of the generations actually comes out whispering or surprised. Save that generation as a new version of your original voice.
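
A rough sketch of that workflow, in case it helps. It assumes Bark's `generate_audio(..., output_full=True)` and `bark.api.save_as_prompt` behave as they do in the current repository. The helper names (`variant_path`, `audition_variants`), the file-naming scheme, and the example prompts are all made up for illustration and are not part of Bark.

```python
from pathlib import Path

# Illustrative prompts only; which wording works varies by voice and by luck.
WHISPER_PROMPTS = [
    "[whispering] are you certain?",
    "shh... quiet... are you certain?",
]


def variant_path(voice_dir, base_name, emotion, take):
    """Build a file path such as voices/narrator_whispering_02.npz (hypothetical naming)."""
    return Path(voice_dir) / f"{base_name}_{emotion}_{take:02d}.npz"


def audition_variants(base_voice, emotion, prompts, voice_dir, base_name, takes=3):
    """Generate several takes per prompt and save each one as a candidate voice.

    base_voice is whatever you normally pass as history_prompt, for example a
    built-in preset name or a path to an existing .npz voice file.
    Returns a list of (path, prompt_text, audio_array) so you can listen to
    each take and keep only the ones that really have the emotion.
    """
    # Imported lazily so the helper above can be used without Bark installed.
    from bark import generate_audio
    from bark.api import save_as_prompt

    Path(voice_dir).mkdir(parents=True, exist_ok=True)
    candidates = []
    take = 0
    for text in prompts:
        for _ in range(takes):
            # output_full=True also returns the token dict needed to save a voice.
            full_generation, audio = generate_audio(
                text, history_prompt=base_voice, output_full=True
            )
            path = variant_path(voice_dir, base_name, emotion, take)
            save_as_prompt(str(path), full_generation)
            candidates.append((path, text, audio))
            take += 1
    return candidates
```

After listening to the takes, the idea would be to delete the ones that missed and pass the path of a good one as `history_prompt` for later lines that need that emotion, for example `generate_audio("are you certain?", history_prompt="voices/narrator_whispering_02.npz")`. Results are not guaranteed; expect to discard most takes.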