suno-ai / bark

🔊 Text-Prompted Generative Audio Model
MIT License

Male Female Choice? #193

Closed djbritt closed 1 year ago

djbritt commented 1 year ago

Is the only way to choose the gender to write the prompt like this?

MEN: Phrase WOMEN: Phrase

Or are there any alternative ways?

gkucsko commented 1 year ago

The more robust way is to use a history_prompt that is male or female. Alternatively, randomly generate a bunch of outputs with output_full=True; when you get one that you like, you can save it and use it as a history prompt.
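A minimal sketch of the second suggestion. It assumes that generate_audio(text, output_full=True) returns a (full_generation, audio_array) pair and that save_as_prompt can be imported from bark. The helper names collect_candidates and keep_candidate are hypothetical and exist only for this illustration.

```python
def _bark_generate(text):
    # Lazy import: bark is only needed when this default is actually used.
    from bark import generate_audio

    # Assumption: with output_full=True the call returns
    # (full_generation, audio_array).
    return generate_audio(text, output_full=True)


def collect_candidates(text, n, generate=_bark_generate):
    """Generate n independent takes of the same text.

    Each item is whatever `generate` returns; with the bark default this is
    assumed to be a (full_generation, audio_array) pair.
    """
    return [generate(text) for _ in range(n)]


def keep_candidate(candidates, index, path):
    """Save the chosen take so it can be reused as a history prompt."""
    # Assumption: save_as_prompt is exported by bark and expects a .npz path.
    from bark import save_as_prompt

    full_generation, _audio = candidates[index]
    save_as_prompt(path, full_generation)
    return path
```

A possible workflow: call collect_candidates("Hello, my name is ...", 5), listen to each audio array, then call keep_candidate(candidates, 2, "my_voice.npz") for the take you like. Passing that .npz path as history_prompt in later generate_audio calls is expected to work in recent versions of the library, but treat that as an assumption to verify against your installed version.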

djbritt commented 1 year ago

Sorry, how do you use history_prompt to specify male or female?

gkucsko commented 1 year ago

History prompts are just encoded audio. You can always load them with numpy (np.load) and then use encodec to listen to the audio:

audio_arr = codec_decode(numpy_archive["fine_tokens"])
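A hedged sketch of that inspection step. It assumes codec_decode lives in bark.generation and that the prompt archive stores the fine codec tokens under either fine_prompt (the key the saved prompts appear to use) or fine_tokens (the key written in the comment above), so it checks for both. The helper names are hypothetical.

```python
def find_fine_key(keys):
    """Return the archive key that holds the fine codec tokens."""
    for candidate in ("fine_prompt", "fine_tokens"):
        if candidate in keys:
            return candidate
    raise KeyError(f"no fine-token array among {sorted(keys)}")


def decode_history_prompt(npz_path):
    """Load a history prompt (.npz) and decode it back into a waveform."""
    # Lazy imports so find_fine_key can be used without bark installed.
    import numpy as np
    from bark.generation import codec_decode  # assumed location

    archive = np.load(npz_path)
    fine_tokens = archive[find_fine_key(archive.files)]
    return codec_decode(fine_tokens)
```

The returned array can then be written to a WAV file, for example with scipy.io.wavfile.write and bark's SAMPLE_RATE, and played back to hear what voice the prompt encodes.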

djbritt commented 1 year ago

Sorry, you're speaking a bit outside of my knowledge level.

Can you expand on this statement you made?

"the more robust way is to use a history_prompt that is male or female"

djbritt commented 1 year ago

Hi there, did you see my last question?

gkucsko commented 1 year ago

Listen to the examples here: https://suno-ai.notion.site/8b8e8749ed514b0cbf3f699013548683?v=bc67cff786b04b50b3ceb756fd05f68c

Then pick a speaker that you like and do a generation like this:

generate_audio(text, history_prompt="v2/en_speaker_3")
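For readers who want the surrounding boilerplate, here is a small sketch built around that call. It assumes the usual bark entry points (SAMPLE_RATE, generate_audio, preload_models) and uses SciPy to write the file. The helpers speaker_prompt and speak are hypothetical names, and which preset sounds male or female has to be judged by listening to the examples.

```python
def speaker_prompt(index, lang="en", version="v2"):
    """Build a preset name such as 'v2/en_speaker_3'."""
    return f"{version}/{lang}_speaker_{index}"


def speak(text, index, out_path="bark_generation.wav"):
    """Generate `text` with the chosen preset voice and save it as a WAV."""
    # Lazy imports: bark and scipy are only needed when audio is generated.
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()  # loads (and on first use downloads) the models
    audio = generate_audio(text, history_prompt=speaker_prompt(index))
    write_wav(out_path, SAMPLE_RATE, audio)
    return out_path
```

Calling speak("Hello there.", 3) would then use the same preset as the line above.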

bmwas commented 1 year ago

@gkucsko Sorry, expanding a little on this question. I have a dialog text between two individuals and I want to use a male and a female voice. I selected Male: en_speaker_3 and Female: en_speaker_9 based on the history prompts provided by Suno. However, the generated audio completely disregards the female voice; all that comes out is a male voice. Am I doing anything wrong? Thank you! Here is my speaker lookup:

speaker_lookup = {"Sonia": "en_speaker_9","John": "en_speaker_6"}

gkucsko commented 1 year ago

In general, yeah, that can happen because the model is trained to just continue audio, meaning there can always be a new speaker rather than the same one continuing. That said, try the v2/ prompts; they should be a bit more stable for continuing the same voice.
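One way to apply this to a two-speaker dialog is sketched below. It switches the lookup to v2/ prompts, as suggested, and additionally generates each line in its own call with that speaker's history prompt before concatenating the results. The per-line generation is an assumption of this sketch, not something confirmed in the thread, and plan_dialog and render_dialog are hypothetical helpers.

```python
def plan_dialog(script, speaker_lookup):
    """Turn 'Name: text' lines into (history_prompt, text) pairs."""
    plan = []
    for raw_line in script.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines between turns
        name, sep, text = line.partition(":")
        name = name.strip()
        if not sep or name not in speaker_lookup:
            raise ValueError(f"cannot find a known speaker in: {line!r}")
        plan.append((speaker_lookup[name], text.strip()))
    return plan


def render_dialog(plan, generate=None):
    """Generate every turn separately and join the audio arrays."""
    import numpy as np

    if generate is None:
        # Lazy import so plan_dialog works without bark installed.
        from bark import generate_audio as generate

    pieces = [generate(text, history_prompt=prompt) for prompt, text in plan]
    return np.concatenate(pieces)
```

With speaker_lookup = {"Sonia": "v2/en_speaker_9", "John": "v2/en_speaker_6"}, plan_dialog pairs each turn with its speaker's prompt, so no single call has to carry two voices. Even so, a given turn can still drift to a different voice, as the comment above explains.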

langenhagen commented 1 year ago

Also, FYI, Bark appears to be content-sensitive. So when generating speaker voices myself, I give it a sample sentence that hints at who the speaker is. E.g., to get a female voice, let Bark say something like:

Hello, my name is Monica, the quick brown fox jumps over the lazy dog.

In order to get a male voice, I give bark:

Hello, my name is Peter, the quick brown ...

Works most of the time.