How to prevent multiple speakers

neonbjb / tortoise-tts

A multi-voice TTS system trained with an emphasis on quality

Apache License 2.0


How to prevent multiple speakers #555

Open nto4 opened 11 months ago

nto4 commented 11 months ago

Hi, I'm trying to read some articles using Tortoise TTS, but the model sometimes generates audio in which two different people are speaking. How can I prevent this? I want to force Tortoise to read all of the text in the same voice. Can anybody help me?

tts.tts_with_preset(parts[i], voice_samples=voice_samples, conditioning_latents=conditioning_latents, preset=preset)

manmay-nakhashi commented 11 months ago

@nto4 Are you using random voices?

nto4 commented 11 months ago

@manmay-nakhashi Hi, thanks for the response. No, I'm using a selected voice. I'm actually using a fork of this repo for a speed-up, https://github.com/manmay-nakhashi/tortoise-tts-fastest, and I'm using my own audio sample for voice cloning.

manmay-nakhashi commented 11 months ago

@nto4 I have seen that it sometimes changes the voice. I think you can reduce that effect with some tuning of the autoregressive temperature.
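A minimal sketch of what that tuning might look like, assuming your version of tts_with_preset forwards extra keyword arguments such as temperature to the underlying tts call (the upstream repository does this, as far as I know; the fork may differ). The helper name synthesize_stable and the value 0.5 are illustrative assumptions, not recommended settings. Lower values make the autoregressive sampling more conservative.

```python
def synthesize_stable(tts, text, voice_samples=None, conditioning_latents=None,
                      preset="fast", temperature=0.5):
    """Call tts_with_preset with a lowered autoregressive temperature.

    `tts` is expected to be a Tortoise TextToSpeech-like object.
    `synthesize_stable` is a hypothetical helper, not part of the library.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    return tts.tts_with_preset(
        text,
        voice_samples=voice_samples,
        conditioning_latents=conditioning_latents,
        preset=preset,
        # Assumption: this keyword overrides the preset's autoregressive
        # temperature (believed to default to 0.8 upstream).
        temperature=temperature,
    )
```

With a real model this would be called as synthesize_stable(tts, parts[i], voice_samples=voice_samples, conditioning_latents=conditioning_latents, preset=preset). Try a few values and listen to the results, since the comment above only suggests the effect can be reduced, not removed.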

nto4 commented 11 months ago

Thank you for the answer, @manmay-nakhashi. Can you provide an example? I'm very new to this field and just trying to learn.

nto4 commented 10 months ago

I'm still facing this problem: sometimes the model decides to use two speakers. How can I force it to use one speaker? (I'm using the tts_with_preset function with my own voice samples for voice cloning.) I would be grateful for any help. @manmay-nakhashi, do you have any advice?

manmay-nakhashi commented 10 months ago

@n8bot You can consider it one of the limitations of the model. You can reduce it to some extent, but it won't go away; fine-tuning further on cleaner data may reduce it further.

onlinerender commented 9 months ago

I am also facing the same problem. Can anyone help me? If I use a long text, there are points in the middle where a new voice is introduced.

fakerybakery commented 8 months ago

Hi @onlinerender, when using long text, the quality drops significantly. Please consider splitting up the text into chunks and generating those instead.
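The chunking suggestion could be sketched roughly as follows. This is an illustration under stated assumptions, not the library's own splitter: split_into_chunks and synthesize_chunks are hypothetical helpers, and the 250-character limit is an arbitrary starting point rather than a tested cutoff. Sentences are packed greedily, and a single sentence longer than the limit is kept whole. Each chunk is generated with the same conditioning latents so the voice reference stays fixed.

```python
import re


def split_into_chunks(text, max_chars=250):
    """Greedily pack whole sentences into chunks of roughly max_chars characters.

    A single sentence longer than max_chars is kept as its own chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def synthesize_chunks(tts, text, conditioning_latents, preset="fast", max_chars=250):
    """Generate one clip per chunk, reusing the same conditioning latents."""
    return [
        tts.tts_with_preset(part, conditioning_latents=conditioning_latents, preset=preset)
        for part in split_into_chunks(text, max_chars)
    ]
```

The resulting clips would still need to be concatenated afterwards. The upstream repository also appears to ship its own text splitter in tortoise.utils.text, which may be a better fit if it is available in your version.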

ranjana-creator commented 4 months ago

@fakerybakery Can you suggest a cutoff for the number of words in a chunk?