Closed Subarasheese closed 10 months ago
@Edresson
Hi @Subarasheese, thanks for reporting this bug. We plan to fix this issue soon. As a workaround, I noticed that adding a space between the word and the period fixes the issue.
Thank you. I have a question, out of curiosity: can the dataset used to train the Portuguese model be found online, or did Coqui use a private/internal dataset for Portuguese?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.
A similar error exists in other languages, such as French, Russian and Japanese. The problem appears in model xtts_v1.1, coqui 0.19.0, python 3.11.5.
@Edresson The workaround (space before dot) is not working on xtts v2. It is still saying "dot" (ponto). Previously the workaround worked every time, if I recall correctly.
We don't actually know why it happens. If anyone has any ideas, let us know
I experienced the same problem with xtts-v2 using German.
Are you guys sure there isn't an issue with the dataset? What were your sources?
I'm also getting 'ponto' when fine-tuning.
I used the example code and read the text from a file. I installed Coqui TTS yesterday, so it is still overwhelming right now. The sound file is attached. At one point you can hear: "Punkt dot". There are also often long gaps between sentences. Not sure if there is a connection to this issue?
# -*- coding: utf-8 -*-
import sys
from pathlib import Path
import torch
from TTS.api import TTS
f = open(sys.argv[1], 'rb').read()
f = f.decode('unicode_escape').encode('latin-1').decode('utf-8')
print (f)
file_output = sys.argv[2]
# Get device
device = "cuda" if torch.cuda.is_available() else "cpu"
# List available 🐸TTS models
#print(TTS().list_models())
# Init TTS
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)
# Run TTS
# ❗ Since this model is multi-lingual voice cloning model, we must set the target speaker_wav and language
# Text to speech list of amplitude values as output
wav = tts.tts(text= f, speaker_wav="Data/RefClips/4.wav", language="de")
# Text to speech to a file
tts.tts_to_file(text=f, speaker_wav="Data/RefClips/4.wav", language="de", file_path=file_output)
Temporarily it is possible to fix this problem by replacing dots "." with exclamations "!"
In general, using ".." instead of "." also works for Portuguese.
Italian has the same issue. Except for workarounds, did you find a stable fix?
".." method does not work. Neither "!".
Thanks
PS: for Italian, replacing "." with "\n" works.
This bug is still present, at least for Italian. Another workaround is to replace . with ;
We have the same issue in French.
In Czech (xtts_v2 model) try replacing "." with ";\n" - this will make the ends of sentences sound more natural.
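The substitutions suggested in this thread (";", ";\n", "!", a space before the period) can all be applied with one small preprocessing step before the text is passed to `tts.tts_to_file`. The following is a minimal sketch, not part of Coqui TTS: `replace_sentence_periods` is a hypothetical helper name, and the rule that a period is sentence-final only when followed by whitespace or the end of the text is an assumption made so that decimals like "3.14" are left untouched.

```python
import re

# Assumption: a period ends a sentence only when followed by whitespace
# or the end of the text, so decimals such as "3.14" are not touched.
_SENTENCE_END = re.compile(r"\.(?=\s|$)")


def replace_sentence_periods(text: str, substitute: str = ";") -> str:
    """Replace sentence-final periods with `substitute` (hypothetical helper)."""
    # A callable replacement inserts `substitute` literally, so backslashes
    # in it are never interpreted as regex escapes.
    return _SENTENCE_END.sub(lambda _match: substitute, text)


if __name__ == "__main__":
    sample = "Olá, sou seu novo clone de voz. Faça o possível."
    print(replace_sentence_periods(sample, ";"))
```

Passing `" ."` as the substitute mirrors the original space-before-period workaround, and `";\n"` mirrors the Czech suggestion. Which substitute helps, if any, appears to depend on the language and model version, as the reports above show.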
Does anyone have a solution to the problem?
Solution: replacing the full stops (.) in the text with "|" works for Portuguese, and it also adds a pause after each sentence ends. Using a space instead of a full stop doesn't add a pause. However, text with "|" instead of full stops won't work for longer inputs, so use a shorter text prompt of fewer than 400 tokens with "|".
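To stay under the length limit mentioned above, longer text can be split into several prompts, each with "|" in place of the full stops. This is a rough sketch under stated assumptions: `chunk_with_pipes` is a hypothetical helper, and it counts whitespace-separated words as a stand-in for tokens, which is not how the XTTS tokenizer counts, so `max_units` should be set conservatively.

```python
import re


def chunk_with_pipes(text: str, max_units: int = 400) -> list[str]:
    """Split text into chunks and end each sentence with "|" (hypothetical helper).

    Assumption: length is measured in whitespace-separated words, only a
    rough proxy for the model's real token count.
    """
    # Split on periods that are followed by whitespace or the end of the text.
    sentences = [s.strip() for s in re.split(r"\.(?:\s+|$)", text) if s.strip()]

    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for sentence in sentences:
        size = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and count + size > max_units:
            chunks.append("| ".join(current) + "|")
            current, count = [], 0
        current.append(sentence)
        count += size
    if current:
        chunks.append("| ".join(current) + "|")
    return chunks


if __name__ == "__main__":
    for chunk in chunk_with_pipes("Olá mundo. Tudo bem. Até logo.", max_units=4):
        print(chunk)
```

Each returned chunk would then be synthesized separately and the audio concatenated. A single sentence longer than the budget is kept whole rather than cut mid-sentence.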
Describe the bug
Hello,
It seems a bit of an "oopsie" was made when handling the Portuguese dataset, as the PT-BR model now pronounces the "." character as "ponto" every time we insert sentences like:
"Olá, sou seu novo clone de voz. Faça o possível para carregar um áudio de qualidade."
Here is the output: https://vocaroo.com/1404xnr0Vkmc
It was not supposed to say "ponto"...
It goes like:
"Olá, sou seu novo clone de voz ponto Faça o possível para carregar um áudio de qualidade ponto"
But it should not be like that.
To Reproduce
Set the client to Portuguese (pt), then type anything including "." (dot).
Expected behavior
The dot should not be pronounced. The purpose of "." is to indicate the end of a declarative sentence or to separate certain elements in written text.
Logs
Environment
Additional context
No response