Elleo / pied

Pied makes it simple to install and manage text-to-speech Piper voices for use with Speech Dispatcher.
https://pied.mikeasoft.com
GNU General Public License v3.0

Long pauses between sentences #9

Open tarsobcaldas opened 7 months ago

tarsobcaldas commented 7 months ago

I'm not sure whether this is a Speech Dispatcher problem or a configuration issue. When I run the following command, I get fairly short pauses between sentences, but when I speak the same text with spd-say, the pauses become very long and disrupt the flow of the reading.

echo "Then I realized that far from being lost, the details of these beers \
had been carefully stored in archives and brewery store rooms across Britain. \
Discovering the secrets of these lost beers was a possibility. All that was \
required was a bit of effort and determination."  | \
piper -m /home/noaxp/.var/app/com.mikeasoft.pied/data/pied/models/en_US-lessac-high.onnx \
--output_raw  | aplay -r 22050 -f S16_LE -t raw -

I've tried adding the --sentence_silence flag to the piper.conf configuration file, but it has no effect at all: the pauses become neither shorter nor longer.

I haven't yet checked how it behaves with other output modules.

KAGEYAM4 commented 6 months ago

@tarsobcaldas Did you find a solution? I'm having the same problem.

tarsobcaldas commented 6 months ago

Not yet, unfortunately

KAGEYAM4 commented 6 months ago

@tarsobcaldas I found a solution that works for me.

Source: https://github.com/ken107/read-aloud/issues/375#issuecomment-1937517761

This is my config, for reference:

piper.conf

DefaultVoice "en/en_GB/alan/medium/en_GB-alan-medium.onnx"

# Specifying a rarely used symbol & big limit so that speech-dispatcher doesn't cut text into chunks:
GenericDelimiters "˨"
GenericMaxChunkLength 1000000

# These lines are important to specify for every language you'll use, otherwise some characters will not work:
GenericLanguage "en" "en-us" "utf-8"
#GenericLanguage "en" "en-gb" "utf-8"
#GenericLanguage "ru" "ru" "utf-8"

GenericCmdDependency "sox"
GenericCmdDependency "aplay"

GenericExecuteSynth \
"echo '$DATA' | /usr/bin/piper-tts --model '/usr/share/piper-voices/$VOICE' --output_raw | sox -r 22050 -c 1 -b 16 -e signed-integer -t raw - -t wav - tempo $RATE pitch $PITCH norm | aplay -r 22050 -f S16_LE -t raw -"

GenericRateAdd 1
GenericPitchAdd 1
GenericVolumeAdd 1
GenericRateMultiply 1
GenericPitchMultiply 1000

# Adding all voices we want:
#AddVoice "en" "FEMALE1" "en/en_GB/jenny_dioco/medium/en_GB-jenny_dioco-medium.onnx"
#AddVoice "en" "MALE1" "en/en_GB/alan/medium/en_GB-alan-medium.onnx"
#AddVoice "en" "FEMALE1" "en/en_GB/semaine/medium/en_GB-semaine-medium.onnx"
#AddVoice "en" "FEMALE1" "en/en_US/hfc_female/medium/en_US-hfc_female-medium.onnx"
#AddVoice "en" "FEMALE1" "en/en_GB/alba/medium/en_GB-alba-medium.onnx"
#AddVoice "en" "FEMALE1" "en/en_US/amy/medium/en_US-amy-medium.onnx"
#AddVoice "ru" "MALE1" "ru/ru_RU/dmitri/medium/ru_RU-dmitri-medium.onnx"

AddVoice "en" "MALE1" "en/en_US/ryan/high/en_US-ryan-high.onnx"

speechd.conf

AddModule "piper" "sd_generic" "piper.conf"
DefaultModule piper
LanguageDefaultModule "en" "piper"

tarsobcaldas commented 5 months ago

Yes, it seems that adding these lines solves the problem:

GenericDelimiters "˨"
GenericMaxChunkLength 1000000
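
For anyone applying this by hand, here is a minimal sketch of a helper that appends the two directives only when an uncommented copy is not already present. The function name `ensure_chunk_settings` is hypothetical and not part of Pied or Speech Dispatcher, and the path in the usage line is an assumption, so adjust it to wherever your piper.conf actually lives.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Pied or Speech Dispatcher.
# Appends the two chunking directives to a piper.conf unless an
# uncommented copy of each one is already present in the file.
ensure_chunk_settings() {
    conf="$1"
    [ -f "$conf" ] || { echo "no such file: $conf" >&2; return 1; }

    # A rarely used symbol as the delimiter, so text is not split at sentence ends.
    grep -q '^[[:space:]]*GenericDelimiters' "$conf" ||
        printf '\nGenericDelimiters "˨"\n' >> "$conf"

    # A very large limit, so long passages are not cut into chunks.
    grep -q '^[[:space:]]*GenericMaxChunkLength' "$conf" ||
        printf '\nGenericMaxChunkLength 1000000\n' >> "$conf"
}
```

Usage, assuming the module config sits in the per-user Speech Dispatcher directory: `ensure_chunk_settings ~/.config/speech-dispatcher/modules/piper.conf`. If the file is regenerated by Pied, the change may be overwritten and need reapplying.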

KAGEYAM4 commented 5 months ago

Even with the above, I was still getting a delay whenever a new paragraph started. I switched to this: https://github.com/brailcom/speechd/issues/866#issuecomment-1869106771. Make sure you are using a medium model for it.

mak448a commented 4 months ago

@tarsobcaldas Could you reopen this issue? The solution was only a workaround. The file in there says "GENERATED BY PIED," which means that it can probably be fixed on pied's side.