Closed mha1 closed 1 year ago
en: about 0.15s silence after spoken text
de: about 0.8s silence after spoken text - this makes German SF play value veeerrrryyy sloooowwww
Playing a value of 23.7 Volt takes about 2-3 seconds for en, but about 6-7 seconds for de. Please adjust the trailing silence accordingly.
Here's a comparison of en/0023.wav vs de/0023.wav:
Are these the release files or the repo files? The repo files are the raw source files from Azure TTS; they're trimmed as part of the release step in release.sh.
Good point - these are the repo files.
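For anyone wanting to check whether a given set of files still carries the long tail, a rough sketch for measuring trailing silence is below. It is not taken from release.sh; the function name `trailing_silence_seconds` and the amplitude threshold of 500 are assumptions for illustration, and it assumes the files are 16-bit PCM WAV.

```python
import array
import sys
import wave


def trailing_silence_seconds(path, threshold=500):
    """Return the length of near-silence at the end of a 16-bit PCM WAV, in seconds.

    `threshold` is an assumed amplitude (out of 32767) below which a sample
    counts as silent; tune it for the recordings being checked.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        samples = array.array("h", wav.readframes(wav.getnframes()))

    # WAV data is little-endian; array("h") uses the machine's byte order.
    if sys.byteorder == "big":
        samples.byteswap()

    total_frames = len(samples) // channels

    # Walk backwards to the last sample louder than the threshold.
    last_loud_frame = 0
    for i in range(len(samples) - 1, -1, -1):
        if abs(samples[i]) > threshold:
            last_loud_frame = i // channels + 1
            break

    return (total_frames - last_loud_frame) / rate
```

Calling `trailing_silence_seconds("en/0023.wav")` and `trailing_silence_seconds("de/0023.wav")` on both the repo files and the release files should show whether the release step already brings the two languages in line.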