eheikes / tts

Tools to convert text to speech :books::speech_balloon:
Apache License 2.0

Language option seems to be ineffective with Polly #59

Closed ThiemoCh closed 4 years ago

ThiemoCh commented 4 years ago
  1. What is the exact command you are running? For your security, please "X" out any AWS access keys and secrets.

echo "Wie geht es dir?" | tts tests.mp3 --region eu-central-1 --language de-DE

The voice keeps a strong English accent regardless of which language option I choose. It sounds very different from the language samples on the AWS Polly web page.

  2. What result are you seeing in the console? Copy & paste the exact output you get, with debugging turned on (see the Troubleshooting section for how to enable debugging).

C:\Users\Tim\Desktop\tts-master>echo "Wie geht es dir?" | tts tests.mp3 --region eu-central-1 --language de-DE
tts-cli called with arguments {"_":["tests.mp3"],"region":"eu-central-1","language":"de-DE"} +0ms
tts-cli input: null +4ms
tts-cli output: tests.mp3 +2ms
readText Reading from stdin +0ms
readText Finished reading (21 bytes) +15ms
chunkText Chunked into 1 text parts +0ms
splitText Stripping whitespace +0ms
generateSpeech Options: {"ffmpeg":"ffmpeg","format":"mp3","language":"de-DE","limit":5,"region":"eu-central-1","type":"text","voice":"Joanna"} +0ms
create Creating AWS Polly instance in eu-central-1 +0ms
generateAll Requesting 1 audio segments, 5 at a time +0ms
generate Opening output stream to C:\Users\Tim\AppData\Local\Temp\4b15cfa8-0852-445a-96ea-2823a2a0a301.mp3 +0ms
generate Making request to https://polly.eu-central-1.amazonaws.com/v1/speech?LanguageCode=de-DE&OutputFormat=mp3&Text=%22Wie%20g...&TextType=text&VoiceId=Joanna&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=...&X-Amz-Date=20200501T132156Z&X-Amz-Expires=1800&X-Amz-Signature=8...&X-Amz-SignedHeaders=host +0ms
generate Closing output stream +0ms
generateAll Requested all parts, with error null +0ms
createManifest Creating C:\Users\Tim\AppData\Local\Temp\f10afdc3-b7f5-4d22-ac47-3fafe240fbb4.txt for manifest +0ms
createManifest Writing manifest contents:
createManifest file 'C:\Users\Tim\AppData\Local\Temp\4b15cfa8-0852-445a-96ea-2823a2a0a301.mp3' +0ms
combine Combining files into C:\Users\Tim\AppData\Local\Temp\a30bfdd2-a32c-4533-b1f5-88a5e177813b.mp3 +0ms
combineEncodedAudio Running ffmpeg -f concat -safe 0 -i C:\Users\Tim\AppData\Local\Temp\f10afdc3-b7f5-4d22-ac47-3fafe240fbb4.txt -c copy C:\Users\Tim\AppData\Local\Temp\a30bfdd2-a32c-4533-b1f5-88a5e177813b.mp3 +0ms
combineEncodedAudio
combineEncodedAudio ffmpeg version git-2020-05-01-39fb1e9 Copyright (c) 2000-2020 the FFmpeg developers
combineEncodedAudio built with gcc 9.3.1 (GCC) 20200328
combineEncodedAudio configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libsrt --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --disable-w32threads --enable-libmfx --enable-ffnvcodec --enable-cuda-llvm --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt --enable-amf
combineEncodedAudio
combineEncodedAudio libavutil 56. 43.100 / 56. 43.100
combineEncodedAudio libavcodec 58. 82.100 / 58. 82.100
combineEncodedAudio libavformat 58. 42.101 / 58. 42.101
combineEncodedAudio libavdevice 58. 9.103 / 58. 9.103
combineEncodedAudio libavfilter 7. 80.100 / 7. 80.100
combineEncodedAudio libswscale 5. 6.101 / 5. 6.101
combineEncodedAudio libswresample 3. 6.100 / 3. 6.100
combineEncodedAudio libpostproc 55. 6.100 / 55. 6.100
combineEncodedAudio
combineEncodedAudio [mp3 @ 000001cd6adc5900] Estimating duration from bitrate, this may be inaccurate
combineEncodedAudio
combineEncodedAudio Input #0, concat, from 'C:\Users\Tim\AppData\Local\Temp\f10afdc3-b7f5-4d22-ac47-3fafe240fbb4.txt':
combineEncodedAudio Duration: N/A, start: 0.000000, bitrate: 48 kb/s
combineEncodedAudio
combineEncodedAudio Stream #0:0: Audio: mp3, 22050 Hz, mono, fltp, 48 kb/s
combineEncodedAudio
combineEncodedAudio Output #0, mp3, to 'C:\Users\Tim\AppData\Local\Temp\a30bfdd2-a32c-4533-b1f5-88a5e177813b.mp3':
combineEncodedAudio Metadata:
combineEncodedAudio TSSE : Lavf58.42.101
combineEncodedAudio
combineEncodedAudio Stream #0:0: Audio: mp3, 22050 Hz, mono, fltp, 48 kb/s
combineEncodedAudio Stream mapping:
combineEncodedAudio Stream #0:0 -> #0:0 (copy)
combineEncodedAudio Press [q] to stop, [?] for help
combineEncodedAudio
combineEncodedAudio size= 6kB time=00:00:01.01 bitrate= 51.0kbits/s speed=1.8e+03x
combineEncodedAudio video:0kB audio:6kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 3.620992%
combineEncodedAudio +0ms
combineEncodedAudio ffmpeg process completed with code 0 +0ms
cleanup Manifest is file 'C:\Users\Tim\AppData\Local\Temp\4b15cfa8-0852-445a-96ea-2823a2a0a301.mp3' +0ms
cleanup Deleting temporary file C:\Users\Tim\AppData\Local\Temp\4b15cfa8-0852-445a-96ea-2823a2a0a301.mp3 +1ms
cleanup Deleting manifest file C:\Users\Tim\AppData\Local\Temp\f10afdc3-b7f5-4d22-ac47-3fafe240fbb4.txt +5ms
moveTempFile copying C:\Users\Tim\AppData\Local\Temp\a30bfdd2-a32c-4533-b1f5-88a5e177813b.mp3 to tests.mp3 +0ms

  3. If copyright allows, please upload your input file somewhere (e.g. pastebin) and put a link to it here.

  4. What OS are you using (Windows, OSX, Linux) and what version?

Windows 10 64bit 18363

  5. What version of Node.js is being used? (Run node -v in the console to find out.)

v12.16.3

  6. What version of ffmpeg is being used? (Run ffmpeg -version in the console to find out.)

git-2020-05-01-39fb1e9

sonodave commented 4 years ago

It looks like you need to specify one of the German voice names in your command, for example:

--voice Hans or --voice Vicki or --voice Marlene

You have to request a specific voice.

ThiemoCh commented 4 years ago

That was it! Simple, but effective. Thank you! :)
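
For anyone landing here later: the debug output shows the request going out with LanguageCode=de-DE but VoiceId=Joanna, the default voice, which is a US English voice. A rough sketch of the fix follows, assuming a POSIX shell rather than the Windows console used in the report. The `voice_for_language` helper is hypothetical and not part of tts-cli; it only pairs the language with one of the German voices named in this thread, and the script prints the resulting command instead of running it, since a real run needs AWS credentials.

```shell
#!/bin/sh
# Hypothetical helper (NOT part of tts-cli): choose a voice that matches
# the language code, so --language is always paired with a fitting --voice.
voice_for_language() {
  case "$1" in
    de-DE) echo "Vicki" ;;   # Hans and Marlene were also suggested in this thread
    *)     echo "Joanna" ;;  # the default voice seen in the debug log
  esac
}

lang="de-DE"
voice="$(voice_for_language "$lang")"

# Print the corrected command rather than executing it.
echo "echo \"Wie geht es dir?\" | tts tests.mp3 --region eu-central-1 --language $lang --voice $voice"
```

Run by hand, the printed command is the original one with `--voice Vicki` appended.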