synesthesiam / docker-mozillatts

Docker image for Mozilla TTS server
MIT License

Exception when text contains empty lines. #18

Open ronnystandtke opened 3 years ago

ronnystandtke commented 3 years ago

First let me thank you for your work. Absolutely amazing!

I noticed that the TTS system fails when the input text contains empty lines. Here is the error message:

[ERROR] Exception on /api/tts [GET]
Traceback (most recent call last):
  File "/app/lib/python3.7/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/app/lib/python3.7/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/app/lib/python3.7/site-packages/flask_cors/extension.py", line 165, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/app/lib/python3.7/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/app/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/app/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/app/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/app/tts_web/main.py", line 139, in api_tts
    wav_bytes = text_to_wav(text)
  File "/app/tts_web/main.py", line 73, in text_to_wav
    line_wav_bytes = synthesizer.synthesize(line)
  File "/app/tts_web/synthesize.py", line 324, in synthesize
    scale_factors=self.scale_factors,
  File "/app/tts_web/synthesize.py", line 53, in tts
    do_trim_silence=False,
  File "/app/lib/python3.7/site-packages/TTS-0.0.6+9e3b052-py3.7-linux-x86_64.egg/TTS/tts/utils/synthesis.py", line 244, in synthesis
    model, inputs, CONFIG, truncated, speaker_id, style_mel, speaker_embeddings=speaker_embedding)
  File "/app/lib/python3.7/site-packages/TTS-0.0.6+9e3b052-py3.7-linux-x86_64.egg/TTS/tts/utils/synthesis.py", line 62, in run_model_torch
    inputs, speaker_ids=speaker_id, speaker_embeddings=speaker_embeddings)
  File "/app/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "/app/lib/python3.7/site-packages/TTS-0.0.6+9e3b052-py3.7-linux-x86_64.egg/TTS/tts/models/tacotron2.py", line 145, in inference
    encoder_outputs = self.encoder.inference(embedded_inputs)
  File "/app/lib/python3.7/site-packages/TTS-0.0.6+9e3b052-py3.7-linux-x86_64.egg/TTS/tts/layers/tacotron2.py", line 115, in inference
    o = layer(o)
  File "/app/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/app/lib/python3.7/site-packages/TTS-0.0.6+9e3b052-py3.7-linux-x86_64.egg/TTS/tts/layers/tacotron2.py", line 40, in forward
    o = self.convolution1d(x)
  File "/app/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/app/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 257, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: Calculated padded input size per channel: (4). Kernel size: (5). Kernel size can't be greater than actual input size
[INFO] 172.17.0.1 - - [07/Mar/2021 15:33:13] "GET /api/tts?text=This%20is%20a%20test.%0A%0AThis%20also. HTTP/1.1" 500 -

synesthesiam commented 3 years ago

I've uploaded a fix in master, but haven't re-built the Docker images yet.
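The fix itself is not shown in this thread. For anyone still on an older image, here is a minimal sketch of one possible workaround, assuming (as the traceback suggests) that the text is split into lines and each line is passed to the synthesizer separately. The helper names `split_nonblank_lines` and `synthesize_lines` are hypothetical, and the `synthesize` argument stands in for whatever callable actually produces the audio. This is not the code from master.

```python
from typing import Callable, List


def split_nonblank_lines(text: str) -> List[str]:
    """Split text into lines, dropping empty and whitespace-only lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def synthesize_lines(text: str, synthesize: Callable[[str], bytes]) -> List[bytes]:
    """Call `synthesize` once per non-blank line.

    `synthesize` is a hypothetical stand-in for the real synthesizer call;
    blank lines never reach it, so the model is not given an empty input.
    """
    return [synthesize(line) for line in split_nonblank_lines(text)]


if __name__ == "__main__":
    # Dummy synthesizer for demonstration only: returns the line as bytes.
    def fake_synthesize(line: str) -> bytes:
        return line.encode("utf-8")

    chunks = synthesize_lines("This is a test.\n\nThis also.", fake_synthesize)
    print(len(chunks))
```

The same filtering can also be done on the client side before sending the request, for example by removing blank lines from the text before it is URL-encoded.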