matatonic / openedai-speech

An OpenAI API compatible text to speech server using Coqui AI's xtts_v2 and/or piper tts as the backend.
GNU Affero General Public License v3.0

Piper voices not working. #26

Closed. dotslashsuperstar closed this issue 3 weeks ago

dotslashsuperstar commented 3 weeks ago

I'm getting an error when trying to use a custom voice with Piper. I'm using Docker; custom and default Coqui voices work, and the default Piper voices work. The voice files are downloaded as well. I tried a couple of different voices, such as en_GB alba and southern_english_female. I tried both with my legacy NVIDIA driver (which raises a CUDA warning) and without a GPU. Let me know if you need more info.

Here's an excerpt from voice_to_speaker.yaml:

  shimmer:
    model: voices/en_US-libritts_r-medium.onnx
    speaker: 163
  alba:
    model: voices/en_GB-alba-medium.onnx
    speaker: 1 # default speaker
  southern:
    model: voices/en_GB-southern_english_female-low.onnx
    speaker: 1 # default speaker
  kristin:
    model: voices/en_US-kristin-medium.onnx
    speaker: # default speaker

Here's the error from the Docker log:

onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Invalid input name: sid
INFO: 172.18.3.1:46844 - "POST /v1/audio/speech HTTP/1.1" 200 OK
Traceback (most recent call last):
  File "/usr/local/bin/piper", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.11/site-packages/piper/main.py", line 126, in main
    for audio_bytes in audio_stream:
  File "/usr/local/lib/python3.11/site-packages/piper/voice.py", line 123, in synthesize_stream_raw
    yield self.synthesize_ids_to_raw(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/piper/voice.py", line 166, in synthesize_ids_to_raw
    audio = self.session.run(
            ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 220, in run
    return self._sess.run(output_names, input_feed, run_options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

matatonic commented 3 weeks ago

Hrm... Confirmed, I'm seeing the same behavior. Thanks for the report!

matatonic commented 3 weeks ago

For a model without multiple speakers (like alba) the entry should be like this:

  alba:
    model: voices/en_GB-alba-medium.onnx
    speaker:

not speaker: 1

When using a model that has only a single default speaker, leave speaker blank.
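
One way to catch this kind of entry before a request fails is to compare it against the voice's own config. The sketch below is not part of openedai-speech. It assumes each Piper model has a JSON config next to it (the model path plus `.json`) containing a `num_speakers` field; `check_speaker` and `num_speakers_for` are hypothetical helper names used only for illustration.

```python
import json
from pathlib import Path
from typing import Optional


def check_speaker(num_speakers: int, speaker: Optional[int]) -> Optional[str]:
    """Return a description of the problem, or None if the entry looks fine."""
    if num_speakers <= 1:
        # A single-speaker model should not be given a speaker id at all.
        if speaker is not None:
            return "single-speaker model: leave 'speaker' blank"
        return None
    # Multi-speaker model: a given id must fall inside the valid range.
    if speaker is not None and not 0 <= speaker < num_speakers:
        return f"speaker must be between 0 and {num_speakers - 1}"
    return None


def num_speakers_for(model_path: str) -> int:
    """Read num_speakers from the JSON config assumed to sit beside the model."""
    config_path = Path(model_path + ".json")  # assumed naming convention
    config = json.loads(config_path.read_text(encoding="utf-8"))
    return int(config.get("num_speakers", 1))
```

With these helpers, `check_speaker(1, 1)` returns a message for an entry like the original alba one, while `check_speaker(1, None)` returns `None` for the corrected entry with `speaker` left blank.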

dotslashsuperstar commented 3 weeks ago

Ahh, it works. Sorry about that. I thought I did that first but it works.