rhasspy / piper

A fast, local neural text to speech system
https://rhasspy.github.io/piper-samples/
MIT License

Garbled / Static-y output on windows only when output_file is stdout #607

Open RickeyWard opened 2 months ago

RickeyWard commented 2 months ago

On Windows with the latest release version:

Outputting a wav file directly sounds great!

echo "hello, this is a test" | ./piper.exe --model en_US-amy-medium.onnx --output_file hello.wav

Outputting raw audio and then playing it with ffplay sounds great!

echo "hello, this is a test" | ./piper.exe --model en_US-amy-medium.onnx --output_raw -o - |  ffplay -nodisp -autoexit -f s16le -ar 22050 -ac 1 -

But if you send the WAV file to stdout, the result is recognizable but extremely static-y. The RIFF headers are fine; the audio data is just busted.

echo "hello, this is a test" | ./piper.exe --model en_US-amy-medium.onnx --output_file - > hello.wav

Trying this with ffplay yields the same awful result:

echo "hello, this is a test"| ./piper --model ./en_US-amy-medium.onnx --output_file - | ffplay -nodisp -autoexit -

I thought maybe this was an issue with the shell doing something weird, but piping the raw output into a file and playing it with ffplay sounds great, while piping the WAV output into a file sounds bad.
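One possible explanation, which is an assumption and not something confirmed in this report, is that stdout is left in Windows text mode on the WAV path, so every 0x0A byte in the audio is written as 0x0D 0x0A. That would leave the RIFF header intact while shifting and corrupting the 16-bit samples. It would only fit the symptoms if the raw path configures stdout differently, which has not been checked here. The sketch below simulates that translation on a small in-memory WAV and provides a check for whether a "bad" file is exactly the newline-translated version of a "good" one. All function names are hypothetical helpers written for this illustration.

```python
import io
import struct
import wave


def make_wav_bytes(samples, rate=22050):
    """Build a mono 16-bit PCM WAV in memory from a list of integer samples."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 16-bit samples, matching s16le in the ffplay command
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()


def simulate_text_mode(data):
    """Mimic a text-mode stream on Windows: each LF byte becomes CR LF."""
    return data.replace(b"\n", b"\r\n")


def looks_newline_translated(good, bad):
    """True if `bad` is exactly `good` with every LF expanded to CR LF."""
    return bad != good and simulate_text_mode(good) == bad


# Samples chosen so that some bytes equal 0x0A (10 -> 0A 00, 2560 -> 00 0A).
good = make_wav_bytes([10, 300, 2560, -5])
bad = simulate_text_mode(good)

print("good size:", len(good))
print("bad size: ", len(bad))
print("extra bytes:", len(bad) - len(good))
print("newline-translated:", looks_newline_translated(good, bad))
```

To apply the same check to real output, read the WAV written directly with --output_file hello.wav and the one captured from stdout as bytes (open them with mode "rb") and pass both to looks_newline_translated. If it returns False, the text-mode guess does not explain the corruption, at least not on its own.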

I wanted to attach a sample, but GitHub doesn't allow it.