rhasspy / piper

A fast, local neural text to speech system
https://rhasspy.github.io/piper-samples/
MIT License

Streaming Audio on Windows #240

Closed. Sascha353 closed this issue 11 months ago

Sascha353 commented 1 year ago

I am trying to get the raw PCM audio stream to work on Windows but haven't succeeded so far. As far as I can tell, VLC should be able to play the stream with these parameters:

echo "This sentence is spoken first. This sentence is synthesized while the first sentence is spoken." | piper --model en_GB-jenny_dioco-medium.onnx --output-raw | "C:\Program Files\VideoLAN\VLC\vlc.exe" --demux=rawaud --rawaud-channels=1 --rawaud-samplerate=22050 --rawaud-fourcc=s16l -

or this without the GUI:

echo "This sentence is spoken first. This sentence is synthesized while the first sentence is spoken." | piper --model en_GB-jenny_dioco-medium.onnx --output-raw | "C:\Program Files\VideoLAN\VLC\vlc.exe" -I dummy --demux=rawaud --rawaud-channels=1 --rawaud-samplerate=22050 --rawaud-fourcc=s16l -

Unfortunately, it's not working as expected. I can hear a faint voice during playback, but it's mostly white noise and crackling.

Any advice for VLC or any other option on Windows for audio streaming?

Burnarz commented 1 year ago

This is apparently a known issue on Windows, see [/issues/213]. You either need to wait for a new Windows release or try to compile with the modifications ^^ If you try, let me know, I'm interested :D @synesthesiam If you compile a new Windows build, I would be glad to beta test it :D

synesthesiam commented 11 months ago

Please try the Windows version of the latest release: https://github.com/rhasspy/piper/releases/tag/2023.11.14-2 This includes (hopefully) a fix for Windows streaming.

Sascha353 commented 11 months ago

It works, great! Thank you!

jame25 commented 11 months ago

Just to add to this topic: sox.exe (Sound eXchange) can also be used to stream audio on Windows.

echo "This sentence is spoken first. This sentence is synthesized while the first sentence is spoken." | piper --model en_GB-jenny_dioco-medium.onnx --output-raw | sox.exe -t raw -b 16 -e signed-integer -r 22050 -c 1 - -t waveaudio pad 0 0.010 >NUL 2>&1
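
For anyone who wants to check whether the bytes coming out of `--output-raw` are intact before involving a player, here is a minimal Python sketch that wraps the raw PCM in a WAV header using only the standard library. It assumes the same format as the VLC and sox flags above (22050 Hz, mono, signed 16-bit little-endian). The function names `pcm_to_wav_bytes` and `convert_stream` are made up for this example and are not part of piper. Note that this buffers the whole output and does not stream, so it is a debugging aid rather than a replacement for the commands above.

```python
import io
import wave

# Format assumed from the VLC/sox flags in this thread:
# 22050 Hz, 1 channel, signed 16-bit samples.
SAMPLE_RATE = 22050
CHANNELS = 1
SAMPLE_WIDTH = 2  # bytes per sample (16-bit)


def pcm_to_wav_bytes(pcm, sample_rate=SAMPLE_RATE, channels=CHANNELS,
                     sample_width=SAMPLE_WIDTH):
    """Wrap headerless PCM bytes in a WAV container and return the file bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    # wave leaves a caller-supplied buffer open, so it can still be read here.
    return buf.getvalue()


def convert_stream(src, dst):
    """Read all raw PCM from binary stream `src` and write a WAV file to `dst`.

    Both arguments must be binary file objects (hypothetical helper).
    Returns the number of PCM bytes read.
    """
    pcm = src.read()
    dst.write(pcm_to_wav_bytes(pcm))
    return len(pcm)
```

To use it in a pipe, call `convert_stream(sys.stdin.buffer, open("out.wav", "wb"))` from a small script. `sys.stdin.buffer` is the binary view of standard input, so the bytes are passed through unmodified. If the resulting WAV plays cleanly in any player, the raw stream itself is fine and the problem lies in the player options.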