synesthesiam / voice2json

Command-line tools for speech and intent recognition on Linux
MIT License
1.09k stars, 63 forks

transcribe-stream -a not working from input file / stdin #23

Closed (lukifer closed this issue 4 years ago)

lukifer commented 4 years ago

Running the following results in a no-op on both 2.0 and latest:

voice2json transcribe-stream -a etc/test/what_time_is_it.wav --wav-sink streamtest.wav --event-sink streamtest.log

The resulting wav-sink is hiccup-y noise, and the event sink is:

{"type": "speech", "time": 0.06}
{"type": "silence", "time": 0.24}
{"type": "speech", "time": 1.4400000000000008}
{"type": "silence", "time": 1.620000000000001}
{"type": "speech", "time": 8.459999999999981}
{"type": "silence", "time": 8.639999999999983}
{"type": "speech", "time": 8.759999999999984}
{"type": "started", "time": 9.059999999999986}
{"type": "silence", "time": 10.439999999999998}
{"type": "stopped", "time": 11.760000000000009}
{"type": "speech", "time": 0.18}
{"type": "started", "time": 0.48}
{"type": "silence", "time": 0.54}
{"type": "speech", "time": 1.0200000000000005}
{"type": "silence", "time": 4.859999999999998}
{"type": "stopped", "time": 5.459999999999994}
{"type": "speech", "time": 0.54}
{"type": "started", "time": 0.8400000000000003}
{"type": "silence", "time": 1.560000000000001}
{"type": "stopped", "time": 3.5400000000000027}
{"type": "speech", "time": 4.56}
{"type": "silence", "time": 4.859999999999998}
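The `time` field in the log above goes backwards twice, which makes the sequence hard to follow by eye. As a rough debugging aid, the sketch below splits an event-sink file into segments wherever the timestamp decreases and rounds away the floating-point noise. The function names `split_runs` and `describe` are made up for this illustration and are not part of voice2json, and the thread does not say whether each segment corresponds to a separate invocation.

```python
import json


def split_runs(lines):
    """Group JSON-lines events into segments.

    A new segment starts whenever an event's "time" is lower than the
    previous event's "time". Blank lines are skipped.
    """
    runs, current, last_time = [], [], None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if last_time is not None and event["time"] < last_time:
            runs.append(current)
            current = []
        current.append(event)
        last_time = event["time"]
    if current:
        runs.append(current)
    return runs


def describe(run):
    """Return one short string per event, with the time rounded to 2 decimals."""
    return [f"{event['time']:.2f} {event['type']}" for event in run]


if __name__ == "__main__":
    # streamtest.log is the --event-sink file from the command above.
    with open("streamtest.log", encoding="utf-8") as log:
        for index, run in enumerate(split_runs(log), start=1):
            print(f"segment {index}:")
            for entry in describe(run):
                print("  " + entry)
```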

Thanks again for your hard work on voice2json! 🙂

synesthesiam commented 4 years ago

You're welcome :)

Everything is working; the example WAV file just has the wrong sample rate (48 kHz). There also needs to be a bit of silence at the end for the state machine to work.

Here's an example using sox to do both the conversion to 16 kHz and the silence padding at the end. Hope this works for you!

$ sox etc/test/what_time_is_it.wav -r 16000 -e signed-integer -c 1 -t raw - pad 0 1 | \
    voice2json transcribe-stream -a - --wav-sink streamtest.wav --event-sink streamtest.log
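
For anyone without sox at hand, here is a minimal Python sketch of the two checks discussed above: whether a WAV header matches the target format, and how to append trailing silence. The 16000 Hz and mono values come from the sox command; the 16-bit sample width is an assumption, and the function names `check_format` and `pad_with_silence` are invented for this illustration. The sketch does not resample, so a 48 kHz file still needs sox or a similar tool for the rate conversion.

```python
import wave

TARGET_RATE = 16000   # matches "-r 16000" in the sox command
TARGET_CHANNELS = 1   # matches "-c 1"
TARGET_WIDTH = 2      # bytes per sample; 16-bit is an assumption


def check_format(wav_file):
    """Return a list of mismatches between the WAV header and the target format.

    wav_file may be a path or a file-like object opened in binary mode.
    An empty list means the header matches.
    """
    problems = []
    with wave.open(wav_file, "rb") as wav:
        if wav.getframerate() != TARGET_RATE:
            problems.append(
                f"sample rate is {wav.getframerate()} Hz, expected {TARGET_RATE}"
            )
        if wav.getnchannels() != TARGET_CHANNELS:
            problems.append(
                f"{wav.getnchannels()} channels, expected {TARGET_CHANNELS}"
            )
        if wav.getsampwidth() != TARGET_WIDTH:
            problems.append(
                f"{wav.getsampwidth() * 8}-bit samples, expected {TARGET_WIDTH * 8}-bit"
            )
    return problems


def pad_with_silence(src, dst, seconds=1.0):
    """Copy src to dst and append `seconds` of silence at the end.

    Zero bytes are silence for signed 16-bit PCM; 8-bit WAV data is
    unsigned and would need 0x80 instead, which this sketch does not handle.
    """
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        frames = reader.readframes(reader.getnframes())
    silence_frames = int(params.framerate * seconds)
    silence = b"\x00" * (silence_frames * params.nchannels * params.sampwidth)
    with wave.open(dst, "wb") as writer:
        writer.setparams(params)
        writer.writeframes(frames + silence)


if __name__ == "__main__":
    for problem in check_format("etc/test/what_time_is_it.wav"):
        print(problem)
```

Running `check_format` on the test file before streaming it should make a sample-rate mismatch like the one described above visible up front.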

lukifer commented 4 years ago

Thanks for the silence / state machine context, that's very helpful! I'm still getting a no-op when running that command on both Mac and RPi; it now just echoes Ready.

synesthesiam commented 4 years ago

Ah, I see the problem now. Fixed in 12dea31928ac173d7953a01a9bd5abff502483c5

I'll get this fix pushed into the Docker image and Deb packages soon. Thanks!

lukifer commented 4 years ago

That fixed it, thanks so much!