Open BlackHawkCH91 opened 2 years ago
Your audio is stereo; you also need to convert it to mono.
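For anyone hitting the same problem, here is a minimal sketch of the stereo-to-mono step. It is written in Python with only the standard library for brevity, and the same arithmetic ports directly to C#. It assumes the audio has already been decoded to raw 16-bit signed little-endian PCM with interleaved left/right samples. The helper name `stereo_to_mono` is made up for this illustration and is not part of Vosk.

```python
import sys
from array import array


def stereo_to_mono(pcm: bytes) -> bytes:
    """Average interleaved 16-bit L/R samples into a single mono channel.

    Assumption: `pcm` is raw 16-bit signed little-endian PCM, laid out as
    L0 R0 L1 R1 ... (the usual layout of a stereo WAV data chunk).
    """
    samples = array("h")
    samples.frombytes(pcm)
    if sys.byteorder == "big":
        samples.byteswap()  # the input is little-endian; convert to native order
    if len(samples) % 2:
        raise ValueError("expected an even number of interleaved samples")

    # One output sample per L/R pair; the average always fits in 16 bits.
    mono = array(
        "h",
        ((samples[i] + samples[i + 1]) // 2 for i in range(0, len(samples), 2)),
    )
    if sys.byteorder == "big":
        mono.byteswap()  # convert back to little-endian for output
    return mono.tobytes()
```

Averaging the two channels is only one possible choice; keeping just the left channel would also produce mono input for the recognizer.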
Yep that was the issue, though it seems that it's quite inaccurate, now outputting: "the works should be within the rules" instead of "flexible rubber soles". Might need to experiment with the models.
Thanks for answering!
EDIT:
Yeah, the accuracy depends on the model. I changed to a different one and it was more accurate.
> Yep that was the issue, though it seems that it's quite inaccurate, now outputting: "the works should be within the rules" instead of "flexible rubber soles".
The sample rate is also wrong: the file is 22 kHz, not 16 kHz. If you fix everything, it should output "flexible rubber soles", as it does with the vosk-transcriber command-line utility, even with a small model.
Yep, it's a lot more accurate. Thanks for the help. I'm now reading the sample rate from the file and passing it to the VoskRecognizer.
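A rough sketch of that check, in Python with only the standard `wave` module, assuming the audio has first been decoded to a PCM WAV file. The function name `recognizer_sample_rate` is hypothetical. It reads the header, rejects anything that is not mono 16-bit, and returns the rate to hand to the recognizer.

```python
import wave


def recognizer_sample_rate(path):
    """Return the WAV file's sample rate in Hz, after checking its format.

    Assumption: the recognizer is fed mono 16-bit PCM, so any other
    layout is rejected here instead of producing garbage transcripts.
    """
    with wave.open(path, "rb") as wf:
        channels = wf.getnchannels()
        width = wf.getsampwidth()  # bytes per sample; 2 means 16-bit
        rate = wf.getframerate()

    if channels != 1:
        raise ValueError(f"expected mono audio, got {channels} channels")
    if width != 2:
        raise ValueError(f"expected 16-bit samples, got {8 * width}-bit")
    return rate
```

The returned value is what would go into the recognizer constructor in place of a hard-coded 16000, so a 22 kHz file gets its real rate.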
This is my first time using Vosk, so please bear with me. I'm using .NET Core 6 with C# 10, Vosk 0.3.38 and the "vosk-model-en-us-0.22-lgraph" model (renamed the folder to "model"). The model appears to be loading fine, with no errors or warnings showing up.
The attached audio file says: static flexible rubber soles static
However, Vosk always outputs:
The output is the same even when using different audio files. I have also tried using the "vosk-model-small-en-us-0.15" model, but the output was mostly the same.
Here is the code for speech recognition:
To use the audio file, change the file extension from .mp4 to .mp3.
https://user-images.githubusercontent.com/49353890/179464547-3ee3a00e-2941-4443-b2a2-14165b21354f.mp4
Vosk console/debug messages.
output.txt