Open jrichardsz opened 5 months ago
The demo Recognizer is initialized with a 16 kHz sample rate, but your file is 48 kHz. Check the line where you create the recognizer and make sure it uses the proper sample rate.
Thanks for your help.
I will review the sample rate and share the result.
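For anyone reviewing the sample rate of a received buffer: in the canonical 44-byte PCM WAV layout, the channel count sits at byte 22, the sample rate at byte 24 and the bit depth at byte 34, all little-endian. A minimal sketch that reads them with plain Node Buffer calls (readWavFormat is a hypothetical helper name, and it assumes the fmt chunk directly follows the RIFF/WAVE tag, which is not guaranteed for every file):

```javascript
// Read the format fields of a canonical 44-byte PCM WAV header.
// Assumption: the "fmt " chunk directly follows "WAVE" (canonical layout only).
function readWavFormat(buffer) {
  if (buffer.length < 44 ||
      buffer.toString('ascii', 0, 4) !== 'RIFF' ||
      buffer.toString('ascii', 8, 12) !== 'WAVE') {
    return null; // not a RIFF/WAVE buffer (for example raw PCM)
  }
  return {
    audioFormat: buffer.readUInt16LE(20), // 1 means uncompressed PCM
    channels: buffer.readUInt16LE(22),
    sampleRate: buffer.readUInt32LE(24),
    bitsPerSample: buffer.readUInt16LE(34),
  };
}

// Build a tiny header by hand to try the function out.
const header = Buffer.alloc(44);
header.write('RIFF', 0, 'ascii');
header.write('WAVE', 8, 'ascii');
header.write('fmt ', 12, 'ascii');
header.writeUInt32LE(16, 16);    // fmt chunk size
header.writeUInt16LE(1, 20);     // PCM
header.writeUInt16LE(1, 22);     // mono
header.writeUInt32LE(48000, 24); // sample rate
header.writeUInt16LE(16, 34);    // bits per sample
header.write('data', 36, 'ascii');

console.log(readWavFormat(header));
```

The sampleRate value it reports is the one that would go into the sampleRate option when the Recognizer is created.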
I changed the initial sample rate from 16000 to 48000:
var sampleRate = 48000;
var rec;
initRecognizer = () => {
  if (typeof rec !== 'undefined') return;
  vosk.setLogLevel(0);
  const model = new vosk.Model(this.configuration.vosk_model_path);
  rec = new vosk.Recognizer({model: model, sampleRate: sampleRate});
}
But now rec.acceptWaveform(message) returns false. Just to try, I went back to 16000, and at least rec.acceptWaveform(message) returns true, but with an empty result.
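A note on the return value: in the vosk Node bindings, acceptWaveform returning false is not a failure. It means no end of utterance has been detected yet, while true means a finished utterance can be read with result(). When the audio arrives as one complete buffer, the remaining text has to be fetched with finalResult(). A rough sketch of that loop, assuming a recognizer-like object is passed in (transcribe and the stand-in fakeRec are hypothetical, and the 4000-byte chunk size is an arbitrary choice):

```javascript
// Feed a PCM buffer to a recognizer in fixed-size chunks and collect the text.
// `rec` is expected to behave like a vosk Recognizer
// (acceptWaveform, result, finalResult).
function transcribe(rec, pcm, chunkSize = 4000) {
  const texts = [];
  for (let offset = 0; offset < pcm.length; offset += chunkSize) {
    const chunk = pcm.subarray(offset, offset + chunkSize);
    // true: an utterance ended, so a finished result can be read.
    // false: the recognizer just wants more audio; it is not an error.
    if (rec.acceptWaveform(chunk)) {
      texts.push(rec.result().text);
    }
  }
  // Flush whatever is still buffered once the audio is exhausted.
  texts.push(rec.finalResult().text);
  return texts.filter(Boolean).join(' ');
}

// Stand-in recognizer (hypothetical, for illustration only) so the sketch
// runs without a model. It reports an utterance end on the second chunk.
const fakeRec = {
  calls: 0,
  acceptWaveform() { this.calls += 1; return this.calls === 2; },
  result() { return { text: 'hello' }; },
  finalResult() { return { text: 'world' }; },
};

console.log(transcribe(fakeRec, Buffer.alloc(10000)));
```

With a real recognizer, the same function would be called with the rec object created in initRecognizer and the buffer received from the socket.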
Thanks for your help
Continuing with my attempts: if the wave buffer comes directly from the microphone, it works:
var mic = require("mic");
var micInstance = mic({
rate: String(SAMPLE_RATE),
channels: '1',
debug: false,
device: 'default',
});
micInputStream.on('data', async (buffer) => {
if (rec.acceptWaveform(buffer)){
But if it comes from the socket, it does not work.
Using this https://alanastorm.com/nodejs-inspecting-bytes-with-node-js-buffer-objects/ I'm trying to compare both buffers byte by byte to understand what the difference is.
I saved the comparison in this CSV: compare_bytes.csv
At first sight, the bytes from the socket have a lot of 00000
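A CSV is not strictly needed for this kind of comparison. A small sketch using only Buffer indexing (diffOffsets and zeroRatio are hypothetical helper names) that lists the first differing offsets and measures how much of a buffer consists of zero bytes:

```javascript
// Return the offsets (up to `limit`) at which two buffers differ.
function diffOffsets(a, b, limit = 10) {
  const diffs = [];
  const length = Math.max(a.length, b.length);
  for (let i = 0; i < length && diffs.length < limit; i++) {
    if (a[i] !== b[i]) {
      // A value of undefined means that buffer ended before this offset.
      diffs.push({ offset: i, a: a[i], b: b[i] });
    }
  }
  return diffs;
}

// Fraction of bytes that are zero. In 16-bit PCM a high ratio may simply
// mean silence, so treat it as a hint rather than a diagnosis.
function zeroRatio(buf) {
  let zeros = 0;
  for (const byte of buf) {
    if (byte === 0) zeros += 1;
  }
  return buf.length === 0 ? 0 : zeros / buf.length;
}

const fromMic = Buffer.from([0x10, 0x20, 0x30, 0x40]);
const fromSocket = Buffer.from([0x10, 0x00, 0x30, 0x00, 0x00]);
console.log(diffOffsets(fromMic, fromSocket));
console.log(zeroRatio(fromSocket));
```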
Note that the wav buffer from the socket can be stored as a valid wav file (PCM, 16 bits, etc.):
@SocketIoEvent(eventName = "receive-audio")
this.receiveAudio = async (message, currentSocket, globalSocket) => {
var id = uuidv4();
var wavLocation = `/tmp/${id}.wav`;
await fs.promises.writeFile(wavLocation, message);
But the buffer from the microphone cannot be saved as a wav file. Also, if I read it using wavefile, I get an error: "Error: Not a supported format."
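One likely explanation, assuming the mic stream delivers headerless 16-bit little-endian PCM: a wav file is that same PCM with a RIFF header in front, so wav readers reject the raw microphone buffer until a header is added. A minimal sketch (pcmToWav is a hypothetical helper, and the 16000 Hz mono values are assumptions; use whatever the microphone was configured with):

```javascript
// Prepend a canonical 44-byte WAV header to headerless PCM samples.
function pcmToWav(pcm, sampleRate, channels = 1, bitsPerSample = 16) {
  const blockAlign = channels * bitsPerSample / 8; // bytes per sample frame
  const header = Buffer.alloc(44);
  header.write('RIFF', 0, 'ascii');
  header.writeUInt32LE(36 + pcm.length, 4);          // file size minus 8 bytes
  header.write('WAVE', 8, 'ascii');
  header.write('fmt ', 12, 'ascii');
  header.writeUInt32LE(16, 16);                      // size of the fmt chunk
  header.writeUInt16LE(1, 20);                       // 1 = uncompressed PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(sampleRate * blockAlign, 28); // byte rate
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write('data', 36, 'ascii');
  header.writeUInt32LE(pcm.length, 40);              // size of the PCM body
  return Buffer.concat([header, pcm]);
}

// 3200 bytes of silence stands in for a chunk captured from the microphone.
const wav = pcmToWav(Buffer.alloc(3200), 16000);
console.log(wav.length, wav.toString('ascii', 0, 4));
```

The returned buffer could then be written with fs.promises.writeFile in the same way as the socket message.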
What part of wav is vosk (nodejs) expecting?
Thanks
What part of wav is vosk (nodejs) expecting?
Only body
According to the wave format, the data is in bytes 38 to 45.
I tried that, but rec.acceptWaveform(data) returns false:
var data = message.slice(38,45);
rec.acceptWaveform(data)
Could you point me to some reading material to understand how to extract the data from a wav file?
Thanks
You should keep the message as is; your slice doesn't make sense. The header is only the first 44 bytes of the first message, and you can even keep it.
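For reference, the body can also be located without hard-coded offsets. A WAV file is a RIFF container: a 12-byte preamble followed by chunks, each with a 4-byte id and a 4-byte little-endian size. The 44-byte figure holds only when fmt is the sole chunk before data. A sketch that walks the chunks (extractPcm is a hypothetical helper, not part of vosk):

```javascript
// Walk the RIFF chunks of a WAV buffer and return the body of the "data" chunk.
// Returns null when the buffer is not RIFF/WAVE or has no data chunk.
function extractPcm(buffer) {
  if (buffer.length < 12 ||
      buffer.toString('ascii', 0, 4) !== 'RIFF' ||
      buffer.toString('ascii', 8, 12) !== 'WAVE') {
    return null;
  }
  let offset = 12; // first chunk starts right after "RIFF" + size + "WAVE"
  while (offset + 8 <= buffer.length) {
    const id = buffer.toString('ascii', offset, offset + 4);
    const size = buffer.readUInt32LE(offset + 4);
    const start = offset + 8;
    if (id === 'data') {
      // Clamp in case the declared size exceeds what was actually received.
      return buffer.subarray(start, Math.min(start + size, buffer.length));
    }
    offset = start + size + (size % 2); // chunks are padded to an even length
  }
  return null;
}

// Canonical 44-byte header followed by four bytes of sample data.
const sample = Buffer.alloc(48);
sample.write('RIFF', 0, 'ascii');
sample.writeUInt32LE(40, 4);
sample.write('WAVE', 8, 'ascii');
sample.write('fmt ', 12, 'ascii');
sample.writeUInt32LE(16, 16);
sample.write('data', 36, 'ascii');
sample.writeUInt32LE(4, 40);
sample.set([1, 2, 3, 4], 44);

console.log(extractPcm(sample));
```

Since the header is short and can be kept, stripping it this way is optional; the sketch mainly shows where the body really starts.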
Ok. I will keep the full wav file.
Comparing the wav from the socket (does not work) with the one from the microphone (works in the sample), I found this:
@SocketIoEvent(eventName = "send-audio")
this.sendAudio = async (message, currentSocket, globalSocket) => {
console.log("0 > 4 : "+message.slice(0,4).toString())
console.log("8 > 12 : "+message.slice(8,12).toString())
console.log("12 > 14 : "+message.slice(12,14).toString())
console.log("36 > 40 : "+message.slice(36,40).toString())
console.log("45 > end:", message.slice(45))
Output:
The output indicates that the buffer received as a socket event is a valid wav file.
micInputStream.on('data', async (data) => {
if (rec.acceptWaveform(data)){
console.log("0 > 4 : "+data.slice(0,4).toString())
console.log("8 > 12 : "+data.slice(8,12).toString())
console.log("12 > 14 : "+data.slice(12,14).toString())
console.log("36 > 40 : "+data.slice(36,40).toString())
console.log("45 > end:", data.slice(45))
console.log("data:"+JSON.stringify(rec.result()));
Output
The buffer received from the microphone (https://www.npmjs.com/package/mic) is not a valid wav file, but it works with vosk.
I don't know if it helps, but the object returned by the microphone (vosk sample) and the one received from the socket are both Uint8Array.
Please dump the data you receive from both the microphone and the socket to a file and share it here.
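One simple way to produce such dumps, sketched with Node's fs module only (dumpChunk and the file names in the comments are hypothetical): append every chunk exactly as received, so the files contain the same byte stream the recognizer sees.

```javascript
const fs = require('fs');

// Append a received chunk, unmodified, to a dump file.
// Synchronous on purpose: it keeps the chunks in arrival order.
function dumpChunk(filePath, chunk) {
  fs.appendFileSync(filePath, chunk);
}

// Possible use inside the existing handlers (paths are placeholders):
//   micInputStream.on('data', (data) => dumpChunk('/tmp/mic-dump.raw', data));
//   this.receiveAudio = async (message) => dumpChunk('/tmp/socket-dump.wav', message);
```

Deleting the dump files before each run avoids mixing data from different sessions, since the function only appends.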
I will dump the data. In the meantime, I prepared a reproducible sample:
https://github.com/jrichardsz/nodejs-wav-vosk-transcription
As a summary:
Thank you very much for your kind help
I tried the same with another library and it works.
Expected Behavior
Receive the wav file stream and get the transcription
Current Behavior
rec.result() is returning an empty text
{"text":""}
Steps to Reproduce
<Buffer@0x6260d30 52 49 46 46 24 da 0e 00 57 41 5 ...
and pass it to the vosk recognition instance
Context (Environment)
Additional information
I uploaded the recorded wav file recorded here: https://filebin.net/e0dt62s5u0rg3ltj
Wav file details are
with browser
With ffmpeg
With https://www.npmjs.com/package/wavefile