alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Apache License 2.0

node.js Recognizer with SpeakerModel no X vector #510

Closed damirn closed 3 years ago

damirn commented 3 years ago

It seems that a Recognizer created with a SpeakerModel no longer produces an X-vector:

const fs = require('fs');
const wav = require('wav');
const { Model, Recognizer, SpeakerModel } = require('vosk');
const { Readable } = require('stream');

const model = new Model('model');
const spkModel = new SpeakerModel('model-spk');
const wfStream = fs.createReadStream('recording.wav', { highWaterMark: 4096 });
const wfReader = new wav.Reader();
const wfReadable = new Readable().wrap(wfReader);

// Start recognition once the WAV header has been parsed.
wfReader.on('format', async ({ audioFormat, sampleRate, channels }) => {
  // audioFormat 1 is uncompressed PCM; only mono input is accepted.
  if (audioFormat !== 1 || channels !== 1) {
    console.error('Audio file must be WAV format mono PCM.');
    process.exit(1);
  }
  const rec = new Recognizer({ model, speakerModel: spkModel, sampleRate });
  for await (const data of wfReadable) {
    const endOfSpeech = await rec.acceptWaveform(data);
    if (endOfSpeech) {
      console.log(await rec.finalResult());
    } else {
      console.log(await rec.partialResult());
    }
  }
  // Flush whatever is left after the last chunk.
  console.log(await rec.finalResult());
  rec.free();
});
wfStream.pipe(wfReader);

This returns the following as the final result:

{
  result: [ { conf: 0.403675, end: 0.66, start: 0.33, word: 'lol' } ],
  text: 'lol'
}

Am I missing something here?

sadrasabouri commented 3 years ago

Hi @damirn. I tested your code after this fix by @nshmyrev and it worked well for me. I wonder if the problem is the duration of your input file, i.e. it is not long enough. Check the following example:

I think this is the case. Try a longer file and let me know the result.
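
For anyone hitting the same symptom, here is a minimal sketch (not taken from the vosk sources) of two checks that can help confirm this diagnosis: how many seconds of PCM audio the file actually holds, and whether a result carries a speaker vector at all. The helper names `pcmDurationSeconds` and `hasSpeakerVector` are hypothetical, and the sketch assumes the X-vector is exposed as an array under an `spk` key of the result object, as in the Python speaker example.

```javascript
// Duration in seconds of raw PCM audio, given the size of the data chunk
// in bytes and the format fields read from the WAV header.
function pcmDurationSeconds(dataBytes, { sampleRate, channels, bitDepth }) {
  const bytesPerSecond = sampleRate * channels * (bitDepth / 8);
  return dataBytes / bytesPerSecond;
}

// True when a result object carries a non-empty speaker vector.
// Assumption: the vector is an array stored under the `spk` key.
function hasSpeakerVector(result) {
  return Boolean(result) && Array.isArray(result.spk) && result.spk.length > 0;
}

// The final result quoted in the report: recognized text, but no `spk` field.
const reported = {
  result: [{ conf: 0.403675, end: 0.66, start: 0.33, word: 'lol' }],
  text: 'lol',
};

console.log(hasSpeakerVector(reported)); // false
// 32000 bytes of 16 kHz, mono, 16-bit PCM is one second of audio.
console.log(pcmDurationSeconds(32000, { sampleRate: 16000, channels: 1, bitDepth: 16 })); // 1
```

Logging the duration next to `hasSpeakerVector(...)` for each final result makes it easy to see whether a missing vector coincides with a very short recording. No minimum duration is assumed here, since the thread does not state one.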

sadrasabouri commented 3 years ago

@nshmyrev Do you mind if I add some examples (like the Python ones) to the nodejs section in a PR? My plan is to have the same examples on each platform.

nshmyrev commented 3 years ago

My plan is to have the same examples on each platform.

Thank you, it would be great!

damirn commented 3 years ago

@sadrasabouri You're right, my wav file was truncated; once I replaced it with a longer one, it worked for me too. I don't mind if you add this as an example; I was thinking of doing it myself.

sadrasabouri commented 3 years ago

Thank you very much. Sure, I'll work on this PR in my fork's nodejs_example branch.

Feel free to send me a pull request with this example there (I think test_speaker.js would be a suitable name) so we can work on it together. You can also compare it with the Python example for additional features.
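
One of the additional features in the Python speaker example is a cosine distance between two X-vectors. A rough JavaScript equivalent might look like the sketch below. The function name `cosineDistance` is hypothetical, and the vectors in the usage lines are made-up toy values rather than real X-vectors.

```javascript
// Cosine distance between two equal-length numeric vectors:
// 0 means same direction, 1 means orthogonal, 2 means opposite.
function cosineDistance(x, y) {
  if (x.length !== y.length || x.length === 0) {
    throw new Error('Vectors must be non-empty and of equal length.');
  }
  let dot = 0;
  let normX = 0;
  let normY = 0;
  for (let i = 0; i < x.length; i++) {
    dot += x[i] * y[i];
    normX += x[i] * x[i];
    normY += y[i] * y[i];
  }
  return 1 - dot / (Math.sqrt(normX) * Math.sqrt(normY));
}

// Toy vectors for illustration only.
console.log(cosineDistance([1, 0], [1, 0])); // 0
console.log(cosineDistance([1, 0], [0, 1])); // 1
```

In a test_speaker.js example, one argument would be a stored reference vector for a known speaker and the other the vector taken from the recognizer's result; a smaller distance suggests the same speaker.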