The model isn't loading

ccoreilly / vosk-browser

A speech recognition library running in the browser thanks to a WebAssembly build of Vosk

Apache License 2.0


The model isn't loading #1

Closed: RezowanTalukder closed this issue 3 years ago

RezowanTalukder commented 3 years ago

I see there is a model.tar.gz in the public folder of the React example. I want to use it for testing purposes, but the model isn't loading.

ccoreilly commented 3 years ago

Hi!

I guess you cloned the repository and have started the react example with npm start? The react example expects the models to be in public/models but they have not been pushed to the repository in the master branch. The model.tar.gz is not used by the example (I should remove it). You can find all the models in the gh-pages branch, just place them in public/models and it should work.

Here are the models: https://github.com/ccoreilly/vosk-browser/tree/gh-pages/models

RezowanTalukder commented 3 years ago

Thank you so much, it's working.

But when I upload a file, it throws this error:

Unhandled Rejection (InvalidStateError): Failed to execute 'createMediaElementSource' on 'AudioContext': HTMLMediaElement already connected previously to a different MediaElementSourceNode.

Can you please mention which type of audio files it expects?

ccoreilly commented 3 years ago

Yes, this is an issue that only happens in Chrome with the demo (it has nothing to do with the vosk-browser library); Firefox does not complain. I started working on a fix but haven't fully tested it. I'll try to push it later today.
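
For readers hitting the same InvalidStateError: the message says that createMediaElementSource was called a second time on an audio element that already has a source node. One common workaround, which is not necessarily the fix the maintainer had in mind, is to create the node once per element and reuse it. The sketch below assumes a single AudioContext for the whole page; getOrCreateSource and SourceFactory are hypothetical names, and the small interface only stands in for the real AudioContext so the snippet does not depend on DOM typings.

```typescript
// Structural stand-in for AudioContext: anything that can create a source
// node from a media element. In a browser, an AudioContext fits this shape.
interface SourceFactory<E extends object, N> {
  createMediaElementSource(element: E): N;
}

// Cache of already-created source nodes, keyed by media element.
// Assumption: only one AudioContext is used, so the element alone is a
// sufficient key. A WeakMap lets entries be collected with their element.
const sourceCache = new WeakMap<object, unknown>();

// Hypothetical helper: returns the existing source node for `element`,
// or creates (and remembers) one on the first call.
function getOrCreateSource<E extends object, N>(
  ctx: SourceFactory<E, N>,
  element: E
): N {
  const cached = sourceCache.get(element);
  if (cached !== undefined) {
    return cached as N;
  }
  const node = ctx.createMediaElementSource(element);
  sourceCache.set(element, node);
  return node;
}
```

In a React component, this could be called with the page's AudioContext and the hidden audio element each time a new file is selected, instead of calling createMediaElementSource directly.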

RezowanTalukder commented 3 years ago

Yes, it's working fine in Firefox. The problem is that my model has pretty good accuracy for audio coming directly from the mic, but when I upload an audio file (.mp3 or .flac) of the same transcript, it can't recognize the words properly. Can you guess where the issue might be?

ccoreilly commented 3 years ago

I would assume loss of quality during conversion? The library does no conversion; the trick is that the demo reads the audio stream being played by the browser in a hidden audio tag.

Have you tried with a 16kHz wav (which is what most models expect)?
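
To rule out a sample-rate mismatch before uploading, it can help to read the rate straight from the WAV header. The following is a minimal sketch, not part of vosk-browser: readWavFormat and WavFormat are hypothetical names, and the parser assumes a standard RIFF/WAVE file with a "fmt " chunk (it does not handle .flac or .mp3).

```typescript
// Hypothetical result type for the fields read from the "fmt " chunk.
interface WavFormat {
  sampleRate: number;
  channels: number;
  bitsPerSample: number;
}

// Hypothetical helper: walks the RIFF chunks of a WAV file and returns the
// format declared in its "fmt " chunk. Throws if the file is not RIFF/WAVE.
function readWavFormat(buffer: ArrayBuffer): WavFormat {
  const view = new DataView(buffer);
  // Reads a four-character chunk tag such as "RIFF" or "fmt ".
  const tag = (offset: number): string =>
    String.fromCharCode(
      view.getUint8(offset),
      view.getUint8(offset + 1),
      view.getUint8(offset + 2),
      view.getUint8(offset + 3)
    );

  if (buffer.byteLength < 12 || tag(0) !== "RIFF" || tag(8) !== "WAVE") {
    throw new Error("Not a RIFF/WAVE file");
  }

  let offset = 12; // first chunk starts right after the 12-byte RIFF header
  while (offset + 8 <= buffer.byteLength) {
    const id = tag(offset);
    const size = view.getUint32(offset + 4, true); // little-endian
    if (id === "fmt ") {
      if (offset + 24 > buffer.byteLength) {
        throw new Error("Truncated fmt chunk");
      }
      return {
        channels: view.getUint16(offset + 10, true),
        sampleRate: view.getUint32(offset + 12, true),
        bitsPerSample: view.getUint16(offset + 22, true),
      };
    }
    // Chunks are padded to an even number of bytes.
    offset += 8 + size + (size % 2);
  }
  throw new Error("No fmt chunk found");
}
```

In the browser, the buffer can be obtained from an uploaded File with await file.arrayBuffer(); a wideband model would then be expected to see sampleRate equal to 16000.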

RezowanTalukder commented 3 years ago

I just checked with 16 kHz .flac and .wav audio, and it still can't recognise the words properly.

ccoreilly commented 3 years ago

Hi @RezowanTalukder !

Just to understand it better: you have issues with file transcription when using an English model other than the one used in the demo, am I right? Could you maybe share the model? Is it also a wideband (16kHz) model?

RezowanTalukder commented 3 years ago

...