alphacep / vosk-server

WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
Apache License 2.0

Server does not work with Portuguese language #72

Closed by ibombonato 4 years ago

ibombonato commented 4 years ago

I noticed that the Docker builds and the Docker Hub images do not include the Portuguese language, so I tried to build and run the server with the Portuguese model, but the server always returns empty transcriptions.

Here is what I did:

I created a Dockerfile for Portuguese (PT):

filename: Dockerfile.kaldi-pt

FROM alphacep/kaldi-vosk-server:latest

ENV MODEL_VERSION 0.3
RUN mkdir /opt/vosk-model-pt \
   && cd /opt/vosk-model-pt \
   && wget -q https://alphacephei.com/vosk/models/vosk-model-small-pt-${MODEL_VERSION}.zip \
   && unzip vosk-model-small-pt-${MODEL_VERSION}.zip \
   && mv vosk-model-small-pt-${MODEL_VERSION} model \
   && rm -rf vosk-model-small-pt-${MODEL_VERSION}.zip

EXPOSE 2700
WORKDIR /opt/vosk-server/websocket
CMD [ "python3", "./asr_server.py", "/opt/vosk-model-pt/model" ]

Then I built and ran it.

But when I call

asr-test.py "my_portuguese_file.wav"

it returns an empty string for the whole process.

Like:

{
  "partial" : ""
}
{
  "text" : ""
}

Is there any known bug with the Portuguese language?

FWIW, I followed the same process with the Spanish model and everything worked fine.

Tks!

nshmyrev commented 4 years ago

The small Portuguese model is 16 kHz, so you need to set VOSK_SAMPLE_RATE to 16000 in the Dockerfile and send 16 kHz data to the server:

https://github.com/alphacep/vosk-server/blob/fb9b937da41a4625fdba98d1621ed29ea72b2b79/docker/Dockerfile.kaldi-fr#L25

The telecom model for Portuguese is not public yet.
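Since the advice above depends on the audio really being 16 kHz, a quick check of the WAV header before sending a file can save a round trip. The following is only a minimal sketch using Python's standard `wave` module; the function name `check_wav` and the mono, 16-bit expectations are assumptions for illustration, not part of vosk-server.

```python
import wave


def check_wav(path, expected_rate=16000):
    """Return a list of mismatches between the WAV header and the expected format.

    Hypothetical helper: assumes the model wants mono, 16-bit PCM at expected_rate.
    """
    problems = []
    with wave.open(path, "rb") as wf:
        if wf.getframerate() != expected_rate:
            problems.append(
                f"sample rate is {wf.getframerate()} Hz, expected {expected_rate} Hz"
            )
        if wf.getnchannels() != 1:
            problems.append(f"{wf.getnchannels()} channels, expected mono")
        if wf.getsampwidth() != 2:
            problems.append(f"{8 * wf.getsampwidth()}-bit samples, expected 16-bit")
    return problems
```

An empty list means the header matches. Note that `wave` only reads files with a WAV header, so this check applies to the original recording and not to headerless raw PCM.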

ibombonato commented 4 years ago

It worked after I changed the Dockerfile and converted the audio with ffmpeg:

ffmpeg -i test_file.wav -loglevel quiet -ar 16000 -ac 1 -f s16le test_file_16kz.wav
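For converting several files, the same command can be wrapped in a small Python helper. This is just a sketch that mirrors the flags above; the names `ffmpeg_args` and `convert` are made up for illustration, and it assumes `ffmpeg` is on the PATH.

```python
import subprocess


def ffmpeg_args(src, dst, rate=16000):
    """Build the ffmpeg argument list used above: resample to `rate`, mono, s16le."""
    return [
        "ffmpeg", "-i", src,
        "-loglevel", "quiet",
        "-ar", str(rate),   # target sample rate in Hz
        "-ac", "1",         # downmix to a single channel
        "-f", "s16le",      # raw signed 16-bit little-endian PCM (no WAV header)
        dst,
    ]


def convert(src, dst, rate=16000):
    """Run ffmpeg and raise CalledProcessError if it exits with a non-zero status."""
    subprocess.run(ffmpeg_args(src, dst, rate), check=True)
```

Calling `convert("test_file.wav", "test_file_16kz.wav")` would run the command shown above. Because of `-f s16le` the output is headerless raw PCM despite the `.wav` extension, and ffmpeg asks for confirmation if the output file already exists.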

Tks