dialogflow / asr-server

FastCGI support for Kaldi ASR
Apache License 2.0

Simple WAV decoding doesn't work #30

Open by-tech opened 6 years ago

by-tech commented 6 years ago

En_1272-128104-0000.zip

Installation and compilation succeeded, but when I try a simple English WAV file, nothing is recognized.

To make sure the file is in the correct format, I converted it with "ffmpeg -i En_1272-128104-0000.wav -f s16le -ar 16000 -ac 1 En_1272-128104-0000.raw"
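
A quick way to sanity-check the converted file, as a rough sketch: headerless s16le mono audio is 2 bytes per sample, so at 16000 Hz the duration is the byte count divided by 32000. The helper name `raw_pcm_stats` is hypothetical, and the sample rate and sample format are assumed to match the ffmpeg flags above.

```python
import struct
from pathlib import Path


def raw_pcm_stats(path, sample_rate=16000):
    """Return (duration_seconds, peak_amplitude) for a headerless s16le mono file.

    Assumes 16-bit little-endian samples and a single channel, as produced by
    `-f s16le -ac 1`; the sample rate defaults to the `-ar 16000` used above.
    """
    data = Path(path).read_bytes()
    if len(data) % 2:
        raise ValueError("odd byte count: file is not whole 16-bit samples")
    count = len(data) // 2
    # "<h" = little-endian signed 16-bit, matching s16le.
    samples = struct.unpack("<%dh" % count, data)
    peak = max((abs(s) for s in samples), default=0)
    return count / sample_rate, peak
```

Calling `raw_pcm_stats("En_1272-128104-0000.raw")` should give a duration close to that of the original WAV and a non-zero peak; a zero peak or an implausible duration would point at the conversion rather than the decoder.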

What am I missing? Thanks for any help.

RESULT

curl -H "Content-Type: application/octet-stream" --data-binary En_1272-128104-0000.raw http://localhost/asr

{"status":"ok","data":[{"confidence":0.932974,"text":""}]}
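
One thing worth checking in the curl command above: `--data-binary` only reads a file when its argument starts with `@` (for example `--data-binary @En_1272-128104-0000.raw`); without the `@`, curl posts the file name itself as the request body. As a cross-check that does not depend on curl, here is a minimal Python sketch that posts the file's bytes to the same endpoint and pulls out the transcript. The URL and Content-Type come from the command above; the function names are made up for illustration, and the response shape is assumed to match the JSON shown in this thread.

```python
import json
import urllib.request
from pathlib import Path

ASR_URL = "http://localhost/asr"  # endpoint taken from the curl command in this thread


def build_asr_request(raw_path, url=ASR_URL):
    """Build a POST request whose body is the raw audio bytes, not the file name."""
    body = Path(raw_path).read_bytes()
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )


def first_transcript(response_body):
    """Extract the first hypothesis text from a reply shaped like the one above."""
    reply = json.loads(response_body)
    if reply.get("status") != "ok":
        raise RuntimeError("server reported status %r" % reply.get("status"))
    hypotheses = reply.get("data") or []
    return hypotheses[0].get("text", "") if hypotheses else ""


def recognize(raw_path, url=ASR_URL):
    """Send the file to the server and return the recognized text (may be empty)."""
    with urllib.request.urlopen(build_asr_request(raw_path, url)) as resp:
        return first_transcript(resp.read())
```

If `recognize("En_1272-128104-0000.raw")` returns text while the curl command returns an empty string, the request body is the likely culprit; if both come back empty, the model or Kaldi version is the more likely cause.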

VLOG[4] (fcgi-nnet3-decoder[5.2.215~1-5e7d]:FinalizeDecoding():lattice-faster-online-decoder.cc:788) pruned tokens from 4053 to 74
VLOG[4] (fcgi-nnet3-decoder[5.2.215~1-5e7d]:GetRawLattice():lattice-faster-online-decoder.cc:191) init:40 buckets:83 load:0.891566 max:1
VLOG[1] (fcgi-nnet3-decoder[5.2.215~1-5e7d]:DeterminizeLatticePhonePruned():determinize-lattice-pruned.cc:1440) Doing first pass of determinization on phone + word lattices.
VLOG[1] (fcgi-nnet3-decoder[5.2.215~1-5e7d]:DeterminizeLatticePhonePruned():determinize-lattice-pruned.cc:1455) Doing second pass of determinization on word lattices.
VLOG[1] (fcgi-nnet3-decoder[5.2.215~1-5e7d]:Decode():OnlineDecoder.cc:250) Recognized @ 71 ms
VLOG[1] (fcgi-nnet3-decoder[5.2.215~1-5e7d]:Decode():OnlineDecoder.cc:255) Decode subroutine done

viju2008 commented 6 years ago

I am also facing the same issue.

mikenewman1 commented 6 years ago

Me too (see #31). What version of Kaldi are you using? Something recent? And what version did you use to build your models? It works for me if I use a model from about a year ago.

bjascob commented 6 years ago

Back in #11 I posted a tar of some test code (see https://github.com/api-ai/asr-server/files/828349/APIAI_Server.tar.gz) that lets you feed in xx.raw files directly. I tried that code again today against Kaldi freshly pulled from GitHub and it worked for me (note that to compile it you need to remove kaldi-thread.a from src/Makefile). The output for En_1272-128104-0000.raw was:

{"status":"ok","data":[{"confidence":0.897815,"text":"MR QUARTERS THE APOSTLE OF MIDDLE CLASS AND WE'RE GLAD WELCOME HIS GOSPEL"}]}

You might try that code. It's a bit hackish, but if you compile it and execute run.sh, it will decode the included test.raw. Be sure to change the location of the model, which is defined at the top of run.sh.