simple wav decoding doesn't work #30

by-tech commented 6 years ago

En_1272-128104-0000.zip

Installation and compilation succeeded, but when I try a simple English wav file, nothing is recognized.

To be sure that the file is in the correct format, I converted it with "ffmpeg -i En_1272-128104-0000.wav -f s16le -ar 16000 -ac 1 En_1272-128104-0000.raw"

What am I missing? Thanks for your help.
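A quick sanity check on the converted file is to estimate its duration from its size. The sketch below assumes the format requested from ffmpeg above (headerless 16-bit samples, 16000 Hz, mono); `raw_duration_ms` is a hypothetical helper name, not part of asr-server or Kaldi.

```shell
#!/bin/sh
# Estimate the duration (in milliseconds) of a headerless PCM file from its size.
# Assumption: s16le (2 bytes per sample), mono; the sample rate defaults to 16000 Hz.
raw_duration_ms() {
    file=$1
    rate=${2:-16000}       # samples per second
    bytes_per_sample=2     # s16le means 16-bit, i.e. 2 bytes per sample
    [ -r "$file" ] || { echo "cannot read: $file" >&2; return 1; }
    # tr strips the padding some wc implementations print before the count.
    size=$(wc -c < "$file" | tr -d ' ')
    echo $(( size * 1000 / (rate * bytes_per_sample) ))
}
```

For example, `raw_duration_ms En_1272-128104-0000.raw` should print roughly the length of the utterance in milliseconds. A value near zero, or one far from the length of the original wav, would suggest the conversion did not produce what the server expects.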

RESULT

curl -H "Content-Type: application/octet-stream" --data-binary En_1272-128104-0000.raw http://localhost/asr

{"status":"ok","data":[{"confidence":0.932974,"text":""}]}

VLOG[4] (fcgi-nnet3-decoder[5.2.215~1-5e7d]:FinalizeDecoding():lattice-faster-online-decoder.cc:788) pruned tokens from 4053 to 74
VLOG[4] (fcgi-nnet3-decoder[5.2.215~1-5e7d]:GetRawLattice():lattice-faster-online-decoder.cc:191) init:40 buckets:83 load:0.891566 max:1
VLOG[1] (fcgi-nnet3-decoder[5.2.215~1-5e7d]:DeterminizeLatticePhonePruned():determinize-lattice-pruned.cc:1440) Doing first pass of determinization on phone + word lattices.
VLOG[1] (fcgi-nnet3-decoder[5.2.215~1-5e7d]:DeterminizeLatticePhonePruned():determinize-lattice-pruned.cc:1455) Doing second pass of determinization on word lattices.
VLOG[1] (fcgi-nnet3-decoder[5.2.215~1-5e7d]:Decode():OnlineDecoder.cc:250) Recognized @ 71 ms
VLOG[1] (fcgi-nnet3-decoder[5.2.215~1-5e7d]:Decode():OnlineDecoder.cc:255) Decode subroutine done
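
One thing that may be worth ruling out in the curl command above: curl's `--data-binary` reads the request body from a file only when the argument starts with `@`. Without the `@`, the file name itself is sent as the body. Whether this explains the empty result here is not confirmed in this thread. A minimal sketch, where `post_raw` is a hypothetical helper and the URL is the one from the command above:

```shell
#!/bin/sh
# post_raw is a hypothetical helper, not part of asr-server.
# It posts the contents of a raw audio file to the ASR endpoint.
post_raw() {
    file=$1
    url=${2:-http://localhost/asr}
    # Refuse to send anything if the file is missing or empty.
    [ -s "$file" ] || { echo "missing or empty file: $file" >&2; return 1; }
    # The leading @ makes curl read the body from the file;
    # without it, the literal file name is sent as the request body.
    curl -sS -H "Content-Type: application/octet-stream" \
         --data-binary "@$file" "$url"
}
```

Usage would be `post_raw En_1272-128104-0000.raw http://localhost/asr`.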

viju2008 commented 6 years ago

I am also facing the same issue.

mikenewman1 commented 6 years ago

Me too, see #31. What version of Kaldi are you using? Something recent? And which version did you use to build your models? It works for me if I use a model from about a year ago.

bjascob commented 6 years ago

Back in #11 I posted a tar of some test code (see https://github.com/api-ai/asr-server/files/828349/APIAI_Server.tar.gz) that allows you to input xx.raw files directly. I just tried that code again against Kaldi pulled from GitHub today and it worked for me (note that to compile you need to remove kaldi-thread.a from src/Makefile). The output for En_1272-128104-0000.raw was:

{"status":"ok","data":[{"confidence":0.897815,"text":"MR QUARTERS THE APOSTLE OF MIDDLE CLASS AND WE'RE GLAD WELCOME HIS GOSPEL"}]}

You might try that code. It's a bit hacky, but if you compile it and execute run.sh, it will decode the included test.raw. Be sure to change the location of the model, which is defined at the top of run.sh.
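
When comparing runs, it can help to pull just the recognized text out of the JSON response, so an empty result is obvious. The sketch below is a plain sed match rather than a real JSON parser. It assumes a single-line response with one result, shaped like the responses shown in this thread, and text without escaped quotes. `asr_text` is a hypothetical helper name.

```shell
#!/bin/sh
# asr_text is a hypothetical helper: reads a response on stdin and prints
# the value of its "text" field (an empty line if nothing was recognized).
# Assumptions: one result per response, no escaped quotes inside the text.
asr_text() {
    sed -n 's/.*"text":"\([^"]*\)".*/\1/p'
}
```

Usage would be to pipe the server response through it, for example `curl ... | asr_text`, and treat an empty line as "nothing recognized".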
