dialogflow / asr-server

FastCGI support for Kaldi ASR
Apache License 2.0

Result always comes back as "YES" #9

Closed matthewmgamble closed 7 years ago

matthewmgamble commented 7 years ago

I followed the directions, and it looks like Kaldi and the asr-server are installed correctly. However, whenever I test the API, using either the web interface or uploading a raw file, the result is always:

{"status":"ok","data":[{"confidence":0.916982,"text":"YES"}],"interrupted":"endofspeech","time":900}

The actual text is never transcribed. Where would I start debugging this type of issue? I tried increasing the verbosity of the asr-server, but the extra output didn't point to what the issue could be.

qharlie commented 7 years ago

I have the same problem. Maybe 2 out of 20 times it came back with something besides YES or NO.

The recognize_wav script in the Kaldi distribution under the api.ai egs recipe works great, but it takes 6 seconds to transcribe a 3 second clip. I've been trying to speed that up by compiling with CUDA and MKL, but nothing I have tried has worked.

realill commented 7 years ago

Always getting YES indicates that something is wrong. Most likely you are not using the 16 kHz, 16-bit little-endian format for your WAV files.
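
For anyone who wants to rule the format out quickly, here is a minimal sketch that reads a WAV header with Python's standard `wave` module and reports anything that differs from 16 kHz, 16-bit audio. The function name `check_wav_format` is made up for illustration, and the mono check is an assumption on my part (the comment above only mentions sample rate and bit depth). RIFF WAV files store PCM samples little-endian, so byte order is not checked separately.

```python
import wave


def check_wav_format(path, expected_rate=16000, expected_sample_width=2):
    """Return a list of format problems; an empty list means the header matches.

    expected_sample_width is in bytes, so 2 means 16-bit samples.
    """
    problems = []
    # wave.open raises wave.Error if the file has no valid RIFF/WAVE header,
    # for example when it is headerless raw PCM.
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        channels = wav.getnchannels()

    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    if width != expected_sample_width:
        problems.append(
            f"sample width is {width * 8} bits, expected {expected_sample_width * 8} bits"
        )
    # Assumption: single-channel input; not stated in the thread.
    if channels != 1:
        problems.append(f"file has {channels} channels, expected 1 (mono)")
    return problems
```

Calling `check_wav_format("test.wav")` (with your own file in place of the placeholder name) returns an empty list when the header matches. A headerless raw file cannot be checked this way, because the sample rate and bit depth are not stored in it.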

prokopevaleksey commented 7 years ago

Hi, thank you for the awesome project!

I am currently facing the same problem, but in my case it seems to appear randomly for different files with the same format. Perhaps the length of the file is what causes it.

Any progress in debugging this issue?

realill commented 7 years ago

Unfortunately, I cannot help with debugging this. You can try vanilla Kaldi first, using this recipe: https://github.com/kaldi-asr/kaldi/tree/master/egs/apiai_decode/s5 . If that does not work either, ask the more active Kaldi community for help.