Open viju2008 opened 6 years ago
Sometimes I get only text as NO
I think you might be seeing the same problem that I posted about in #31
If I switch in a model that I built in January, the recognition is great. With the latest Kaldi I get nothing but [NOISE] tokens.
I posted a question to Kaldi-help
https://groups.google.com/forum/#!topic/kaldi-help/1N4aVb75IdU
but DP did not have any ideas
I found the problem. In order to run with the latest (batchnorm) models, you need to add a line after loading the model:
{
bool binary;
kaldi::Input ki(nnet3_rxfilename_, &binary);
trans_model_->Read(ki.Stream(), binary);
nnet_->Read(ki.Stream(), binary);
// This is the crucial line
SetBatchnormTestMode(true, &(nnet_->GetNnet()));
}
Note that this only affects newer models (built using Kaldi source from after about March 2017). For full compatibility with the latest Kaldi, these two are probably a good idea as well:
SetDropoutTestMode(true, &(nnet_->GetNnet()));
kaldi::nnet3::CollapseModel(kaldi::nnet3::CollapseModelConfig(), &(nnet_->GetNnet()));
This is shamelessly lifted from (e.g.) kaldi/src/online2bin/online2-wav-nnet3-latgen-faster.cc
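To illustrate why the test-mode switch can matter, here is a toy sketch of a one-dimensional batchnorm layer. It is not Kaldi's BatchNormComponent; the struct ToyBatchNorm and all of its members are hypothetical names made up for this example. It only shows the general idea: in training mode the layer normalizes with statistics of the current input, while in test mode it uses the statistics stored with the model, which is presumably what a decoder wants.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Toy one-dimensional batchnorm (hypothetical; NOT Kaldi's implementation).
struct ToyBatchNorm {
  double stored_mean = 0.0;  // statistics saved with the trained model
  double stored_var = 1.0;
  double epsilon = 1e-3;     // keeps the denominator away from zero
  bool test_mode = false;    // false: training behaviour, true: decoding

  std::vector<double> Forward(const std::vector<double> &x) const {
    assert(!x.empty());
    double mean = stored_mean, var = stored_var;
    if (!test_mode) {
      // Training mode: recompute mean and variance from this input only.
      mean = 0.0;
      for (double v : x) mean += v;
      mean /= x.size();
      var = 0.0;
      for (double v : x) var += (v - mean) * (v - mean);
      var /= x.size();
    }
    std::vector<double> y;
    y.reserve(x.size());
    for (double v : x) y.push_back((v - mean) / std::sqrt(var + epsilon));
    return y;
  }
};
```

With stored_mean = 10 and stored_var = 4, the input {12, 12} is flattened to zeros in training mode, because the input's own mean is subtracted, but maps to values close to 1 in test mode. A mismatch of this kind between training-time and decode-time normalization is one plausible reason a network would emit only [NOISE].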
I put some details on this same issue on https://github.com/dialogflow/asr-server/issues/37, describing what helped me get past it.
In which file do we add this line: SetBatchnormTestMode(true, &(nnet_->GetNnet())); ?
In Nnet3LatgenFasterDecoder.cc (in the function Nnet3LatgenFasterDecoder::Initialize).
@viju2008 I am in the same situation now. Did you solve the problem?
See the posts above. The code needed updating to support batchnorm. After this fix everything worked fine. Note, however, that I haven't used this code in years, so it may be broken again.
Sorry. You could try asking in the usual Kaldi help channel.
@mikenewman1 thanks for the quick reply. I have added that line to Nnet3LatgenFasterDecoder.cc but I am getting this error
I am trying to do recognition with the system mic (using the web browser). Does it automatically convert the audio to 16000 Hz?
The JavaScript code downsamples browser input to 16000 Hz: https://github.com/dialogflow/asr-server/blob/master/asr-html/res/recorderWorker.js#L70
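For anyone feeding audio from their own client instead of the browser page, here is a minimal sketch of one simple way to bring samples down to 16000 Hz by averaging each group of input samples. It is not a port of recorderWorker.js and may differ from what that file does; the function name DownsampleByAveraging and its parameters are hypothetical. Averaging is only a crude low-pass filter, so a proper resampler would be preferable for production use.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Downsample mono samples from in_rate to out_rate (hypothetical helper).
// Each output sample is the mean of the input samples that fall in its slot.
std::vector<float> DownsampleByAveraging(const std::vector<float> &input,
                                         int in_rate, int out_rate) {
  assert(in_rate > 0 && out_rate > 0 && out_rate <= in_rate);
  const double ratio = static_cast<double>(in_rate) / out_rate;
  const std::size_t out_len = static_cast<std::size_t>(input.size() / ratio);
  std::vector<float> output;
  output.reserve(out_len);
  for (std::size_t i = 0; i < out_len; ++i) {
    // Input index range [begin, end) covered by output sample i.
    std::size_t begin = static_cast<std::size_t>(i * ratio);
    std::size_t end = static_cast<std::size_t>((i + 1) * ratio);
    if (end > input.size()) end = input.size();
    double sum = 0.0;
    for (std::size_t j = begin; j < end; ++j) sum += input[j];
    output.push_back(static_cast<float>(sum / (end - begin)));
  }
  return output;
}
```

For example, one second of audio captured at 48000 Hz (48000 samples) comes out as 16000 samples, each the average of three consecutive inputs.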
Thanks, Ilya.
This server works fine with the "curl" command, but with the system mic (recognition using the web browser) I only get this. Any suggestions?
At the end of the day, if curl works you can write your own code to emulate what it does. But without multi-part you won't be able to productionize it very well. Multi-part allows "online" decoding, where the stream is decoded as you speak. So you had better figure it out. ;)
I have followed the steps given.
However, I always get the following output from the ASR server:
{"status":"ok","data":[{"confidence":0.862751,"text":""}],"interrupted":"endofspeech","time":1080}
Please advise on how to check the ASR logs.