Open iou2much opened 4 years ago
The logit operations (addition, assignment, for loops) in beam_search_decoder and wfst_decoder are written in Python. If you want to run beam search in C++, there are two ways: 1. Create pbs that capture the network operations (encoder feature extraction, and the decoder step that takes the encoder output plus the previous states and inputs) and stitch them together with logit operations written in C++. 2. Write the logit operations with TensorFlow ops and freeze the whole graph into a single pb. I believe the second option has already been implemented in the MWER training of Speech Transformer.
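For reference, here is roughly what those per-step logit operations look like. This is only a minimal pure-Python sketch, not the actual beam_search_decoder code: `step_fn`, `sos_id`, `eos_id`, `beam_size` and `max_len` are hypothetical names, and `step_fn` stands in for one call to the exported decoder-step pb. Whichever of the two options you pick, this loop is the part that has to be ported (to C++ or to TensorFlow ops).

```python
import math
from typing import Callable, List, Sequence, Tuple

# A hypothesis is (token ids so far, cumulative log-probability).
Hypothesis = Tuple[List[int], float]


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one vector of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def beam_search(
    step_fn: Callable[[List[int]], Sequence[float]],  # hypothetical decoder step
    sos_id: int,
    eos_id: int,
    beam_size: int,
    max_len: int,
) -> Hypothesis:
    """Return the best-scoring hypothesis (no length normalization)."""
    beams: List[Hypothesis] = [([sos_id], 0.0)]
    finished: List[Hypothesis] = []
    for _ in range(max_len):
        candidates: List[Hypothesis] = []
        for tokens, score in beams:
            # "Addition": extend each beam with every token's log-probability.
            log_probs = log_softmax(step_fn(tokens))
            for tok, lp in enumerate(log_probs):
                candidates.append((tokens + [tok], score + lp))
        # Keep the top beam_size candidates; move ended ones aside.
        candidates.sort(key=lambda h: h[1], reverse=True)
        beams = []
        for hyp in candidates[:beam_size]:
            if hyp[0][-1] == eos_id:
                finished.append(hyp)
            else:
                beams.append(hyp)
        if not beams:
            break
    # Hypotheses still open at max_len compete with the finished ones.
    finished.extend(beams)
    return max(finished, key=lambda h: h[1])
```

In a frozen graph, the outer loop would typically become a `tf.while_loop` and the sort-and-slice a `tf.math.top_k` over the flattened beam-by-vocabulary scores.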
implemented in MWER training of Speech Transformer
Really? That's great. Let me check it out. Thank you
Hi, @Some-random and @hoyden. I've read the BatchBeamSearchLayer module in the mwer branch, but I still have some questions. Could you explain a bit more? BatchBeamSearchLayer has no scorers such as CTCScorer or the lm_model scorer. Do I need them in the training stage or in the decoding stage? Wouldn't they help performance?
BatchBeamSearchLayer is used in the training stage; CTCScorer and the lm_model scorer are not used there. In the decoding stage, adding these scorers will clearly improve performance, but we haven't yet provided deployment with language model and CTC joint decoding.
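To illustrate what adding scorers means at decoding time: joint decoding is commonly done by taking a weighted sum of the per-token log-scores from the attention decoder, the CTC scorer and the language model before the beam is pruned. The sketch below assumes that scheme; `combine_scores`, `ctc_weight` and `lm_weight` are hypothetical names and values, not the actual CTCScorer or lm_model scorer interface.

```python
from typing import List, Sequence


def combine_scores(
    att_log_probs: Sequence[float],
    ctc_log_probs: Sequence[float],
    lm_log_probs: Sequence[float],
    ctc_weight: float = 0.3,  # assumed value, tune on a dev set
    lm_weight: float = 0.2,   # assumed value, tune on a dev set
) -> List[float]:
    """Weighted sum of per-token log-scores from three scorers."""
    if not (len(att_log_probs) == len(ctc_log_probs) == len(lm_log_probs)):
        raise ValueError("all scorers must cover the same vocabulary")
    return [
        (1.0 - ctc_weight) * a + ctc_weight * c + lm_weight * l
        for a, c, l in zip(att_log_probs, ctc_log_probs, lm_log_probs)
    ]
```

With both weights set to 0 this reduces to the plain attention-decoder scores, which matches the training-stage behaviour described above.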
As I understand it, after exporting the pb file and using the C++ demo to transcribe, it doesn't use beam_search_decoder or wfst_decoder; it just outputs the transformer decoder result directly. Am I right?
If so, could anyone give some guidance on using beam_search or wfst in deployment mode? Thanks a lot.