niucheney closed this issue 7 years ago
You can simply look at the output of net-output-extract, which is a matrix with one row per frame and one column per token. In decode_ctc_lat.sh (or the other decoding scripts) it is piped into the decoder, which applies the language model; at that point you can store it or otherwise process it (for example, convert it to text with copy-feats or similar).
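If it helps, here is a minimal Python sketch of what you could do with that matrix once it has been dumped as text. It is not EESEN code and it does not reproduce the WFST decoder with the language model; it only does a greedy per-frame argmax and reports where each non-blank token peaks. It assumes the matrix was written in Kaldi text format (`utt_id  [ ... ]`) and that the blank token is column 0; both are assumptions you should check against your setup. The function names `parse_kaldi_text_matrices` and `find_ctc_peaks` are made up for this illustration.

```python
def parse_kaldi_text_matrices(lines):
    """Parse Kaldi text-format matrices ("utt_id  [ ... ]") into {utt_id: rows}."""
    matrices, utt_id, rows = {}, None, []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue
        if utt_id is None:
            # Header line: "<utt_id>  [", possibly followed by the first row.
            utt_id, tokens = tokens[0], tokens[1:]
            if tokens and tokens[0] == "[":
                tokens = tokens[1:]
        # A trailing "]" marks the last row of the current matrix.
        closing = bool(tokens) and tokens[-1] == "]"
        if closing:
            tokens = tokens[:-1]
        if tokens:
            rows.append([float(t) for t in tokens])
        if closing:
            matrices[utt_id] = rows
            utt_id, rows = None, []
    return matrices


def find_ctc_peaks(rows, blank_id=0):
    """Greedy best path: return (token_id, frame, score) for each non-blank run.

    blank_id=0 is an assumption. Works on posteriors or log-posteriors,
    since only the per-frame ordering of the scores matters.
    """
    peaks = []
    prev_label = None
    for frame, scores in enumerate(rows):
        label = max(range(len(scores)), key=scores.__getitem__)
        if label != blank_id:
            if label == prev_label:
                # Same token continues: keep the frame with the highest score.
                if scores[label] > peaks[-1][2]:
                    peaks[-1] = (label, frame, scores[label])
            else:
                peaks.append((label, frame, scores[label]))
        prev_label = label
    return peaks


if __name__ == "__main__":
    example = """utt1  [
      0.9 0.1 0.0
      0.2 0.7 0.1
      0.1 0.8 0.1
      0.9 0.05 0.05
      0.1 0.1 0.8 ]""".splitlines()
    for utt, rows in parse_kaldi_text_matrices(example).items():
        print(utt, find_ctc_peaks(rows))
```

Each tuple gives the token ID, the frame index where that token's score is highest within its run, and the score itself. Repeated frames of the same token are merged into one peak, and a blank frame between two identical tokens starts a new peak.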
Look at ErrorRateMSeq in https://github.com/srvk/eesen/blob/blank_scale_and_parallel_models/src/net/ctc-loss.cc. You can call train-ctc-parallel in validation mode with the "--sequence-out-file" option, which writes a file containing the IDs, temporal locations, and confidences of the peaks. This is what you want, right?
We'll integrate this into the main branch soon (hopefully).
@fmetze Thank you very much for the prompt reply. That option is very cool!
Hi, I just want to know which frames the CTC toolkit (EESEN) chooses when decoding. Can EESEN output them? I am looking forward to your reply. Best wishes!