alumae / kaldi-gstreamer-server

Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
BSD 2-Clause "Simplified" License

obtaining final lattices #64

Open calderma opened 7 years ago

calderma commented 7 years ago

Hello, I was wondering if there is a way to get the full lattice as output from this program, specifically for doing keyword search (KWS) later on. I am looking for something like the output produced by a command along the lines of:

online2-wav-nnet3-latgen-faster --online=true --frame-subsampling-factor=3 --config=$CONFIG $GRAPH_DIR/HCLG.fst "ark:echo utterance-id1 utterance-id1|" "scp:echo utterance-id1 1.wav |" ark:1.lat

Thank you for your help!

alumae commented 7 years ago

Currently there is no way to obtain lattices and I have no plans to implement this.

calderma commented 7 years ago

That's okay, I figured it out. Thank you for your reply, though.

rohithkodali commented 7 years ago

Hi @calderma, were you able to do it?

calderma commented 7 years ago

Yes, it takes only about 5 lines of code to obtain the final lattice after decoding has finished. I also ended up implementing some code that allows keyword search during streaming decoding, so it does the KWS on the fly as each "piece" of the lattice is processed.

calderma commented 7 years ago

To clarify, the short code snippet that I used was fairly specific to what I am using it for. I don't have the code in front of me at the moment, but I believe it was the "WriteCompactLattice" function in Kaldi that I used. I just inserted a call to it into the gst_kaldinnet2onlinedecoder_final_result function in the gstkaldinnet2onlinedecoder.cc file. Making it general enough for other uses would require more coding.

psukys commented 6 years ago

@calderma, if it's still possible, could you share the exact code? I've seen that you have a fork of the gst-kaldi-nnet2-online repository, but you haven't made any changes to it. Thanks in advance!