alumae / gst-kaldi-nnet2-online

GStreamer plugin around Kaldi's online neural network decoder
Apache License 2.0

Questions regarding nnet2 decoding on gstreamer server #68

Closed: feddyfedfed closed this issue 5 years ago

feddyfedfed commented 6 years ago

If the nnet3 decoding implementation is based on online2bin/online2-wav-nnet3-latgen-faster.cc, is it also the case that the nnet2 implementation is based on online2-wav-nnet2-latgen-faster? How does the server performance using nnet2 compare with nnet-latgen-faster?

I've been evaluating our models using both the server and online2-wav-nnet2-latgen-faster, but I can't get even roughly similar performance from the two. My basis for comparison is a set of utterances that are perfectly decoded by competing systems and by online2-wav-nnet2-latgen-faster, but are consistently erroneous with the server. I understand that there will be discrepancies and variations in performance caused by dithering, but is it also possible that the server is doing something to the speech data that differs from what happens in offline decoding?

alumae commented 6 years ago

Yes, the nnet2 implementation is based on online2-wav-nnet2-latgen-faster, and the two should provide similar performance. If there are large differences, it could be because of a bug. If you could provide a model and the files, and show the differences, we could try to debug it.
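
One possible way to show the differences is to score the offline and server hypotheses against the same reference transcripts and list the utterances where the server does worse. The sketch below is only an illustration and is not part of the plugin or of Kaldi. It assumes Kaldi-style transcript lines of the form `<utt-id> <word> ...`, and the function names (`parse_transcripts`, `word_errors`, `server_regressions`) are hypothetical.

```python
def parse_transcripts(text):
    """Parse lines of the assumed form '<utt-id> <word> <word> ...' into a dict."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if parts:
            result[parts[0]] = parts[1:]
    return result


def word_errors(ref, hyp):
    """Word-level Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # reference word missing from hypothesis
                cur[j - 1] + 1,          # extra word in hypothesis
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = cur
    return prev[-1]


def server_regressions(ref, offline, server):
    """Return (utt_id, offline_errors, server_errors) where the server is worse."""
    rows = []
    for utt, words in sorted(ref.items()):
        off = word_errors(words, offline.get(utt, []))
        srv = word_errors(words, server.get(utt, []))
        if srv > off:
            rows.append((utt, off, srv))
    return rows


if __name__ == "__main__":
    # Toy transcripts, made up for illustration only.
    ref = parse_transcripts("utt1 hello world\nutt2 good morning everyone")
    offline = parse_transcripts("utt1 hello world\nutt2 good morning everyone")
    server = parse_transcripts("utt1 hello world\nutt2 good morning")
    for utt, off, srv in server_regressions(ref, offline, server):
        print(f"{utt}: offline errors={off}, server errors={srv}")
```

A per-utterance list like this, together with the model and the audio files, would make it easier to reproduce a discrepancy than an aggregate error rate alone.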