facebookresearch / gtn

Automatic differentiation with weighted finite-state transducers.
MIT License
453 stars, 40 forks

Differentiable beam search with GTN-WFST #27

Status: Closed (closed by zhwa 3 years ago)

zhwa commented 3 years ago

Hi,

I noticed that in 2019 the authors published a paper on a differentiable beam search decoder. With GTN, is it possible to implement a speech or handwriting recognizer in PyTorch that combines an RNN, CTC, a WFST, and beam search decoding?

Previously (both in Kaldi and in the 2019 paper), building such an end-to-end recognizer required extra code in C++, shell, and Python, so it would be great if everything could be done in PyTorch. If that is possible, is there any chance that a full end-to-end demo or toy example could be provided in the future?

Thanks!

awni commented 3 years ago

Yes, this is exactly one of our goals. We don't yet have a full implementation of what's in our differentiable decoder ICML paper. We do have a codebase that implements CTC with frame-level transitions for speech and handwriting recognition using only PyTorch and GTN, and we are getting it ready to open source. We also have an ongoing effort to use token-level transitions instead of frame-level transitions.
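
For readers unfamiliar with the graph view of CTC mentioned above, here is a minimal sketch in plain Python. It does not use the GTN or PyTorch APIs and is not the codebase referred to in this thread. The function names (`ctc_graph`, `ctc_forward_score`, `logsumexp`) and the blank index `0` are assumptions made for illustration. It builds the CTC alignment graph for a target sequence and computes the forward score, which is the log-sum over all accepting paths when one arc is taken per frame.

```python
import math


def logsumexp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def ctc_graph(target, blank):
    """Return (arcs, accept, n_nodes) for the CTC alignment graph of `target`.

    Even nodes emit the blank, odd nodes emit a target token. Node 0 is the
    start node; the last two nodes are accepting.
    """
    n_nodes = 2 * len(target) + 1
    arcs = []
    for s in range(n_nodes):
        label = target[s // 2] if s % 2 else blank
        arcs.append((s, s, label))          # self-loop: repeat the label
        if s > 0:
            arcs.append((s - 1, s, label))  # advance from the previous node
        # Skip the optional blank between two different tokens.
        if s % 2 and s > 1 and label != target[s // 2 - 1]:
            arcs.append((s - 2, s, label))
    accept = {n_nodes - 1}
    if n_nodes > 1:
        accept.add(n_nodes - 2)
    return arcs, accept, n_nodes


def ctc_forward_score(emissions, target, blank=0):
    """Log-probability of `target` under CTC.

    `emissions` is a list of frames, each a list of per-class log-probs.
    Exactly one arc is consumed per frame (frame-level transitions).
    """
    arcs, accept, n_nodes = ctc_graph(target, blank)
    alpha = [-math.inf] * n_nodes
    alpha[0] = 0.0
    for frame in emissions:
        nxt = [-math.inf] * n_nodes
        for src, dst, label in arcs:
            if alpha[src] > -math.inf:
                nxt[dst] = logsumexp(nxt[dst], alpha[src] + frame[label])
        alpha = nxt
    score = -math.inf
    for node in accept:
        score = logsumexp(score, alpha[node])
    return score


# Two frames, two classes (blank=0, token=1), uniform probabilities.
emissions = [[math.log(0.5), math.log(0.5)] for _ in range(2)]
score = ctc_forward_score(emissions, [1])
```

With the uniform two-frame input above, three alignments collapse to the single-token target (token-token, blank-token, token-blank), so `score` equals `math.log(0.75)`. In a differentiable framework the same score would be computed with graph operations so that gradients can flow back to the emissions; this sketch only shows the forward computation.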