facebookresearch / gtn

Automatic differentiation with weighted finite-state transducers.
MIT License

Full end to end example for RNN-CTC-WFST #29

Closed by babadashagua 4 years ago

babadashagua commented 4 years ago

Do you have any plans to open-source a complete example of an RNN-CTC-WFST recognizer? Currently, most existing examples cover only individual parts rather than a full recognizer pipeline.

Thank you!

galv commented 4 years ago

FWIW, I will have one shortly at https://github.com/galv/lingvo-copy, built with OpenFst and https://github.com/thu-spmi/CAT/ (in that sense, it is far less ambitious than this project). This will most likely form the basis of the MLPerf speech recognition inference benchmark in v1.0, so you can expect good support, but the MLPerf benchmark reference models won't be finalized until the end of December.

Note that I am not associated with gtn. Just passing by.

awni commented 4 years ago

> Do you have any plans to open-source a complete example of an RNN-CTC-WFST recognizer? Currently, most existing examples cover only individual parts rather than a full recognizer pipeline.

We plan to open-source some code that does speech recognition and handwriting recognition using GTN+PyTorch in a separate repository. GTN itself will not be specific to any one application. However, we don't yet have plans to release anything nearly as comprehensive as Kaldi or wav2letter (e.g. multiple datasets with SOTA benchmarks).

awni commented 4 years ago

> This will most likely form the basis of the MLPerf speech recognition inference benchmark in v1.0, so you can expect good support, but the MLPerf benchmark reference models won't be finalized until the end of December.

@galv I'd be interested to learn more about your plans here. If I can help out at all, or if you are interested in support on the GTN side, please let me know; we would be happy to provide it. You can shoot me a note at awni@fb.com.