google / trax

Trax — Deep Learning with Clear Code and Speed
Apache License 2.0

Reformer Model for Speech Recognition #439

Open stefan-falk opened 4 years ago

stefan-falk commented 4 years ago

Coming from tensor2tensor, I was wondering whether the Reformer model would also be a candidate for speech recognition. Looking at the examples, there is none for ASR.

Would it be possible to train an ASR model on the Reformer as is, or would code changes be necessary? If so, can we estimate how much of the model implementation would have to change?

Thank you for any insight into this!

lukaszkaiser commented 4 years ago

I believe Reformer (esp. with SRU as feed-forward, which is an hparam already) should make a nice ASR model. Didn't have time to work on it yet, but it'd be great to try!

lukaszkaiser commented 4 years ago

I don't think many changes are needed in terms of the model, but the input pipeline may need some thought. I believe that just feeding bytes could work, but it needs experimentation to see...
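
One possible reading of "just feeding bytes" is to serialize the raw audio samples and treat each byte as a token id. The following is a minimal sketch of that idea under stated assumptions, not something taken from Trax: it assumes 16-bit little-endian PCM, uses only NumPy, and the helper name `pcm16_to_byte_tokens` and the choice to reserve id 0 for padding are hypothetical.

```python
import numpy as np


def pcm16_to_byte_tokens(waveform, offset=1):
    """Convert a float waveform in [-1, 1] into a sequence of byte token ids.

    Hypothetical helper, not part of Trax. Each sample is quantized to
    16-bit PCM and split into two bytes (little-endian), so the output
    is twice as long as the input.
    """
    clipped = np.clip(np.asarray(waveform, dtype=np.float64), -1.0, 1.0)
    # Quantize to signed 16-bit integers with an explicit little-endian layout.
    pcm = np.round(clipped * 32767.0).astype("<i2")
    # Reinterpret the sample buffer as unsigned bytes in the range 0..255.
    raw = np.frombuffer(pcm.tobytes(), dtype=np.uint8)
    # Shift ids by `offset` so that id 0 stays free for padding (assumption).
    return raw.astype(np.int32) + offset


# Three samples become six tokens; the vocabulary size would be 256 + offset.
tokens = pcm16_to_byte_tokens(np.array([0.0, 1.0, -1.0]))
print(tokens.tolist())
```

With this encoding, one second of 16 kHz audio becomes 32,000 tokens, so sequence length grows quickly. Whether a model learns well from such input is exactly the open question that would need experimentation.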

stefan-falk commented 4 years ago

It would be interesting to see. I guess translation (or text2text problems in general) could be tested as well?

RegaliaXYZ commented 4 years ago

@stefan-falk Have you managed to try the ASR problem?

stefan-falk commented 4 years ago

@RegaliaXYZ I didn't try it yet, sorry.