tayciryahmed closed this issue 4 years ago.
The current Joint CTC-Attention model in OpenSeq2Seq is unfinished work and therefore has no documentation in the docs, so unfortunately we do not have good suggestions for this model.
We have a similar Conv/TDNN -> RNN architecture as part of NeMo, which you can find in garnet.py, with its corresponding config in garnet.yaml.
Hopefully those parameters can help guide you toward a working implementation inside OpenSeq2Seq.
I am testing the Joint CTC-Attention model with the config available in example_configs/speech2text/jca_large_8gpus.py. I have tried a range of learning rates (1e-6 to 1e-2, with a small batch size and 1 GPU), but the models don't seem to converge. Is there any recipe for training this architecture?
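Since no official recipe exists in the repo, here is a minimal sketch of two ingredients that commonly help joint CTC-attention models converge: the interpolated joint objective (a weighted sum of the CTC and attention losses) and a linear learning-rate warmup. The weight `lam` and the values `base_lr` and `warmup_steps` are illustrative assumptions, not taken from any OpenSeq2Seq or NeMo config.

```python
def joint_ctc_attention_loss(ctc_loss, attention_loss, lam=0.3):
    """Interpolated joint objective: lam * CTC + (1 - lam) * attention.

    lam is typically chosen in the 0.2-0.5 range; 0.3 here is an
    illustrative default, not a value from the OpenSeq2Seq config.
    """
    return lam * ctc_loss + (1.0 - lam) * attention_loss


def warmup_lr(step, base_lr=1e-3, warmup_steps=4000):
    """Linear learning-rate warmup, ramping from 0 to base_lr.

    Attention-based ASR models often diverge without a warmup phase;
    base_lr and warmup_steps are hypothetical starting points to sweep.
    """
    return base_lr * min(1.0, step / float(warmup_steps))
```

If the model still diverges with a warmup in place, lowering `lam` shifts more weight onto the attention branch, while raising it leans on the more stable CTC loss early in training.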