tayciryahmed closed this issue 4 years ago.
The current Joint CTC-Attention model in OpenSeq2Seq is unfinished work and therefore has no documentation in the docs, so unfortunately we do not have good suggestions for this model.
We have a similar Conv/TDNN -> RNN architecture as part of NeMo, which you can find in garnet.py, with its corresponding config in garnet.yaml.
Hopefully those parameters can help guide you toward a working implementation inside OpenSeq2Seq.
I am testing the Joint CTC-Attention model with the config available in example_configs/speech2text/jca_large_8gpus.py. I have tried a range of learning rates (1e-6 to 1e-2, with a small batch size and 1 GPU), but the models don't seem to converge. Is there any recipe for training this architecture?
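Since no official recipe exists in the repo, here is a minimal sketch of two ingredients that commonly help joint CTC-attention models converge: the interpolated joint objective (a weighted sum of the CTC and attention losses) and a linear learning-rate warmup. The weight `lam` and the values `base_lr` and `warmup_steps` are illustrative assumptions, not taken from any OpenSeq2Seq or NeMo config.

```python
def joint_ctc_attention_loss(ctc_loss, attention_loss, lam=0.3):
    """Interpolated joint objective: lam * CTC + (1 - lam) * attention.

    lam is typically chosen in the 0.2-0.5 range; 0.3 here is an
    illustrative default, not a value from the OpenSeq2Seq config.
    """
    return lam * ctc_loss + (1.0 - lam) * attention_loss


def warmup_lr(step, base_lr=1e-3, warmup_steps=4000):
    """Linear learning-rate warmup, ramping from 0 to base_lr.

    Attention-based ASR models often diverge without a warmup phase;
    base_lr and warmup_steps are hypothetical starting points to sweep.
    """
    return base_lr * min(1.0, step / float(warmup_steps))
```

If the model still diverges with a warmup in place, lowering `lam` shifts more weight onto the attention branch, while raising it leans on the more stable CTC loss early in training.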