Open arthur-compton opened 2 years ago
I just tried this repo and I agree. It seems that either the wrong config file was uploaded or there's a regression in the repo (or in TensorFlow). Any tips on what's happening here would be greatly appreciated.
Same issue here. Adam with warmstep-40000 didn't learn anything. @usimarit, could you take a look at the code?
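For anyone who wants to sanity-check the warmup setting mentioned above, here is a minimal sketch of what a 40000-step warmup does to the learning rate. It assumes the standard Transformer-style schedule (linear warmup followed by inverse square root decay); whether the rnn_transducer config actually uses this schedule is an assumption, and `warmup_lr` and `d_model=144` are hypothetical names and values chosen only for illustration.

```python
def warmup_lr(step, d_model=144, warmup_steps=40000):
    """Transformer-style learning rate: linear warmup, then inverse-sqrt decay.

    d_model=144 is a hypothetical value; warmup_steps=40000 is the setting
    mentioned in this thread. Not taken from the repo's code.
    """
    step = max(step, 1)  # avoid 0 ** -0.5 at the very first step
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


# Print the learning rate at a few points to see how slowly it ramps up.
for step in (1, 1000, 10000, 40000, 100000):
    print(f"step {step:>6}: lr = {warmup_lr(step):.2e}")
```

Under these assumptions the learning rate peaks exactly at step 40000 and is far smaller during the first few thousand steps, so a run that is stopped or evaluated early may look like it is not learning at all. That may or may not be related to the problem reported here.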
I've been running the examples in the "conformer/" and in the "rnn_transducer/" directories and comparing the models with those already provided on drive.
The conformer training works as expected, and the results of the model I trained are almost identical to those of the pretrained model (I am training on the three LibriSpeech training sets, 960 hours in total).
The training in the rnn_transducer example, however, doesn't converge to anything usable. I've tried the configuration in the codebase and the slightly different configuration on drive. In both cases the loss decreases only slightly during training, far too little, so the final model has not learnt much.
My guess is that something is broken in the rnn_transducer example. Has anyone tried it with a recent version of the code? I've tried versions 1.0.3 (TF 2.6), 1.0.1, and 1.0.0 (TF 2.4.1): in all cases the training doesn't converge.
Any suggestion is very much appreciated!