On this page: https://nvidia.github.io/OpenSeq2Seq/html/speech-recognition/deepspeech2.html

For WER=6.71%, it says the result comes from "a pre-trained model which was trained for 200 epochs". However, the referenced configuration file, "ds2_large_8gpus_mp", uses 8 GPUs but sets "num_epochs": 50.
I trained for 50 epochs on 4 GPUs and got WER=8.5%. Was the reference WER of 6.71% obtained after 200 epochs or after 50?
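For reference, here is the part of the config I mean, as a minimal sketch; the surrounding field names are assumed from OpenSeq2Seq's usual base_params layout, so check the actual ds2_large_8gpus_mp file:

```python
# Hypothetical excerpt of the ds2_large_8gpus_mp config
# (field names assumed; only "num_epochs" is the value in question).
base_params = {
    "num_gpus": 8,      # docs pair the 6.71% WER result with this config
    "num_epochs": 50,   # but the docs describe a model trained for 200 epochs
}

# If 200 epochs is what produced the pre-trained checkpoint,
# reproducing it would presumably require:
base_params["num_epochs"] = 200
```

If the answer is 200 epochs, it would help to note that in the docs next to the config link, since training with the file as shipped gives a noticeably higher WER.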