Closed Giuseppe5 closed 4 years ago
Thank you for the question!
The reason behind this difference in WER is that we add a small amount of noise during training (dithering). To get exactly the same predictions during inference, please set the dithering gain factor to zero, either in code with `model_definition['AudioToMelSpectrogramPreprocessor']['dither'] = 0`
or in YAML config file: https://github.com/NVIDIA/NeMo/blob/7c3081c4fa94e962507d47d5ec652f62dc10894f/examples/asr/configs/quartznet15x5.yaml#L22
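To illustrate, here is a minimal sketch of zeroing the dither gain before inference. It assumes the YAML config has already been loaded into a nested dict called `model_definition` (the structure and placeholder values below are illustrative, not copied from the actual config file):

```python
# Sketch: the ASR config, once loaded from YAML (e.g. quartznet15x5.yaml),
# is a nested dict. The 'dither' key in the preprocessor section controls
# the gain of the noise added during training.
model_definition = {
    "AudioToMelSpectrogramPreprocessor": {
        # Placeholder values for illustration only:
        "window_size": 0.02,
        "dither": 1e-5,  # nonzero dither makes inference slightly nondeterministic
    },
}

# Set the dithering gain to zero so repeated inference runs on the same
# audio produce identical predictions (and hence a reproducible WER).
model_definition["AudioToMelSpectrogramPreprocessor"]["dither"] = 0

print(model_definition["AudioToMelSpectrogramPreprocessor"]["dither"])
```

With dither disabled, evaluating the same checkpoint on the same data should give a stable WER from run to run.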
Thanks for the info!
Hi @vsl9, I have a similar issue, but my performance gets worse after loading the model for inference once fine-tuning is done. Is there a reason for this, and a way to improve the inference performance?
Hello,
I'm currently fine-tuning QuartzNet. At the end of the training process, it computes the WER on the evaluation set. However, if I then just load the checkpoints and run only the evaluation, without any training, the WER I obtain is slightly different from the one reported during the training phase.
Is there any reason for this?
Thanks for your help!