Closed 3 years ago
(0) Was your training loss (and WER) decreasing initially and then jumping up? (1) What training data do you use? (2) A batch size of 10 per GPU looks small to me. Can you increase the batch size and decrease the learning rate?
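When weighing the batch-size suggestion in (2), it helps to think in terms of the *effective* (global) batch size, which under `torch.distributed.launch` is the per-GPU batch size times the number of processes (times any gradient-accumulation steps). A minimal sketch of that arithmetic, using the values from the command in this issue; the larger per-GPU batch of 32 is a hypothetical example, not a value from the thread:

```python
# Effective (global) batch size for a DistributedDataParallel run:
# per-GPU batch size x number of GPUs x gradient-accumulation steps.

def global_batch_size(per_gpu_batch, num_gpus, grad_accum_steps=1):
    """Compute the effective batch size seen by the optimizer per step."""
    return per_gpu_batch * num_gpus * grad_accum_steps

# Values from the launch command: --batch_size=10, --nproc_per_node=4.
print(global_batch_size(10, 4))        # -> 40

# Hypothetical larger setting in the spirit of (2): batch 32 per GPU.
print(global_batch_size(32, 4))        # -> 128

# If memory is tight, gradient accumulation reaches the same effective
# batch without raising the per-GPU batch, e.g. batch 10 x 4 GPUs x 2 steps:
print(global_batch_size(10, 4, 2))     # -> 80
```

Learning-rate schedules and warmup behavior are usually tuned against this global number, which is why changing the per-GPU batch alone can destabilize training.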
I trained a QuartzNet model on LibriSpeech, but the predictions are wrong. What should I do? I used this command:
```shell
python -m torch.distributed.launch --nproc_per_node=4 \
    /data/NeMo/examples/asr/quartznet.py \
    --batch_size=10 --num_epochs=400 --lr=0.015 \
    --warmup_steps=8000 --weight_decay=0.001 \
    --train_dataset=data/train_all.json \
    --eval_datasets data/dev_clean.json data/dev_other.json \
    --model_config=/data/NeMo/examples/asr/configs/quartznet15x5.yaml \
    --exp_name=librispeech
```
The training log is: