Ihebzayen opened this issue 2 years ago
Hi @Ihebzayen. Can you please share what batch size and learning rate you are using?
Hello, I used the defaults from the repo, but with a batch size of 8, training on a single GPU:

```
python train.py --train-manifest data/ner/train.csv --val-manifest data/ner/dev.csv --cuda --rnn-type gru --hidden-layers 5 --momentum 0.95 --weights models/without_space.pth --opt-level O0 --loss-scale 1.0 --hidden-size 1024 --epochs 50 --lr 0.0051 --gpu-rank 0 --batch-size 8 --labels labels.json
```
Hello, when I tried to train the end-to-end model with labels.json, I got this loss (Loss 0.0000) throughout the whole first epoch, and even on the second epoch:

```
Validation Summary Epoch: [1]  Average WER 99.996  Average CER 99.292
```
```
Learning rate annealed to: 0.004636
Found better validated model, saving to models/deepspeech_final.pth
Shuffling batches...
WARNING: received an inf loss
Skipping grad update
Epoch: [2][1/32285]  Time 1.238 (0.664)  Data 0.161 (0.002)  Loss 0.0000 (10.3948)
```
Can someone help me out, please?
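For context, the `WARNING: received an inf loss / Skipping grad update` lines above typically come from a guard in the training loop that drops a batch whenever the CTC loss comes back infinite (which happens, for example, when a transcript is longer than the acoustic input can align to). Below is a minimal, self-contained sketch of such a guard in PyTorch; the toy GRU model, feature sizes, label count, and hyperparameters are illustrative assumptions, not the repo's actual code:

```python
import torch
import torch.nn as nn

# Hedged sketch of an inf-loss guard in a CTC training loop, in the
# spirit of deepspeech.pytorch's train.py. Everything below (model,
# shapes, 29-symbol label set) is a toy stand-in for illustration.
torch.manual_seed(0)
model = nn.GRU(input_size=40, hidden_size=64)   # toy acoustic model
proj = nn.Linear(64, 29)                        # 29 labels incl. CTC blank at index 0
criterion = nn.CTCLoss(blank=0)
optimizer = torch.optim.SGD(
    list(model.parameters()) + list(proj.parameters()), lr=0.0051, momentum=0.95
)

for step in range(3):
    x = torch.randn(50, 8, 40)                  # (T, N, features), batch of 8
    targets = torch.randint(1, 29, (8, 20))     # label indices, blank excluded
    input_lengths = torch.full((8,), 50, dtype=torch.long)
    target_lengths = torch.full((8,), 20, dtype=torch.long)

    optimizer.zero_grad()
    out, _ = model(x)
    log_probs = proj(out).log_softmax(dim=-1)   # CTC expects log-probabilities
    loss = criterion(log_probs, targets, input_lengths, target_lengths)

    # CTC returns inf when no valid alignment exists for a target;
    # stepping on an inf loss would corrupt the weights, so skip the batch.
    if not torch.isfinite(loss):
        print("WARNING: received an inf loss")
        print("Skipping grad update")
        continue
    loss.backward()
    optimizer.step()
    print(f"step {step}: loss {loss.item():.4f}")
```

One thing this sketch also illustrates: a loop like this can report `Loss 0.0000` while the running average stays high (the `0.0000 (10.3948)` in the log) if the per-batch loss is being zeroed or skipped, which is consistent with the near-100% WER/CER reported above.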