When you train, are the elements in labels.json letters or words?
@xieyidi It's a list of all the letters plus the special symbols, as Sean Naren mentions in the repo.
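For reference, here is a minimal sketch of how a labels.json like this is typically consumed, assuming the deepspeech.pytorch convention that the first entry "_" is the CTC blank (check your own fork; the lookup and `encode` helper below are illustrative, not the repo's code):

```python
import json

# Load the character list; by convention the first entry "_" is the CTC blank.
with open("labels.json", encoding="utf-8") as f:
    labels = json.load(f)

# Map each character to its integer index for encoding transcripts.
char_to_index = {ch: i for i, ch in enumerate(labels)}

def encode(transcript):
    # Characters missing from labels.json are silently dropped here; in practice
    # that is a common source of label corruption, so consider raising instead.
    return [char_to_index[ch] for ch in transcript.lower() if ch in char_to_index]

print(encode("merhaba dünya"))
```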
I encountered the same issue, @raotnameh. My labels.json is below:
["_", "h", "d", "a", "0", "g", "8", "n", "k", "x", "v", "r", "p", "o", "j", "c", "9", "i", "5", "4", "ş", "q", "b", "ü", "7", "6", "y", "s", "w", "u", "2", "3", "1", "e", "t", "l", "ç", "ı", "f", "z", "m", "ö", "ğ", " "]
My dataset includes Turkish special characters and digits, but it does not contain any punctuation. After 1 epoch the loss turned NaN, and when I ran inference on my validation set it returned only empty strings, with both WER and CER at 100%.
@xieyidi @SeanNaren do you have any idea why this happens after one epoch on a non-English dataset?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I think it may be caused by a difference in sample rate between the training data and the inference data.
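If you suspect a sample-rate mismatch, a quick check and resample with torchaudio might look like the sketch below (the 16 kHz target and the helper name are assumptions, not something from the repo):

```python
import torchaudio

TARGET_SR = 16000  # assumption: the model was trained on 16 kHz audio

def load_resampled(path, target_sr=TARGET_SR):
    waveform, sr = torchaudio.load(path)
    if sr != target_sr:
        # Resample inference audio to match the training sample rate.
        resample = torchaudio.transforms.Resample(orig_freq=sr, new_freq=target_sr)
        waveform = resample(waveform)
    return waveform, target_sr
```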
@atifemreyuksel Hi, I was able to solve it. Do this: train the model with a batch size of 1 and log the CTC loss for every file. Then remove the files with NaN loss from the training CSV and train on the remaining files. The reason for the empty string is that once the loss turns NaN, everything after it goes bananas.
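A rough sketch of that filtering step, assuming you logged a per-file loss to a hypothetical losses.csv (wav_path,loss per row) during the batch-size-1 run, and that your manifest is a wav_path,txt_path CSV as in older deepspeech.pytorch releases:

```python
import csv
import math

# Collect the utterances whose CTC loss came out as NaN.
bad_files = set()
with open("losses.csv", newline="") as f:
    for row in csv.reader(f):
        if len(row) >= 2 and math.isnan(float(row[1])):
            bad_files.add(row[0])

# Keep only the manifest rows that never produced a NaN loss.
with open("train_manifest.csv", newline="") as src, \
     open("train_manifest_clean.csv", "w", newline="") as dst:
    writer = csv.writer(dst)
    for row in csv.reader(src):
        if row and row[0] not in bad_files:
            writer.writerow(row)
```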
Thank you @raotnameh. I also cleaned the data, which reduced the chance of getting a NaN loss.
The second thing I changed was using the Adam optimizer instead of SGD. Interestingly, this change let me get rid of the NaN losses and training crashes entirely.
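The swap itself is a one-liner in PyTorch; a minimal sketch (the learning rate, weight decay, and the stand-in `model` are placeholders for whatever your training script already defines, not values from the repo):

```python
import torch

model = torch.nn.Linear(161, 29)  # stand-in for the actual DeepSpeech model

# Replace the SGD optimizer with Adam for more stable CTC training.
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4, weight_decay=1e-5)

# What it replaces in a typical script:
# optimizer = torch.optim.SGD(model.parameters(), lr=3e-4, momentum=0.9, nesterov=True)
```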
@atifemreyuksel Yeah, Adam is preferred over vanilla SGD if the loss isn't stable. Thanks for pointing it out.
Let's make Adam an option in deepspeech.pytorch; we've also seen better stability using it.
@raotnameh Hi, I have a question. Did you find any pattern in the wav files that caused the NaN loss problem? I got a NaN loss error after 10+ epochs with SpecAugment and tempo/gain perturbation.
I'm trying to figure out what exactly causes the NaN loss, but so far I have no clue. My hunch is that some augmentation morphs a wav file in a way that triggers it.
@kouohhashi Hi, I did not manually check the files that caused the problem; I just removed them from training.
But in my experience, look out for files with very short durations (e.g., less than 0.5 seconds). A quick way to flag them is sketched below.
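Here is that sketch, assuming a wav_path,txt_path manifest and using soundfile to read durations (the 0.5 s threshold is just the heuristic mentioned above, tune it for your data):

```python
import csv
import soundfile as sf

MIN_DURATION = 0.5  # seconds; heuristic threshold

# List manifest entries that are suspiciously short and worth removing or checking.
with open("train_manifest.csv", newline="") as f:
    for row in csv.reader(f):
        if not row:
            continue
        duration = sf.info(row[0]).duration
        if duration < MIN_DURATION:
            print(f"{row[0]}: {duration:.2f}s")
```

Very short clips are a plausible culprit because CTC loss becomes infeasible (and can blow up) when the target transcript is longer than the number of output frames.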
I'm training on my own data; the loss decreases during training, but at inference time both the WER and the CER come out at 100%. When I check the output, it's always an empty string.
Additionally, when I run inference with a model trained on LibriSpeech it works and I get a WER of 54. I'm not sure why a model trained on my own data gives an empty string and a WER of 100%.
Any comments on why it's happening?
FYI, the dataset is in English.
Hello @raotnameh, I am using your repo to train an end-to-end NER-from-speech model and I get the same issue. Can you tell me what the solution is, please? Thanks in advance.