Open likestudy opened 5 years ago
I think that since you are getting output you are using the model correctly. The issue here is that the training hasn't converged. CTC is usually tricky to optimise so the training might not be as smooth as with other loss functions. Unfortunately, I cannot tell if the MFCC features are correct but if you followed the instructions they are most likely fine.
With that said I think you should try the following:
Make sure that your labels are correctly assigned during the training.
Once the training loss has stopped decreasing for a few epochs (10-20), try interrupting the training and restarting from the checkpoint.
Play around with the learning rate and the clipping value.
Increase the early stopping patience.
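As a rough illustration of the first and last suggestions, here is a framework-independent sketch in plain Python. It is not code from this repository: the names `ctc_min_input_length`, `check_ctc_sample`, and `PlateauTracker` are hypothetical, and the assumption that labels are integer class indices with one reserved blank index may not match your setup.

```python
def ctc_min_input_length(label):
    """Minimum number of input frames CTC needs to emit `label`.

    One frame per label, plus one blank frame between each pair of
    identical adjacent labels.
    """
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats


def check_ctc_sample(num_frames, label, num_classes, blank_index):
    """Return a list of problems found for one (input, label) pair.

    Assumes `label` is a list of integer class indices (hypothetical layout).
    """
    problems = []
    if not label:
        problems.append("empty label sequence")
    if any(t < 0 or t >= num_classes for t in label):
        problems.append("label index out of range")
    if blank_index in label:
        problems.append("blank index used as a target label")
    if num_frames < ctc_min_input_length(label):
        problems.append("too few input frames for this label sequence")
    return problems


class PlateauTracker:
    """Counts epochs since the loss last improved by more than min_delta."""

    def __init__(self, patience, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.stale_epochs = 0

    def update(self, loss):
        """Record one epoch's loss; return True once patience is exhausted."""
        if loss < self.best - self.min_delta:
            self.best = loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience
```

Running `check_ctc_sample` over every training pair before starting is a cheap way to catch mis-assigned labels, and a tracker like this with a patience of 10-20 can signal when to restart from the checkpoint instead of stopping outright.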
Hello everyone: I am trying to run this experiment as the README describes, but while training the audio network, the loss stays around 150. When I tried to use this model for prediction, I got nothing but 'sil'. Is there a problem with the features that I extracted, or am I using the model in the wrong way? The features I use are just the 39 MFCCs extracted by HCopy.
Has anyone ever met this problem? Or can anyone tell me the results of your models? I'm so confused, and I can't figure out the problem.
Here I attach the features that I extracted for one file. Thanks very much.
audio_1.csv.zip
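A quick sanity check on a feature file like this could look like the sketch below. It assumes, without having inspected the attachment, that the unzipped CSV has one frame per row, 39 numeric values per row, and no header line. The function names are hypothetical.

```python
import csv
import math


def check_mfcc_rows(rows, expected_dim=39):
    """Return (num_frames, problems) for an iterable of CSV rows.

    Assumes one frame per row and no header row (hypothetical layout).
    """
    problems = []
    num_frames = 0
    for i, row in enumerate(rows):
        if not row:
            continue  # csv.reader yields [] for blank lines
        num_frames += 1
        if len(row) != expected_dim:
            problems.append(f"row {i}: {len(row)} values, expected {expected_dim}")
            continue
        try:
            values = [float(v) for v in row]
        except ValueError:
            problems.append(f"row {i}: non-numeric value")
            continue
        if not all(math.isfinite(v) for v in values):
            problems.append(f"row {i}: NaN or infinite value")
    if num_frames == 0:
        problems.append("no frames found")
    return num_frames, problems


def check_mfcc_csv(path, expected_dim=39):
    """Run the same checks on a CSV file on disk."""
    with open(path, newline="") as f:
        return check_mfcc_rows(csv.reader(f), expected_dim)
```

A clean result only shows that the file is well-formed. It does not confirm that the HCopy configuration matches what the model expects.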