Closed by kimkwangho82 9 years ago
Thanks for bringing this to my notice. Unfortunately, my code does not work for longer sequences. You will need to implement things in log space, which is slower. Take a look at this. I think that code works for TIMIT. In the meantime I will try to add log-space CTC.
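For anyone wondering what "implement things in log space" might look like, here is a minimal sketch of the CTC forward pass done with log-probabilities. It is plain Python rather than the Theano scan used in this repository, and the names `logsumexp`, `ctc_neg_log_likelihood`, `log_probs` and `blank` are my own, hypothetical choices, not this project's API. It assumes `log_probs[t][k]` is the log of the softmax output for class `k` at frame `t`.

```python
import math

NEG_INF = float("-inf")

def logsumexp(*xs):
    """Stable log(sum(exp(x))) that also handles all-(-inf) inputs."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))

def ctc_neg_log_likelihood(log_probs, labels, blank=0):
    """Negative log-likelihood of `labels` under CTC, computed in log space.

    log_probs: list of T frames, each a list of per-class log-probabilities.
    labels:    target label indices, without blanks.
    """
    # Extended sequence: blank, l1, blank, l2, ..., blank
    ext = [blank]
    for label in labels:
        ext += [label, blank]
    S, T = len(ext), len(log_probs)

    # Initialise: a path may start in the first blank or the first label.
    alpha = [NEG_INF] * S
    alpha[0] = log_probs[0][ext[0]]
    if S > 1:
        alpha[1] = log_probs[0][ext[1]]

    for t in range(1, T):
        new = [NEG_INF] * S
        for s in range(S):
            terms = [alpha[s]]                      # stay in the same state
            if s >= 1:
                terms.append(alpha[s - 1])          # advance by one state
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                terms.append(alpha[s - 2])          # skip the blank in between
            new[s] = logsumexp(*terms) + log_probs[t][ext[s]]
        alpha = new

    # A valid path ends in the last label or the trailing blank.
    total = logsumexp(alpha[-1], alpha[-2]) if S > 1 else alpha[-1]
    return -total
```

Because products become sums and sums go through `logsumexp`, the forward variables stay finite even when the corresponding probabilities are far too small to represent directly. The price is an `exp` and a `log` per state and frame, which is presumably where the slowdown comes from.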
I added support for log-space. Please see if you have any success with that feature.
When I trained on digits I got better results. When I used sentences with a larger number of frames, I got worse results. The longer sequences are not even aligned to any label.
I think the problem still exists...
Note: I used the latest code in trunk with log-space.
Hello!
We have a problem with long sequences (over 300 frames) in speech recognition on TIMIT data.
In general, speech recognition uses feature data spanning many frames, for example 300 frames for a 3-second utterance. When we analyzed the scan function in your 'ctc.py', the probabilities variable appears to be computed as zero for sequences of over 300 frames, and the 'cost' variable shows as 'Inf'.
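One plausible reading of this symptom is floating-point underflow. A small standalone illustration follows. It is not taken from 'ctc.py', and the per-frame probability of 0.01 and the helper names are made-up values chosen only to show the effect.

```python
import math

def naive_path_probability(per_frame_prob, num_frames):
    """Multiply one probability per frame, as a linear-space recursion does."""
    p = 1.0
    for _ in range(num_frames):
        p *= per_frame_prob
    return p

def log_path_probability(per_frame_prob, num_frames):
    """Same quantity, accumulated as a sum of logs instead of a product."""
    return num_frames * math.log(per_frame_prob)

def neg_log(p):
    """Cost of a probability; a probability of exactly zero gives infinity."""
    return float("inf") if p == 0.0 else -math.log(p)

short = naive_path_probability(0.01, 30)     # small but still representable
long_ = naive_path_probability(0.01, 300)    # underflows to exactly 0.0
print(short, long_, neg_log(long_))
print(log_path_probability(0.01, 300))       # finite in log space
```

With 64-bit floats the 300-frame product collapses to zero, so its negative log is infinite, while the log-space sum stays finite. With 32-bit floats the collapse would happen after far fewer frames.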
How can we handle this problem? Do you have any suggestions?
I look forward to your comments.
Best regards.