Closed lifelongeek closed 8 years ago
I don't think it's a problem with rescaling. The rescaling and non-rescaling versions do the same thing. Can you check if you have empty labels in your training data? That is, the label sequence for an utterance is empty.
Found the reason. When I checked labels.tr, there was a label that exceeds the maximum number of lexicon units (i.e. 33). I should have checked this file first when I changed lexicon2.txt. Training no longer produces nan in Obj and TokenAcc.
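For anyone hitting the same symptom, here is a rough sketch of the two checks discussed above (empty label sequences and out-of-range label indices). It assumes each line of labels.tr is plain text of the form "utt_id l1 l2 ...", and that index 0 is reserved for the CTC blank so that valid labels run from 1 to num_outputs - 1. The function name `check_labels` is made up for illustration and is not part of the toolkit.

```python
def check_labels(lines, num_outputs):
    """Scan label lines for empty sequences and out-of-range indices.

    lines: iterable of "utt_id l1 l2 ..." strings (assumed file format).
    num_outputs: size of the network output layer; index 0 is assumed
                 to be the CTC blank, so valid labels are 1..num_outputs-1.
    Returns (empty, out_of_range), where empty is a list of utterance ids
    and out_of_range is a list of (utt_id, offending_labels) pairs.
    """
    empty, out_of_range = [], []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        utt_id = parts[0]
        labels = [int(x) for x in parts[1:]]
        if not labels:
            empty.append(utt_id)
        bad = [l for l in labels if l < 1 or l >= num_outputs]
        if bad:
            out_of_range.append((utt_id, bad))
    return empty, out_of_range


# Small in-memory example; with 34 output units the largest valid label is 33.
sample = ["utt1 3 7 33", "utt2", "utt3 5 34 2"]
empty, bad = check_labels(sample, num_outputs=34)
print("empty:", empty)
print("out of range:", bad)
```

To run it on the real file, pass an open file handle instead of the sample list, e.g. `with open("labels.tr") as f: check_labels(f, 34)`, adjusting the path and the unit count to your setup.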
Close the issue.
Hi, I am slightly modifying your character-based RNN+CTC experiment on swbd. I am trying to use a minimal set of character units (alphabet(26) + {space ' - } + noise + laugh + vocal-noise) instead of including all characters such as digits, & and _. Thus the RNN has 34 output units. For this experiment I had to modify lexicon2.txt and units.txt, which makes the transcriptions longer than before. For example, 260 : t w o - s i x t y, 401k : f o u r - o - o n e - k
However, this experiment consistently produces nan for Obj and TokenAcc, even when I try a smaller learning rate and various RNN architectures. I suspect this is because 'train-ctc-parallel' does not rescale alpha and beta during the forward-backward algorithm. The non-parallel version seems to use a rescaling kernel (i.e. _compute_ctc_alpha_one_sequence_rescale), but the parallel version uses code without rescaling.
Have you had a similar experience with nan errors? Did I miss the rescaling part in your code? I hope I have not made a mistake and bothered you too much.
Here are a few lines from the log:
VLOG1 After 20 sequences (0.000913889Hr): Obj(log[Pzx]) = -50.5868 TokenAcc = -nan%
VLOG1 After 40 sequences (0.00273056Hr): Obj(log[Pzx]) = nan TokenAcc = -260%
VLOG1 After 60 sequences (0.00498056Hr): Obj(log[Pzx]) = -59.5562 TokenAcc = -140.909%
VLOG1 After 80 sequences (0.00747778Hr): Obj(log[Pzx]) = -75.5068 TokenAcc = -34.9206%
VLOG1 After 100 sequences (0.0101417Hr): Obj(log[Pzx]) = -67.462 TokenAcc = -66.6667%
VLOG1 After 120 sequences (0.0129083Hr): Obj(log[Pzx]) = -31.2848 TokenAcc = 11.6279%
...
VLOG1 After 740 sequences (0.1354Hr): Obj(log[Pzx]) = 10.6927 TokenAcc = 0%
VLOG1 After 760 sequences (0.140061Hr): Obj(log[Pzx]) = 8.33837 TokenAcc = 0%
VLOG1 After 780 sequences (0.144731Hr): Obj(log[Pzx]) = 4.95938 TokenAcc = 0%
VLOG1 After 800 sequences (0.149453Hr): Obj(log[Pzx]) = 2.56755 TokenAcc = 0%
VLOG1 After 820 sequences (0.154206Hr): Obj(log[Pzx]) = 9.55445 TokenAcc = 0%