mingchen62 opened this issue 6 years ago (status: Open)
Hmm, which dataset are you using? I haven't observed that repetition problem in im2text before, but repetition is a well-known problem in other seq2seq tasks like summarization, and people usually address it with a coverage penalty, which discourages attending to the same source position too many times.
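For reference, here is a minimal sketch of the GNMT-style length and coverage penalties (Wu et al., 2016), assuming `attention` is a hypothesis's `[target_len, source_len]` attention matrix; the function names are just illustrative, not the im2text API:

```python
import numpy as np

def length_penalty(length, alpha=0.6):
    # GNMT length normalization: lp(Y) = ((5 + |Y|)^alpha) / (6^alpha)
    return ((5.0 + length) ** alpha) / ((5.0 + 1.0) ** alpha)

def coverage_penalty(attention, beta=0.2):
    # attention: [target_len, source_len], one row of attention weights per
    # decoding step. Summing over steps gives the total attention mass each
    # source position received; positions left uncovered (mass < 1) pull the
    # score down, which penalizes hypotheses that keep re-attending to the
    # same few positions instead of covering the whole input.
    total = attention.sum(axis=0)
    return beta * np.sum(np.log(np.minimum(total, 1.0)))

def rescore(log_prob, attention, alpha=0.6, beta=0.2):
    # Beam hypotheses are ranked by s(Y) = log P(Y|X) / lp(Y) + cp(X; Y).
    steps = attention.shape[0]
    return log_prob / length_penalty(steps, alpha) + coverage_penalty(attention, beta)
```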
Thanks. I am using a handwritten formula dataset. I suspect the variability in spacing between handwritten symbols contributes to the repetition problem. I am also looking at the coverage penalty in OpenNMT-py, i.e. https://github.com/OpenNMT/OpenNMT-py/issues/340.
Will report back if I have any luck with that.
Tried a few combinations of the length and coverage penalty parameters: some made things worse, some gave minor improvements. More hyperparameter exploration may be needed, along the lines of https://arxiv.org/pdf/1703.03906.pdf.
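In case it's useful to others, this is the kind of toy grid search I ran over the two weights; `evaluate` is a hypothetical placeholder for your own decode-and-score loop, and the value ranges are just starting points, not recommendations from the paper:

```python
import itertools

def tune_penalties(evaluate, alphas=(0.0, 0.2, 0.6, 1.0), betas=(0.0, 0.2, 0.4)):
    # evaluate(alpha, beta) should run beam search with the given length
    # penalty alpha and coverage penalty beta, then return a validation
    # metric (e.g. exact-match accuracy). Plug in your own decoding code.
    best = None
    for alpha, beta in itertools.product(alphas, betas):
        score = evaluate(alpha=alpha, beta=beta)
        print(f"alpha={alpha:.1f} beta={beta:.1f} -> {score:.4f}")
        if best is None or score > best[0]:
            best = (score, alpha, beta)
    return best  # (best score, best alpha, best beta)
```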
After training, what BLEU score did you get? I didn't change anything, and accuracy increased by 3%.
After training, I started evaluating and found the predictions interesting. The trained model did well on some more complicated LaTeX such as fractions or square roots, but failed on some simpler formulas. For example, the ground truth is "y=x^2+2x +1" but the prediction is "y=x^2+2x +2x + 1"; the ground truth is "270" but the prediction is "2700". The decoder duplicates the last symbol(s). Any hint on how to tune the model to alleviate this issue?
My training results look reasonable:
Epoch: 11 Step 43142 - Val Accuracy = 0.923066 Perp = 1.137150
Epoch: 12 Step 47064 - Val Accuracy = nan Perp = 1.138024