harvardnlp / im2markup

Neural model for converting Image-to-Markup (by Yuntian Deng yuntiandeng.com)
https://im2markup.yuntiandeng.com
MIT License
1.21k stars 214 forks

Getting low accuracy using customized images for test. #48

Closed PeihanDou closed 3 years ago

PeihanDou commented 3 years ago

Hello Authors:

We modified your model, trained it on our own PCs, and got a pretty high BLEU score on the test dataset. We use a Transformer instead of an RNN or LSTM. But when we use the trained model to predict on local images (for example, a screenshot of a LaTeX formula), the results are not good. We tried data augmentation such as a random downsampling ratio and random Gaussian blur, but accuracy on local images is still low. Would you share any thoughts on this? I would really appreciate any advice. Thanks!
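For readers following along, the augmentation described above (random downsampling ratio plus random Gaussian blur) might look roughly like the sketch below. This is not the poster's actual code: the function names, the ratio and sigma ranges, and the nearest-neighbour resize are all assumptions made for illustration. Images are plain lists of grayscale rows so the example has no dependencies; in practice NumPy or Pillow would be the usual choice.

```python
import math
import random


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel covering about 3 sigma on each side."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(img, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(img[0])
    out = []
    for row in img:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(img):
    return [list(col) for col in zip(*img)]


def gaussian_blur(img, sigma):
    """Separable blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(img, kernel)), kernel))


def downsample(img, ratio):
    """Nearest-neighbour resize by `ratio`, with 0 < ratio <= 1."""
    h, w = len(img), len(img[0])
    new_h = max(1, round(h * ratio))
    new_w = max(1, round(w * ratio))
    return [[img[min(h - 1, int(y / ratio))][min(w - 1, int(x / ratio))]
             for x in range(new_w)]
            for y in range(new_h)]


def augment(img, rng, ratio_range=(0.5, 1.0), sigma_range=(0.3, 1.5)):
    """Apply a random downsample followed by a random blur (ranges are assumed)."""
    ratio = rng.uniform(*ratio_range)
    sigma = rng.uniform(*sigma_range)
    return gaussian_blur(downsample(img, ratio), sigma)
```

Calling `augment(img, random.Random(0))` returns a smaller, blurred copy of `img`. Passing the random generator in explicitly keeps the augmentation reproducible.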

da03 commented 3 years ago

Hi Peihan,

I think it's likely due to a mismatch between the training and test data. Neural networks are extremely sensitive to out-of-domain noise, and even with data augmentation, the screenshots might still be very different from what the model has seen during training.
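One common way to narrow this kind of mismatch is to normalize each screenshot toward the training distribution before inference. The sketch below is only an illustration, assuming the training images are dark-on-light formulas cropped tightly with a fixed margin. The function name `normalize_screenshot`, the threshold, and the padding value are hypothetical and are not taken from this repository's preprocessing scripts.

```python
def normalize_screenshot(img, ink_threshold=128, pad=8, background=255):
    """Crop a grayscale screenshot to its ink and add a uniform margin.

    `img` is a list of rows of ints in 0..255. All defaults are assumed
    values for illustration, not settings from the original project.
    """
    # A mostly dark image is assumed to be a dark-theme screenshot; invert it
    # so that it becomes dark ink on a light background.
    flat = [p for row in img for p in row]
    if sum(flat) / len(flat) < 128:
        img = [[255 - p for p in row] for row in img]

    # Find the rows and columns that contain at least one ink pixel.
    rows = [y for y, row in enumerate(img)
            if any(p < ink_threshold for p in row)]
    cols = [x for x in range(len(img[0]))
            if any(row[x] < ink_threshold for row in img)]
    if not rows:
        return img  # blank image: nothing to crop

    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    cropped = [row[left:right + 1] for row in img[top:bottom + 1]]

    # Surround the cropped formula with a uniform background margin.
    width = len(cropped[0]) + 2 * pad
    padded = [[background] * width for _ in range(pad)]
    padded += [[background] * pad + row + [background] * pad
               for row in cropped]
    padded += [[background] * width for _ in range(pad)]
    return padded
```

Resizing the result so that the glyph height matches the training renders would be a natural next step, but the right scale depends on how the training images were generated.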

PeihanDou commented 3 years ago

Thank you for your advice!