rsommerfeld / trocr

Powerful handwritten text recognition. A simple-to-use, unofficial implementation of the paper "TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models".
MIT License

Accuracy goes to 0.0 frequently #3

Closed. Rysess closed this issue 2 years ago.

Rysess commented 2 years ago

Hi, I have a problem with training the model. The gradient seems to explode frequently, though not on every training run. Here is a graph that shows the problem.

[Image: MicrosoftTeams-image]

I've tried printing the model's predictions at each validation step, but once the gradient explodes the model keeps predicting empty labels. I'm using a portion of the IAM dataset, and my labels are structured this way: file-name.png,¤label¤ I'm using the character '¤' because it does not appear in the dataset, which lets me predict double quotes (I've modified the csv reader to use this character to delimit the label).

I've tried forcing the download of the pretrained weights at the beginning of each training run, without effect. I've also tried increasing the word len, also without any effect. I'm surely missing something but can't see what.
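For readers who want to reproduce the label layout described above (file-name.png,¤label¤), here is a minimal sketch using Python's standard csv module with '¤' as the quote character. It is not the modified reader from this repository; the function name read_labels and the sample file names and labels are made up for illustration.

```python
import csv
import io

# Two example rows in the "file-name.png,¤label¤" layout.
# File names and labels are invented for this illustration.
raw = (
    'a01.png,¤He said "hello", then left.¤\n'
    'a02.png,¤A plain label¤\n'
)


def read_labels(stream):
    """Yield (file_name, label) pairs, treating '¤' as the quote character."""
    reader = csv.reader(stream, delimiter=",", quotechar="¤")
    for file_name, label in reader:
        yield file_name, label


rows = list(read_labels(io.StringIO(raw)))
for file_name, label in rows:
    print(file_name, "->", label)
```

Because '¤' marks the start and end of the label, double quotes and commas inside the label are read as ordinary characters.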

Do you have any idea what could cause the model to behave this way? Thanks

akiradavid27 commented 2 years ago

Hi, how did you generate those graphs?

Rysess commented 2 years ago

Hi, I generated those graphs by parsing the training output. For the gradient norm I followed this topic: https://discuss.pytorch.org/t/check-the-norm-of-gradients/27961/5 and simply added the value to the debug print. I also tried training without it in case it caused instability, but the same problem appears.
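A minimal sketch of how a total gradient norm can be computed in PyTorch after a backward pass, in the spirit of the linked forum topic. The helper name total_grad_norm and the toy nn.Linear model are assumptions for illustration, not code from this repository.

```python
import torch
from torch import nn


def total_grad_norm(model: nn.Module) -> float:
    """Return the L2 norm taken over all parameter gradients of `model`."""
    squared_sum = 0.0
    for param in model.parameters():
        if param.grad is None:  # skip parameters that received no gradient
            continue
        squared_sum += param.grad.detach().norm(2).item() ** 2
    return squared_sum ** 0.5


# Toy model standing in for the real network (hypothetical).
model = nn.Linear(4, 2)
loss = model(torch.ones(3, 4)).sum()
loss.backward()

grad_norm = total_grad_norm(model)
print(f"grad norm: {grad_norm:.4f}")  # value that could be added to a debug print
```

The helper only reads the gradients, so logging it should not change the training itself.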

rsommerfeld commented 2 years ago

Hi bgaro, the pretrained weights are cached locally after downloading them for the first time. The fact that you don't see a downloading bar does not mean the weights are not applied at the beginning of the training. It should load, as long as VisionEncoderDecoderModel.from_pretrained(paths.trocr_repo) in util.py is executed.

Now regarding your training issue:

Let me know if that helps!

Rysess commented 2 years ago

Hi, I managed to solve the issue thanks to your help. To answer your questions:

TL;DR: Reducing the learning rate (LR) to 5e-6 solved the issue.
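As a rough illustration of that fix, the sketch below builds a PyTorch optimizer with the learning rate set to 5e-6 and runs one update step. The choice of AdamW and the toy nn.Linear model are assumptions; the thread does not say which optimizer the repository uses or where its learning rate is configured.

```python
import torch
from torch import nn

LEARNING_RATE = 5e-6  # value reported in this thread as resolving the instability

model = nn.Linear(4, 2)  # toy stand-in for the real model (hypothetical)

# AdamW is an assumption here; use whichever optimizer the training script defines.
optimizer = torch.optim.AdamW(model.parameters(), lr=LEARNING_RATE)

# One illustrative update step.
optimizer.zero_grad()
loss = model(torch.ones(3, 4)).sum()
loss.backward()
optimizer.step()

print(optimizer.param_groups[0]["lr"])
```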