microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI
MIT License
19.08k stars 2.43k forks

TrOCR small usage for license plate ocr #1550

Open SwEngine opened 1 month ago

SwEngine commented 1 month ago

Hi, I want to train trocr-small-printed for license plate OCR for my school work. However, when I use the TrOCR model from Hugging Face, the decoded outputs are garbage English words that are not meaningful for license plates. How should I define VisionEncoderDecoderModel and TrOCRProcessor? How should I change the vocabulary, etc.? And what type of encoder-decoder should I use? Thanks in advance!
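One common pattern, used in the Hugging Face fine-tuning tutorials, is to load the processor and the model from the same checkpoint and then copy the tokenizer's special-token ids into the model config. The sketch below is only an assumption about a reasonable setup, not a confirmed fix for this issue. The helper names `configure_special_tokens` and `load_trocr` are hypothetical, and `load_trocr` needs the `transformers` package plus network access to download the checkpoint.

```python
def configure_special_tokens(model_config, tokenizer):
    """Copy the tokenizer's special-token ids into the encoder-decoder config."""
    # Generation starts from the CLS token and stops at the SEP token.
    model_config.decoder_start_token_id = tokenizer.cls_token_id
    model_config.pad_token_id = tokenizer.pad_token_id
    model_config.eos_token_id = tokenizer.sep_token_id
    # Keep the top-level vocabulary size in sync with the decoder's.
    model_config.vocab_size = model_config.decoder.vocab_size
    return model_config


def load_trocr(checkpoint="microsoft/trocr-small-printed"):
    """Load processor and model from the SAME checkpoint (hypothetical helper)."""
    # Imported lazily so configure_special_tokens works without transformers.
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)
    configure_special_tokens(model.config, processor.tokenizer)
    return processor, model
```

Calling `load_trocr("microsoft/trocr-base-printed")` would apply the same steps to the base checkpoint, which keeps the processor and the model paired.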

rohit5895 commented 1 month ago

For license plate OCR, I would recommend using base-printed and Colab for training.

https://github.com/NielsRogge/Transformers-Tutorials/blob/master/TrOCR/Fine_tune_TrOCR_on_IAM_Handwriting_Database_using_native_PyTorch.ipynb

SwEngine commented 1 month ago

I am already using NielsRogge's notebook; however, the outputs are not license plate characters. The decoded outputs are English words and do not match the plate characters. In addition, I am also using the base-printed pretrained model. The vocabulary or something is wrong.
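Since the suspicion is the vocabulary, one quick check is to encode a plate label and decode it again with the tokenizer that builds the training labels. This is a diagnostic sketch, not a confirmed explanation of the problem. The function names are hypothetical, and `tokenizer` is assumed to be `processor.tokenizer` or any object with the same call and `decode` interface.

```python
def roundtrip_label(tokenizer, label):
    """Encode a label and decode it again with the same tokenizer."""
    ids = tokenizer(label).input_ids
    decoded = tokenizer.decode(ids, skip_special_tokens=True)
    return ids, decoded


def label_survives_roundtrip(tokenizer, label):
    """Return True if decoding the encoded label gives back the same text."""
    _, decoded = roundtrip_label(tokenizer, label)
    return decoded.strip() == label.strip()
```

If the round trip returns the plate text unchanged, the tokenizer can represent the labels, and the mismatch is more likely somewhere else, for example ids produced with one checkpoint's tokenizer being decoded with another's.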

rohit5895 commented 1 month ago

Can you share your code?

SwEngine commented 3 weeks ago

When I use trocr-small-printed as both the processor and the model, the outputs are sequences of characters, as expected. However, when I use trocr-base-printed as both the processor and the model, the outputs are not sequences of characters; they come out as sequences of words. What could be the problem? The code is the same as NielsRogge's code. @rohit5895 @NielsRogge

Example:

Using trocr-small-printed:
_Label: 331203_ASD
Predict: 331203ASD

Using trocr-base-printed:
_Label: 331203ASD
Predict: memory Strengthig French previousinterest build

Note: I am printing "pred_str[0]" in the "compute_cer" function.
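For reference, the character error rate that such a function reports can be sketched in plain Python as the edit distance between label and prediction divided by the label length. This is a stand-in written for illustration, not the metric implementation the notebook uses; the names `edit_distance` and `character_error_rate` are hypothetical.

```python
def edit_distance(reference, prediction):
    """Levenshtein distance: minimum inserts, deletes, and substitutions."""
    previous = list(range(len(prediction) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, pred_char in enumerate(prediction, start=1):
            cost = 0 if ref_char == pred_char else 1
            current.append(min(
                previous[j] + 1,         # drop a reference character
                current[j - 1] + 1,      # add a predicted character
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, prediction):
    """Edit distance normalised by the reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, prediction) / len(reference)
```

Printing the label next to the prediction and its CER for a few samples makes it easier to see whether the errors are single characters or whole unrelated words.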

SwEngine commented 1 week ago

Any idea?