clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
https://arxiv.org/abs/2111.15664
MIT License
5.52k stars, 443 forks

How to improve OCR accuracy for Japanese characters? #305

Open kirby707 opened 1 month ago

kirby707 commented 1 month ago

During some experiments, I noticed that Japanese characters are sometimes not recognized correctly. These are not necessarily very complex characters; even simple, commonplace ones are affected, e.g. 津 gets recognized as 活.

My understanding is that this is related to the pre-training task, but I'm not sure how to solve the issue. A couple of ideas:

  1. Is it possible to additionally pre-train the "donut-base" model for improved OCR accuracy?
  2. Is it possible to swap the Swin Transformer for something else that provides better results?

If anyone has any ideas/hints/suggestions related to OCR accuracy, it will be very much appreciated!

bliujunyuan commented 1 day ago

Maybe the Japanese data distribution seen by the pre-trained model was affected by the datasets for other languages. Additional training may work.
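
Before and after any additional training, it may help to measure which characters actually get confused, so you can tell whether 津 → 活 is a one-off or a systematic pattern. Below is a minimal sketch in plain Python (standard library only, not part of Donut). It assumes you already have pairs of ground-truth and predicted strings. The `confusion_pairs` helper and the sample strings are hypothetical, made up for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher


def confusion_pairs(pairs):
    """Count (expected_char, predicted_char) substitutions.

    `pairs` is an iterable of (ground_truth, prediction) strings.
    Only one-to-one substitutions are counted; insertions, deletions,
    and unequal-length replacements are skipped to keep the sketch simple.
    """
    counts = Counter()
    for truth, pred in pairs:
        matcher = SequenceMatcher(None, truth, pred, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "replace" and (i2 - i1) == (j2 - j1):
                counts.update(zip(truth[i1:i2], pred[j1:j2]))
    return counts


# Hypothetical samples: (ground truth, model output).
samples = [
    ("大津市役所", "大活市役所"),
    ("津田沼駅", "活田沼駅"),
    ("東京都", "東京都"),
]

confusions = confusion_pairs(samples)
for (expected, predicted), n in confusions.most_common(10):
    print(f"{expected} -> {predicted}: {n}")
```

If a few pairs dominate the counts, that would suggest adding more training samples containing those characters rather than changing the architecture.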