How to improve OCR accuracy for Japanese characters?

clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

MIT License

During some experiments, I noticed that Japanese characters are sometimes misrecognized. These are not necessarily complex characters: simple, commonplace ones are affected too, e.g. 津 gets recognized as 活.
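
To put a number on this kind of error, a character-level error rate and a tally of confused character pairs can help. The following is only a minimal sketch in plain Python and is not part of Donut; the function names (`edit_distance`, `char_error_rate`, `substitution_pairs`) and the sample strings are hypothetical, chosen for illustration.

```python
from collections import Counter


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,              # delete a reference character
                curr[j - 1] + 1,          # insert a predicted character
                prev[j - 1] + (r != h),   # substitute (cost 0 if they match)
            ))
        prev = curr
    return prev[-1]


def char_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalized by the reference length."""
    if not ref:
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


def substitution_pairs(ref: str, hyp: str) -> Counter:
    """Count (expected, predicted) mismatches position by position.

    Only meaningful when both strings have the same length, so anything
    else returns an empty counter instead of a misleading alignment.
    """
    if len(ref) != len(hyp):
        return Counter()
    return Counter((r, h) for r, h in zip(ref, hyp) if r != h)


# Hypothetical sample: ground truth vs. model output.
reference = "津市役所"
predicted = "活市役所"
cer = char_error_rate(reference, predicted)
confusions = substitution_pairs(reference, predicted)
```

Summing `confusions` over a validation set would show whether a handful of visually similar pairs dominates the errors, which may help decide whether more training data for those characters is worth trying.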

My understanding is that this is related to the pre-training task, but I'm not sure how to solve it. Two ideas:

Is it possible to additionally pre-train the "donut-base" model for improved OCR accuracy?
Is it possible to swap the Swin Transformer for a different encoder that provides better results?

If anyone has any ideas, hints, or suggestions related to OCR accuracy, they would be very much appreciated!

How to improve OCR accuracy for Japanese characters? #305