junhoyeo / BetterOCR

🔍 Better text detection by combining multiple OCR engines (EasyOCR, Tesseract, and Pororo) with 🧠 LLM.
MIT License

OCR Engine Support: Pororo (`kakaobrain/pororo`, potentially used together with EasyOCR) #2

Closed: junhoyeo closed this issue 1 year ago

junhoyeo commented 1 year ago
black7375 commented 1 year ago

To summarize the tweet again (currently, the list of replies on X cannot be viewed without logging in):

  1. EasyOCR is excellent in text detection.
  2. Pororo is superior to EasyOCR in text recognition.
  3. Preprocessing is another way to increase the recognition rate, and how it is applied must vary with the characteristics of the OCR engine.
    • For example, Tesseract versions below 3.05 work better on dark backgrounds, while versions 4.0 and later work better on bright backgrounds.
    • Techniques such as normalization, binarization, and skeletonization may work well on document images, but they are not suitable for photographic images. (Small letters are very likely to clump together and become indistinguishable.)
    • One of the few preprocessing steps that works well with most OCR engines is grayscale conversion.
    • If the ROI (region of interest) is too small, it is better to scale it up.
    • I am convinced that Tesseract's OSD (orientation and script detection), perspective-transform estimation, or dewarping will improve performance. However, I expect it to be difficult to apply these selectively, only where they are actually needed. The easy way is to apply them to each ROI.
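
As a rough illustration of the two engine-agnostic steps above (grayscale conversion and scaling up small ROIs), here is a minimal sketch in plain Python. It is not code from BetterOCR or any OCR engine: the helper names, the `(left, top, right, bottom)` box format, and the `MIN_ROI_HEIGHT = 32` threshold are all assumptions chosen for illustration. In a real pipeline you would more likely use OpenCV's `cv2.cvtColor` and `cv2.resize` on the boxes returned by the detector.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]           # (R, G, B), each 0..255
RGBImage = List[List[Pixel]]           # rows of pixels
GrayImage = List[List[int]]            # rows of 8-bit luma values
Box = Tuple[int, int, int, int]        # hypothetical ROI format: left, top, right, bottom

MIN_ROI_HEIGHT = 32  # assumed threshold for illustration, not a value from the thread


def to_grayscale(image: RGBImage) -> GrayImage:
    """Convert RGB to 8-bit luma using the ITU-R BT.601 weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in image
    ]


def crop(image: GrayImage, box: Box) -> GrayImage:
    """Cut out an ROI; right and bottom are exclusive."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def upscale_if_small(roi: GrayImage, min_height: int = MIN_ROI_HEIGHT) -> GrayImage:
    """Nearest-neighbour upscale by an integer factor when the ROI is too short."""
    height = len(roi)
    if height == 0 or height >= min_height:
        return roi
    factor = -(-min_height // height)  # ceiling division
    scaled: GrayImage = []
    for row in roi:
        # Repeat every pixel horizontally, then repeat the widened row vertically.
        wide = [value for value in row for _ in range(factor)]
        scaled.extend(list(wide) for _ in range(factor))
    return scaled


def preprocess_rois(image: RGBImage, boxes: List[Box]) -> List[GrayImage]:
    """Grayscale the whole image once, then crop and upscale each detected ROI."""
    gray = to_grayscale(image)
    return [upscale_if_small(crop(gray, box)) for box in boxes]
```

The per-ROI structure of `preprocess_rois` is also where orientation correction or dewarping could be slotted in, following the "apply to each ROI" suggestion. Nearest-neighbour scaling is used here only to keep the sketch dependency-free; an interpolating resize would probably suit OCR better.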