junhoyeo / BetterOCR

🔍 Better text detection by combining multiple OCR engines (EasyOCR, Tesseract, and Pororo) with 🧠 LLM.
MIT License
465 stars 25 forks

OCR Engine Support: Pororo (`kakaobrain/pororo`, potentially used together with EasyOCR) #2

Closed junhoyeo closed 10 months ago

junhoyeo commented 10 months ago
black7375 commented 10 months ago

To summarize the tweet again (X currently doesn't show the list of replies unless you are logged in):

  1. EasyOCR is excellent in text detection.
  2. Pororo is superior to EasyOCR in text recognition.
  3. Preprocessing is another way to increase the recognition rate, and how it is applied must vary with the characteristics of the OCR engine.
    • For example, Tesseract v3.05 and earlier work better with dark backgrounds, whereas v4.0 and later work better with bright backgrounds.
    • Techniques such as normalization, binarization, and skeletonization may work well on document images, but they are not suitable for photographic images. (The strokes of small letters are very likely to clump together and become indistinguishable.)
    • One of the few preprocessing steps that works well with most OCR engines is grayscale conversion.
    • If the ROI (region of interest) is too small, it is better to scale it up.
    • I am convinced that Tesseract's OSD (orientation and script detection), perspective transform estimation, or dewarping will improve performance. However, detecting when each one is appropriate and applying it only in those cases is expected to be difficult. The easy way is to apply them to each ROI.
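
The points above could be sketched roughly as follows: detect boxes with one engine, then grayscale, upscale, and (for engines that prefer bright backgrounds) invert each ROI before handing it to a second engine for recognition. This is only a minimal sketch using NumPy. The function names, the `(x0, y0, x1, y1)` box format, the `min_height=32` and median-below-128 thresholds, and the `detect`/`recognize` callables are all assumptions made for illustration; they are not the real EasyOCR, Pororo, or BetterOCR APIs.

```python
from typing import Callable, List, Tuple

import numpy as np

# Hypothetical box format: (x0, y0, x1, y1) in pixel coordinates.
Box = Tuple[int, int, int, int]


def to_grayscale(image: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 RGB uint8 image to H x W grayscale (BT.601 weights)."""
    if image.ndim == 2:
        return image  # already grayscale
    weights = np.array([0.299, 0.587, 0.114])
    return np.rint(image[..., :3] @ weights).astype(np.uint8)


def upscale_if_small(gray: np.ndarray, min_height: int = 32) -> np.ndarray:
    """Enlarge a small ROI by an integer factor (nearest neighbour).

    min_height is an arbitrary illustrative threshold, not a tuned value.
    """
    height = gray.shape[0]
    if height == 0 or height >= min_height:
        return gray
    factor = -(-min_height // height)  # ceiling division
    return np.repeat(np.repeat(gray, factor, axis=0), factor, axis=1)


def invert_if_dark(gray: np.ndarray) -> np.ndarray:
    """Invert an ROI whose median pixel is dark (assumed to mean a dark background)."""
    if np.median(gray) < 128:
        return 255 - gray
    return gray


def ocr_with_split_engines(
    image: np.ndarray,
    detect: Callable[[np.ndarray], List[Box]],
    recognize: Callable[[np.ndarray], str],
    prefers_bright_background: bool = True,
) -> List[Tuple[Box, str]]:
    """Run detection with one engine and recognition with another, per ROI."""
    results = []
    for x0, y0, x1, y1 in detect(image):
        roi = to_grayscale(image[y0:y1, x0:x1])
        roi = upscale_if_small(roi)
        if prefers_bright_background:
            roi = invert_if_dark(roi)
        results.append(((x0, y0, x1, y1), recognize(roi)))
    return results
```

In practice, `detect` would wrap whichever engine detects text best and `recognize` whichever recognizes it best, and `prefers_bright_background` would be set per recognition engine. Binarization and skeletonization are deliberately left out, since the comment above notes they tend to hurt photographic images.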