siyuan-note / siyuan

A privacy-first, self-hosted, fully open source personal knowledge management software, written in typescript and golang.
https://b3log.org/siyuan
GNU Affero General Public License v3.0
17.08k stars 1.27k forks

OCR engine: substitute PaddleOCR for Tesseract-OCR #10232

Open pureTrue opened 5 months ago

pureTrue commented 5 months ago

In what scenarios do you need this feature?

After configuring SiYuan's OCR, I found the recognition rate low. I then switched to software that invokes the PaddleOCR API and found that it recognized both English and Chinese more accurately. I hope SiYuan can replace its current OCR engine.

Describe the optimal solution

PaddleOCR has better text recognition capabilities than Tesseract.

Quote:

Recently PaddleOCR updated the v3 version, and the English space problem has been significantly improved. I tried the English model, it works very well.

In document scenarios, PaddleOCR can achieve 95%+ accuracy. But Tesseract may be confused on some rhythmic characters.

In particular, PaddleOCR's performance in some non-Latin languages is beyond my imagination. For example Arabic, the effect is far better than EasyOCR and Tesseract.

Highly recommend PaddleOCR!!!


Paddle OCR is a deep learning-based OCR system created by PaddlePaddle, a Chinese AI firm. Paddle OCR is built on the PaddlePaddle framework, which is well-known for its quick and efficient deep learning algorithms. Paddle OCR supports numerous languages, including Chinese, English, Japanese, and Korean, and can properly detect different text styles and fonts.

Advantages:

High accuracy: Paddle OCR has achieved state-of-the-art performance on various OCR benchmarks, including the ICDAR 2015 and ICDAR 2017 competitions.

Fast and efficient: Paddle OCR is optimized for speed and can process large volumes of images in real-time, making it suitable for applications that require high throughput.

Easy to use: Paddle OCR has a user-friendly interface that allows users to quickly train and deploy OCR models.

Reference:

Describe the candidate solution

pls

Other information

.

Aiviokoo commented 5 months ago

Piggybacking on this thread: I suggest that the Mac version directly call Apple's built-in OCR.

Achuan-2 commented 2 months ago

+1. Not sure whether this could be used:

P.S. For formula recognition, you could use https://github.com/breezedeus/Pix2Text