naptha / tesseract.js

Pure Javascript OCR for more than 100 Languages 📖🎉🖥
http://tesseract.projectnaptha.com/
Apache License 2.0
35.31k stars 2.23k forks

Broken character not recognizable #955

Closed wazidchoudhary closed 2 months ago

wazidchoudhary commented 2 months ago

Discussed in https://github.com/naptha/tesseract.js/discussions/954

Originally posted by **wazidchoudhary** September 17, 2024

Hey guys, can someone help with extracting text from OCR images that contain broken characters, like these?

![downloaded_image - 2024-09-17T000722 019](https://github.com/user-attachments/assets/5ed03ac5-9c38-435a-9e72-d3706f234c0d) ![downloaded_image - 2024-09-17T003654 506](https://github.com/user-attachments/assets/4ce1f2b4-2ab2-4bc5-8933-193e88df8fda)

wazidchoudhary commented 2 months ago

Please help. I have tried everything, but it reads p as F and misrecognizes many other characters.

Balearica commented 2 months ago

This was already posted as discussion #954, which is the appropriate place for it, as it is neither a bug report nor a feature request. Please do not open duplicate threads.