Open saanvib13 opened 5 months ago
What about reading documentation?
@zdenop Thank you for your response. I tried each and every step mentioned in this documentation. Even then, some decimal points are being omitted; for example, 22.5 is read as 225. Moreover, some numbers are being wrongly detected; for example, -9 is extracted as = ). Some negative signs are also being omitted. I have tried preprocessing the images and have implemented the following:
Please provide your guidance and help me resolve this issue.
And what did you learn about table recognition? Which forum posts and which other issues about table recognition did you find? You should check these sources BEFORE posting the issue.
This mod seems to do a slightly better job, still not flawless...
Your Feature Request
I have provided the image from which I am trying to extract text using Tesseract OCR.
Along with that, I have also provided the result, i.e. the text extracted from the image.
As can be observed from the images, the extracted text is not very accurate. Negative signs have been omitted, and some undesired characters also appear in the extracted text. (I have marked some of the incorrect results with blue boxes.) I have tried to improve the results by preprocessing the image and by changing the parameters of the model. I have tried:
How can I improve the detection and extraction of text in Tesseract? I have also tried PaddleOCR for the same task. Even then, symbols such as the euro sign and some negative signs are not being detected.
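
For reference, one configuration that is often tried on purely numeric tables is to restrict the recognizer to digits, the decimal point and the minus sign, and to select a page segmentation mode explicitly. The following is only a sketch of that idea, not a confirmed fix for this image. It assumes `pytesseract` and Pillow are installed; the helper names `build_numeric_config` and `ocr_numeric_table` are made up for illustration, and whether `--psm 6` suits this particular table layout is an assumption that needs checking.

```python
def build_numeric_config(psm: int = 6, extra_chars: str = "") -> str:
    """Build a Tesseract config string that limits output to numeric characters.

    psm=6 asks Tesseract to treat the image as a single uniform block of text.
    extra_chars can add symbols such as the euro sign; it must not contain
    spaces, because the config string is split on whitespace.
    """
    allowed = "0123456789.-" + extra_chars
    return f"--psm {psm} -c tessedit_char_whitelist={allowed}"


def ocr_numeric_table(image_path: str, extra_chars: str = "") -> str:
    """Run Tesseract on one image using the numeric-only configuration."""
    # Imported here so the config helper above works without these packages.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(
            img, config=build_numeric_config(extra_chars=extra_chars)
        )
```

A call such as `ocr_numeric_table("table.png", extra_chars="€")` would then return the recognized text, where `table.png` is a placeholder path. A whitelist only prevents spurious characters like `=` or `)`; it cannot recover a decimal point or minus sign that the recognizer missed entirely.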