sirfz / tesserocr

A Python wrapper for the tesseract-ocr API
MIT License
2.02k stars, 254 forks

How to detect bold or italic text. #236

Closed Kunal2341 closed 4 years ago

Kunal2341 commented 4 years ago

I have a document full of text. Is there any way to detect bold or other special kinds of text, or perhaps the color or size of the text? Could someone help me with how to do this? Anything will help. Thanks.

sirfz commented 4 years ago

You're better off asking on StackOverflow, as this is a Tesseract API-related question. Try reading Tesseract's documentation for a method that extracts word attributes (such as bold, color, etc.); you should be able to call it via tesserocr (if not, feel free to open an issue).
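For illustration, here is a minimal sketch of what such a call might look like, assuming the result iterator's `WordFontAttributes()` method is the one you need. It returns a dict with keys such as `bold`, `italic`, `pointsize` and `font_name`, or `None` when nothing is found. The helper names `collect_word_styles` and `styles_from_image` are made up for this example. Two further assumptions: your tesserocr version accepts the `oem` argument, and legacy-engine traineddata is installed, since the LSTM engine generally does not fill in font attributes. Text color does not appear among these attributes.

```python
def collect_word_styles(result_iterator, word_level):
    """Walk a tesserocr result iterator word by word and collect font attributes.

    `word_level` is expected to be tesserocr.RIL.WORD; it is passed in so this
    helper does not need tesserocr imported.
    """
    styles = []
    if result_iterator is None:
        return styles
    while True:
        try:
            text = result_iterator.GetUTF8Text(word_level)
        except RuntimeError:
            # tesserocr raises RuntimeError when no text is available here.
            text = ""
        attrs = result_iterator.WordFontAttributes()
        if text and attrs:
            styles.append({
                "text": text,
                "bold": attrs.get("bold"),
                "italic": attrs.get("italic"),
                "pointsize": attrs.get("pointsize"),
                "font_name": attrs.get("font_name"),
            })
        # Next() returns False once there are no more words.
        if not result_iterator.Next(word_level):
            break
    return styles


def styles_from_image(path, lang="eng"):
    """Hypothetical convenience wrapper: run OCR on an image file, return word styles."""
    # Imported here so collect_word_styles can be used without tesserocr installed.
    from tesserocr import PyTessBaseAPI, RIL, OEM

    # Assumption: the legacy engine is requested because font attributes are
    # usually empty with the LSTM engine; this needs legacy traineddata files.
    with PyTessBaseAPI(lang=lang, oem=OEM.TESSERACT_ONLY) as api:
        api.SetImageFile(path)
        api.Recognize()
        return collect_word_styles(api.GetIterator(), RIL.WORD)
```

Calling `styles_from_image("page.png")` (the file name is a placeholder) should give one dict per recognized word, which you can then filter, for example on `word["bold"]`. Treat the bold and italic flags as a hint rather than ground truth, since their accuracy depends on the engine and the trained data.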