JaidedAI / EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
https://www.jaided.ai
Apache License 2.0

Detecting individual words #497

Closed by shahurvi84 3 years ago

shahurvi84 commented 3 years ago

Hi, thanks for the code. I followed all the instructions and was able to run it. However, I wanted to check whether we can extract individual words instead of whole phrases. For example, in the example that is provided, the text "Reduce your risk of coronavirus infection" was extracted as a single string, but can we get something like "Reduce", "your", "risk", "of", "coronavirus", "infection"?

Thanks in advance

amarv3142 commented 3 years ago

@shahurvi84 You can try adjusting the width_ths parameter of the readtext function, which decides whether to merge two adjacent bounding boxes based on the distance between them. Setting width_ths=0 might get you the desired result in your case. In case you are using paragraph=True, you need to adjust x_ths.

P.S. I haven't tried it myself, so I'm not 100% sure the above solution works.
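
A minimal sketch of the suggestion above, in case it helps. It assumes easyocr is installed and English text; the helper names merge_kwargs and read_with_threshold and the file name in the comment are made up for illustration. Whether a threshold of 0 actually yields one box per word depends on the image, so treat it as a starting point to tune.

```python
from typing import Any, Dict, List


def merge_kwargs(paragraph: bool = False, threshold: float = 0.0) -> Dict[str, Any]:
    """Pick the merge-threshold keyword that applies to the chosen mode.

    Hypothetical helper, not part of EasyOCR.
    """
    if paragraph:
        # With paragraph=True, x_ths is the threshold suggested in the thread.
        return {"paragraph": True, "x_ths": threshold}
    # Otherwise width_ths controls horizontal merging of adjacent boxes.
    return {"width_ths": threshold}


def read_with_threshold(image_path: str, paragraph: bool = False,
                        threshold: float = 0.0) -> List[Any]:
    """Run EasyOCR on image_path with a lowered merge threshold."""
    import easyocr  # imported lazily; requires `pip install easyocr`

    reader = easyocr.Reader(["en"])  # the language list is an assumption
    return reader.readtext(image_path, **merge_kwargs(paragraph, threshold))


# Example call (hypothetical file name):
# results = read_with_threshold("example.png")
# for bbox, text, confidence in results:
#     print(text, confidence)
```

If the boxes still come back merged, trying a few small values (0, 0.05, 0.1) and comparing the output is probably the quickest way to see what the detector does on your images.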

kimlia545 commented 3 years ago

@shahurvi84 Check the documentation: https://www.jaided.ai/easyocr/documentation/

width_ths (float, default = 0.5) - Maximum horizontal distance to merge boxes. Example: result = reader.readtext(image, width_ths=0.1), where image is the input image.
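
If lowering width_ths still returns multi-word boxes, one possible fallback is to split each result on whitespace afterwards. The sketch below is not an EasyOCR feature: split_detection is a hypothetical helper, and it assumes each result is a (bbox, text, confidence) tuple with the four corners ordered top-left, top-right, bottom-right, bottom-left. The per-word boxes are only estimated from character counts, so they are approximate.

```python
from typing import List, Tuple

Point = List[float]
Detection = Tuple[List[Point], str, float]


def _lerp(p: Point, q: Point, t: float) -> Point:
    """Point at fraction t along the segment from p to q."""
    return [p[0] + (q[0] - p[0]) * t, p[1] + (q[1] - p[1]) * t]


def split_detection(detection: Detection) -> List[Detection]:
    """Split one detection into per-word detections with estimated boxes."""
    bbox, text, conf = detection
    words = text.split()
    if len(words) <= 1:
        return [(bbox, text, conf)]

    tl, tr, br, bl = bbox
    total = len(" ".join(words))  # characters including single spaces
    out: List[Detection] = []
    offset = 0
    for word in words:
        start = offset / total
        end = (offset + len(word)) / total
        # Interpolate along the top and bottom edges of the original box.
        word_box = [_lerp(tl, tr, start), _lerp(tl, tr, end),
                    _lerp(bl, br, end), _lerp(bl, br, start)]
        out.append((word_box, word, conf))  # confidence is simply reused
        offset += len(word) + 1  # skip the word and the following space
    return out


# Example with a made-up detection:
# split_detection(([[0, 0], [100, 0], [100, 10], [0, 10]], "ab cd", 0.9))
```

This keeps the recognized text as it is and only guesses where each word sits, which may be enough if you mainly need the word list rather than precise boxes.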