madmaze / pytesseract

A Python wrapper for Google Tesseract
Apache License 2.0

image_to_data default output type is string #510

Closed by karladler 1 year ago

karladler commented 1 year ago

I'm not sure whether this can be changed without breaking backwards compatibility, but I was really confused to find that the default output type of the image_to_data() function is actually a string. Given the function name, IMHO it should be data.frame.

stefan6419846 commented 1 year ago

This is documented explicitly: https://github.com/madmaze/pytesseract/blob/8ae8f2b80c78354ea86dd94ebb8bcd8072d2857c/pytesseract/pytesseract.py#L526-L529

You can still choose between the different representations. I assume that data.frame is not the default because pandas is an optional dependency, so defaulting to it could fail when pandas is not installed.
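Since the default return value is a tab-separated string, it can be turned into rows with the standard library alone, without pandas. The sketch below assumes the usual TSV column layout; the sample text and the helper name parse_tsv are made up for illustration and are not part of pytesseract.

```python
import csv
import io

# Made-up sample standing in for the string that image_to_data() returns
# by default. The values are illustrative, not real OCR output.
SAMPLE_TSV = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\t"
    "left\ttop\twidth\theight\tconf\ttext\n"
    "5\t1\t1\t1\t1\t1\t10\t12\t40\t15\t96\tHello\n"
    "5\t1\t1\t1\t1\t2\t55\t12\t42\t15\t91\tworld\n"
)


def parse_tsv(tsv_text):
    """Hypothetical helper: parse the TSV string into a list of dicts."""
    reader = csv.DictReader(
        io.StringIO(tsv_text),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # recognized text may contain quote characters
    )
    return list(reader)


rows = parse_tsv(SAMPLE_TSV)
words = [row["text"] for row in rows]
print(words)
```

All values come back as strings here, so numeric columns such as conf or left need an explicit int() or float() conversion before use.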

bozhodimitrov commented 1 year ago

Are you talking about image_to_data? Its output_type parameter has a specific value for that. Please check the docs for the pytesseract.Output class.
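For illustration, a minimal sketch of picking the return type through output_type. The helper name ocr_words and its want_dataframe flag are hypothetical; Output.DICT and Output.DATAFRAME are members of pytesseract.Output. Actually running it requires a Tesseract installation, and pandas for the DATAFRAME case.

```python
def ocr_words(image, want_dataframe=False):
    """Hypothetical helper: run image_to_data with an explicit output type.

    `image` is anything pytesseract accepts, e.g. a PIL image or a file path.
    """
    # Imported lazily so that defining the helper does not require
    # pytesseract to be installed.
    import pytesseract
    from pytesseract import Output

    # Output.DATAFRAME needs pandas; Output.DICT works without it.
    output_type = Output.DATAFRAME if want_dataframe else Output.DICT
    return pytesseract.image_to_data(image, output_type=output_type)
```

With Output.DICT the result is a dictionary keyed by column name, which avoids the optional pandas dependency while still giving structured data instead of a raw string.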