thomasw-mitutoyo-ctl closed this issue 1 year ago
Looking at the examples inside the README, you can find the following section:
# In order to bypass the image conversions of pytesseract, just use relative or absolute image path
# NOTE: In this case you should provide tesseract supported images or tesseract will return error
print(pytesseract.image_to_string('test.png'))
So there is no need to actually open the image first, which is backed by the corresponding handler method: https://github.com/madmaze/pytesseract/blob/96f73a0c10185fea5f49e11cd1dc644b22770692/pytesseract/pytesseract.py#L194-L196
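To make the difference concrete, here is a minimal sketch of that kind of dispatch. It is a simplified illustration, not pytesseract's actual code: `resolve_input` and `FakeImage` are hypothetical names, and the image-like object is only assumed to expose a PIL-style `save(path)` method.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def resolve_input(image):
    """Yield (path, is_temporary) for a path or an image-like object.

    Hypothetical helper for illustration; not part of pytesseract's API.
    """
    if isinstance(image, (str, os.PathLike)):
        # A path is passed through untouched: no decoding, no temporary copy.
        yield os.path.realpath(os.fspath(image)), False
        return

    # Anything else is assumed to expose a PIL-style save(path) method,
    # so it has to be written to disk before an external tool can read it.
    fd, tmp_path = tempfile.mkstemp(prefix="tess_", suffix=".png")
    os.close(fd)
    try:
        image.save(tmp_path)
        yield tmp_path, True
    finally:
        # The temporary copy is removed once the caller is done with it.
        os.remove(tmp_path)


class FakeImage:
    """Stand-in for PIL.Image.Image so the sketch runs without Pillow."""

    def save(self, path):
        with open(path, "wb") as fh:
            fh.write(b"placeholder bytes, not a real PNG")


with resolve_input("test.png") as (path, is_temp):
    print(path, is_temp)  # resolved path of the original file, no temp copy

with resolve_input(FakeImage()) as (path, is_temp):
    print(path, is_temp)  # a temp file that exists only inside this block
```

With the real library, the two cases correspond to `pytesseract.image_to_string('test.png')` and `pytesseract.image_to_string(Image.open('test.png'))`. Only the second needs the image to be decoded in Python and written back out.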
Thank you.
If I read the process activity correctly, I first need to load the image (Image.open()). PyTesseract then saves the image to a temporary file, and tesseract.exe opens that temporary file and performs the OCR. Isn't this a huge waste of performance? Why not load the image directly and process it? Sure, I understand that we want a function that takes an image, e.g. if it was generated and not loaded from disk. But look at all the examples in the README.
Suggestion: provide a method that can work on the file directly, without loading it in Python and saving it to a temporary file.