Open pudo opened 6 years ago
There's the SetImageBytes API method.
Sorry, my ticket was very badly phrased. I saw that method, but the additional arguments it takes would require me to open the image first to inspect its metadata anyway. However, I saw that the SetImage method internally just stuffs the saved image data into a pix, which seems much simpler. Can this be exposed, rather than the full signature of SetImageBytes? Or can I pass stub defaults, such as -1, into SetImageBytes?
So basically you want a method which sets an image directly from a given bytes buffer. PRs are welcome.
That cannot work. Tesseract's image data structure, pix (from Leptonica), needs to know what format the input is in. Either it's a full byte stream in some standard image format (recognizable by its header), which is the route SetImage() / PIL.Image.save() / pixReadMem() takes, or it's a raw byte stream and you pass its format (height, width, depth) explicitly, which is where SetImageBytes() will take you.
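To make the two cases concrete, here is a rough, standard-library-only sketch. The helper names `sniff_image_format` and `raw_layout` are made up for illustration, and the magic-number table is a simplified stand-in, not Leptonica's actual detection logic. It assumes a tightly packed raw buffer with no row padding.

```python
# Illustrative only: a simplified stand-in for header-based format detection.
MAGIC_NUMBERS = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"BM": "bmp",
    b"II*\x00": "tiff",  # little-endian TIFF
    b"MM\x00*": "tiff",  # big-endian TIFF
}


def sniff_image_format(data):
    """Return a format name if the buffer starts with a known header, else None."""
    for magic, name in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return name
    return None


def raw_layout(data, width, height, bytes_per_pixel):
    """Return (bytes_per_pixel, bytes_per_line) for a tightly packed raw buffer.

    Raises ValueError if the buffer length does not match the stated geometry.
    """
    bytes_per_line = width * bytes_per_pixel
    expected = bytes_per_line * height
    if len(data) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(data)}")
    return bytes_per_pixel, bytes_per_line
```

An encoded file identifies itself through its first few bytes, so no extra arguments are needed. A raw pixel buffer carries no such header, which is why its geometry has to be supplied by the caller.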
I recommend closing.
I'm interested too. Could something like this work?
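    def SetImageMem(self, buffer):
        """Set image from buffer for Tesseract to recognize.

        Args:
            buffer (bytes): Image file contents in a format Leptonica can read.

        Raises:
            :exc:`RuntimeError`: If for any reason the api failed
                to load the given image.
        """
        cdef:
            bytes raw = _b(buffer)
            # Take the C pointer and length while the GIL is still held;
            # Python objects must not be touched inside the nogil block.
            const unsigned char *data = raw
            size_t size = len(raw)
        with nogil:
            self._destroy_pix()
            self._pix = pixReadMem(data, size)
            if self._pix == NULL:
                with gil:
                    raise RuntimeError('Error reading image')
            self._baseapi.SetImage(self._pix)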
It seems to me like SetImage() basically just calls Pillow's save method and then loads the resulting byte stream into Tesseract. Is there a way of loading a bytes object directly into Tesseract, without opening it in Pillow first?