james-see / iptcinfo3

iptcinfo working for Python 3, finally. Install with: pip3 install iptcinfo3

Support for reading image from buffer? #13

Closed ganego closed 10 months ago

ganego commented 4 years ago

Hi, sometimes I need IPTC info not from a file on disk but from a file inside an archive. For this reason I read the image from the archive into memory.
I'm using piexif to get the EXIF data from the image, and it works with a bytestring/buffer. Are there any plans to implement this feature in iptcinfo3?

Thank you

nealmcb commented 3 years ago

A workaround would seem to be to use BytesIO, but that fails in a different way for me when I try to save it:

import iptcinfo3
iptcinfo3.__version__
'2.1.4'

from io import BytesIO
# My goal is to get the actual photo data out of memory, from a `mailbox` file.
# Here I just fake that by reading it from a file to set things up.
photo = BytesIO(open("photo.jpg", "rb").read())

info = iptcinfo3.IPTCInfo(photo)
info['caption/abstract'] = "test caption"
info.save_as('/tmp/photo-with-tags.jpg')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-69-41d57ed71825> in <module>
----> 1 info.save_as('/tmp/example-iptc-new.jpg')

~/Envs/jupyter/lib/python3.8/site-packages/iptcinfo3.py in save_as(self, newfile, options)
    630         """Saves Jpeg with IPTC data to a given file name."""
    631         with smart_open(self._fobj, 'rb') as fh:
--> 632             if not file_is_jpeg(fh):
    633                 logger.error('Source file %s is not a Jpeg.' % self._fob)
    634                 return None

~/Envs/jupyter/lib/python3.8/site-packages/iptcinfo3.py in file_is_jpeg(fh)
    143     Will reset the file position back to 0 after it's done in either case.
    144     """
--> 145     fh.seek(0)
    146     if debugMode:  # pragma: no cover
    147         logger.info("Opening 16 bytes of file: %r", hex_dump(fh.read(16)))

ValueError: I/O operation on closed file.

Having already read in the photo from the file, I'm surprised it is going back to try to read it again.
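One possible explanation, sketched below as an assumption and not as the library's actual code: the traceback shows `save_as` re-entering `with smart_open(self._fobj, 'rb')`, so if that helper closes whatever handle it yields, the first pass (in the constructor) would close the caller's BytesIO and the second pass would fail on `seek`. The helper `closing_open` here is a hypothetical stand-in written only to reproduce the pattern.

```python
from contextlib import contextmanager
from io import BytesIO


@contextmanager
def closing_open(source, mode="rb"):
    """Hypothetical stand-in: yield a handle for a path or a file-like
    object, then close it on exit (even if the caller supplied it)."""
    fh = source if hasattr(source, "read") else open(source, mode)
    try:
        yield fh
    finally:
        fh.close()  # this also closes a caller-supplied BytesIO


def second_pass_error(buffer):
    """Use the same buffer in two separate `with` blocks.

    Returns the error message raised during the second pass,
    or None if the second pass succeeds.
    """
    with closing_open(buffer) as fh:
        fh.read(2)  # first pass, e.g. parsing done at construction time
    try:
        with closing_open(buffer) as fh:
            fh.seek(0)  # second pass, e.g. a later save step
    except ValueError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(second_pass_error(BytesIO(b"\xff\xd8\xff\xe0")))
```

With a path on disk the same pattern is harmless, because each pass opens a fresh handle; only a caller-supplied in-memory buffer stays closed.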

TomAnthony commented 3 years ago

I am having the same issue as @nealmcb. Did you happen to find any workaround?

nealmcb commented 3 years ago

@TomAnthony I ended up saving the buffer to a file so I could run this stuff. I guess that, for efficiency, it might be useful to write it to a memory-backed file system such as /dev/shm.
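A minimal sketch of that workaround, assuming the photo is already in memory as bytes. The helper name `bytes_as_temp_file` is made up for illustration, and /dev/shm is only used when it exists and is writable (typically on Linux); otherwise the default temp directory is used.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def bytes_as_temp_file(data, suffix=".jpg"):
    """Write `data` to a temporary file, yield its path, then delete it."""
    # Prefer /dev/shm (usually a RAM-backed tmpfs on Linux) when usable;
    # dir=None makes tempfile fall back to the default temp directory.
    shm = "/dev/shm"
    tmp_dir = shm if os.path.isdir(shm) and os.access(shm, os.W_OK) else None
    fd, path = tempfile.mkstemp(suffix=suffix, dir=tmp_dir)
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        yield path
    finally:
        os.remove(path)


# Usage with the calls from the example earlier in this thread
# (requires iptcinfo3 to be installed, so it is left commented out):
#
# with bytes_as_temp_file(photo_bytes) as path:
#     info = iptcinfo3.IPTCInfo(path)
#     info['caption/abstract'] = "test caption"
#     info.save_as('/tmp/photo-with-tags.jpg')
```

The temporary file is removed when the `with` block exits, so any save to a new file has to happen inside the block.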

TomAnthony commented 3 years ago

Thanks @nealmcb. I did a similar workaround for now too. I think addressing this would be useful to those that follow (and for efficiency).

james-see commented 2 years ago

@TomAnthony @nealmcb any suggestions on updating this? Finally looking back at this.

nealmcb commented 2 years ago

Thanks, James! I don't know enough about the internals, but my first question from above remains: why does it (seem to be) seeking the file after closing it?

TomAnthony commented 2 years ago

Yeah, I am in the same boat as @nealmcb. I don't know why it would seek the file after closing it. For me performance wasn't a priority, so I just worked around it as per @nealmcb's suggestion above.

james-see commented 2 years ago

@TomAnthony @nealmcb it looks like there is already a function that reads raw data. Adding the ability to read from a BytesIO object, as a built-in method of the IPTCInfo class, should not be too difficult. Famous last words lol.
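For discussion, here is a rough sketch of one way buffer support could look. It is not the library's code, and `open_without_closing_buffers` is a hypothetical name: the idea is to close only handles the helper opened itself and to leave caller-supplied buffers open, rewound to the start.

```python
from contextlib import contextmanager
from io import BytesIO


@contextmanager
def open_without_closing_buffers(source, mode="rb"):
    """Yield a readable handle for a path or a file-like object.

    Only handles opened here are closed on exit; a buffer passed in by
    the caller is rewound before use and left open afterwards.
    """
    owns_handle = not hasattr(source, "read")
    fh = open(source, mode) if owns_handle else source
    try:
        if not owns_handle:
            fh.seek(0)  # start from the beginning on every pass
        yield fh
    finally:
        if owns_handle:
            fh.close()


if __name__ == "__main__":
    buf = BytesIO(b"\xff\xd8\xff\xe0")
    for _ in range(2):  # two passes, like construct-then-save
        with open_without_closing_buffers(buf) as fh:
            header = fh.read(2)
    print(header, buf.closed)
```

The trade-off is that the caller stays responsible for closing their own buffer, which matches how most libraries treat file-like arguments.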

james-see commented 2 years ago

@nealmcb I believe it failed with that error because you created the photo object from BytesIO beforehand, when a file object is expected there. The error message is misleading. It should say something like "not a file object to process". That is a guess, since I have not dug deep into the functions.