isawnyu / isaw.web

Isaw website buildout
http://isaw.nyu.edu

uploaded images lack metadata #218

Open paregorios opened 6 years ago

paregorios commented 6 years ago

Many of our images, especially those for exhibitions, carry critical metadata in their headers (e.g., copyright). We insert and manage this standard metadata with custom scripts and exiftool. Any image that is uploaded to the website and served through Plone (and therefore resized or otherwise processed by Plone) loses all of this metadata: it is stripped during processing and resizing. This is unacceptable because of our contracts with photographers and lending institutions.

cguardia commented 5 years ago

I put a couple of patches on staging that copy EXIF data to scaled images. It works but may still need refinement. I have used up all of my time, so I can't make further fixes, but it's a good proof of concept. Please test with a good number of images and report the results here.

paregorios commented 5 years ago

Steps to test:

paregorios commented 5 years ago

Embedding the original image works, but other registered sizes produce a broken image. Here's my test page:

https://isaw.jazkarta.com/members/paregorian/image-metadata-tests/tests-with-miley-jpeg

skleinfeldt commented 5 years ago

OK that is one cute pup.

cguardia commented 5 years ago

Cute pup now should show up in all sizes.

paregorios commented 5 years ago

Sadly I don't see the desired metadata in the derivative images posted on that page. Compare the downloaded "original", which does preserve the metadata (screenshot from Adobe Bridge, just showing some of the metadata fields):

Screen Shot 2019-09-10 at 1 13 12 PM

And corresponding for the "featured" image. The other derivatives are similarly empty.

Screen Shot 2019-09-10 at 1 19 24 PM

cguardia commented 5 years ago

@paregorios ok, now it should work.

paregorios commented 5 years ago

This is improved. For JPEG originals, I am now seeing some of the original header metadata fields in the derivative. For PNG originals, none of the metadata fields are making it through to the derivative.

Here's a list of the metadata fields I'm especially interested in:

If additional/all fields from IPTC Core and Extension come through, that's fine.
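For the PNG case, a quick way to check whether a derivative carries any metadata at all is to list its metadata chunks. The sketch below is only a diagnostic aid and assumes nothing about how the site produces its scales. `png_metadata_chunks` is a hypothetical helper name, and the chunk types it looks for are the standard PNG ones that can hold text, XMP, or Exif data.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Standard PNG chunk types that can carry textual, XMP, or Exif metadata.
METADATA_CHUNKS = {b"tEXt", b"zTXt", b"iTXt", b"eXIf"}


def png_metadata_chunks(data):
    """Return the metadata chunk types found in PNG bytes, in file order."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    found = []
    pos = 8
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        if ctype in METADATA_CHUNKS:
            found.append(ctype.decode("ascii"))
        pos += 12 + length
    return found
```

Running it on the bytes of an original and of one of its scales (for example `png_metadata_chunks(open(path, "rb").read())`) shows whether chunks such as `iTXt` or `eXIf` survive the resize; an empty list for the derivative means nothing was carried over.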

paregorios commented 5 years ago

@cguardia you might care to have a look at one or more of the Python wrappers for exiv2. I've played with py3exiv2 in scripts and found it to be very full-featured. That one is a Python 3-only package, so it won't work for us, but it looks like there are other packages on PyPI that claim Python 2 support.

cguardia commented 5 years ago

@paregorios I tried using exiv2 wrappers for Python 2, but they are not in a very useful state.

cguardia commented 5 years ago

@paregorios this is on staging again, using exiftool bindings. Please test with various images. The handling of non-writable properties is not ideal (it throws an error), so I'm adding them to a list that we can ignore when converting. If you see an error with a warning at the bottom when uploading an image, make a note of the name of the property mentioned and I will add it to our list. I think the list of these is finite and not too long, but next week I can look for a better way of handling this if we get too many errors.
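The ignore-list approach described above can be sketched with the exiftool command line directly. This is an illustration under assumptions, not the code that was on staging: it shells out to `exiftool` rather than using the bindings mentioned, and `build_copy_command`, `copy_metadata`, and the single `SKIP_TAGS` entry are hypothetical names (the real list of problem properties is not recorded in this thread).

```python
import subprocess

# Hypothetical ignore list: names of properties that caused write errors.
SKIP_TAGS = ("ExampleReadOnlyTag",)


def build_copy_command(original, scaled, skip_tags=SKIP_TAGS):
    """Return the exiftool argument list that copies metadata onto a scale."""
    cmd = ["exiftool", "-overwrite_original",
           "-TagsFromFile", original, "-all:all"]
    # "--TAG" after -TagsFromFile excludes that tag from the copy.
    cmd.extend("--" + tag for tag in skip_tags)
    cmd.append(scaled)
    return cmd


def copy_metadata(original, scaled, skip_tags=SKIP_TAGS):
    """Run exiftool; return (succeeded, stderr text) instead of raising."""
    proc = subprocess.Popen(build_copy_command(original, scaled, skip_tags),
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    _, err = proc.communicate()
    return proc.returncode == 0, err.decode("utf-8", "replace")
```

Returning the stderr text rather than raising keeps an upload from failing outright and leaves the offending property name available for logging, so it can be added to the ignore list later.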

alecpm commented 1 year ago

The work here was on staging, but I'm backing it out to get us in sync with master. The work continues to live on branch issues/218-exif if we want to reference it in the future. It seems like we'll want to reimplement using pexif instead of pyexif to avoid a dependency on a Perl command-line tool, which seems to have some issues.

paregorios commented 1 year ago

Any and all tags used in the following groups must be preserved (above and beyond those specifically related to the format of the image being uploaded):

If tags from other groups (e.g., Photoshop) come along, that's fine, but if they're stripped, it's OK. The above 6 groups, however, are critical.
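A group-level requirement like this can be checked mechanically by comparing exiftool's JSON output for an original and a derivative. The following is a minimal sketch, assuming `exiftool` is installed and on the PATH; `read_tags` and `missing_tags` are hypothetical helper names, and the group names passed in the usage example are placeholders for the six critical groups, not the actual list.

```python
import json
import subprocess


def read_tags(path):
    """Return {"Group:Tag": value} for one file, from exiftool's JSON output."""
    out = subprocess.check_output(["exiftool", "-j", "-G", path])
    return json.loads(out.decode("utf-8"))[0]


def missing_tags(original, derivative, groups):
    """List tags from the required groups that the derivative lost or changed."""
    wanted = tuple(group + ":" for group in groups)
    return sorted(
        key for key, value in original.items()
        if key.startswith(wanted) and derivative.get(key) != value
    )
```

For example, `missing_tags(read_tags("original.jpg"), read_tags("featured.jpg"), ["IPTC", "XMP"])` returns an empty list when every tag in those two groups survived unchanged, and otherwise names each tag that was dropped or altered.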

paregorios commented 1 year ago

fwiw it looks like piexif has been more recently updated than pexif. I have not looked for active forks.

paregorios commented 1 year ago

This ticket was originally awarded 8 points in planning poker a few years ago. Original implementation efforts were not successful. Removing points and re-queueing for estimation.