samvera / hydra-works

A ruby gem implementation of the PCDM Works domain model based on the Samvera software stack

Multiple, conflicting values for Exif Version in characterization metadata #337

kefo closed this issue 5 years ago

kefo commented 6 years ago

We have more than 200,000 pcdm:Files in our collection with multiple and unequal exif:exifVersion values.

There are, of course, multiple values in the FITS output:

      <exifVersion toolname="Exiftool" toolversion="10.00" status="CONFLICT">0230</exifVersion>
      <exifVersion toolname="NLNZ Metadata Extractor" toolversion="3.6GA" status="CONFLICT">0320</exifVersion>

(I actually see 3 exifVersion values in our data sometimes and, for the life of me, I cannot figure out whence the third comes, but, again, I digress.)

Given the CONFLICTing values above, I'm not sure what to do about this other than raise it as an issue.
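For anyone wanting to check their own FITS output for this, here is a minimal sketch that lists every value FITS flagged as CONFLICT for a given element, keyed by tool. It uses REXML rather than anything in hydra-works; the `conflicting_values` helper and the `<metadata>`/`<image>` wrapper elements are assumptions for illustration, and real FITS output declares a default namespace that the XPath would have to account for.

```ruby
require 'rexml/document'

# Sample fragment modelled on the snippet above. The wrapper elements are
# assumed, and the FITS namespace is omitted to keep the XPath simple.
FITS_SAMPLE = <<~XML
  <metadata>
    <image>
      <exifVersion toolname="Exiftool" toolversion="10.00" status="CONFLICT">0230</exifVersion>
      <exifVersion toolname="NLNZ Metadata Extractor" toolversion="3.6GA" status="CONFLICT">0320</exifVersion>
    </image>
  </metadata>
XML

# Hypothetical helper: returns { toolname => value } for every element with
# the given name whose status attribute is CONFLICT.
def conflicting_values(xml, element_name)
  doc = REXML::Document.new(xml)
  result = {}
  REXML::XPath.each(doc, "//#{element_name}[@status='CONFLICT']") do |el|
    result[el.attributes['toolname']] = el.text
  end
  result
end

conflicts = conflicting_values(FITS_SAMPLE, 'exifVersion')
conflicts.each { |tool, value| puts "#{tool}: #{value}" }
```

Run against a whole collection, something along these lines would at least show which tools disagree and how often.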

In so far as this results in duplicate, conflicting values in the characterization metadata, it is related to https://github.com/samvera/hydra-works/issues/336.

kefo commented 6 years ago

The most recent version of EXIF is 2.31, which would mean that NLNZ's value in the above snippet is incorrect. It's probably not taking into account endianness, but, again, I digress.
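A quick way to see why endianness is a plausible explanation: ExifVersion is stored as four ASCII bytes ("0230" for Exif 2.3), and reading those bytes as a single 32-bit integer in the wrong byte order reverses them. The sketch below only shows that the two reported values are byte-reversals of each other; it does not confirm what NLNZ Metadata Extractor actually does internally.

```ruby
exiftool_value = '0230'
nlnz_value     = '0320'

# Interpret the four ASCII bytes as a big-endian 32-bit unsigned integer,
# then write that integer back out in little-endian byte order.
as_integer = exiftool_value.unpack1('N')
swapped    = [as_integer].pack('V')

# For a four-byte ASCII string this is the same as reversing it.
reversed = exiftool_value.reverse

puts swapped == nlnz_value
puts reversed == nlnz_value
```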

Assuming that to be the case, is the best solution to rely on ExifTool's value?

Something like:

 t.exif_version(path: 'exifVersion', attributes: { toolname: "Exiftool" })

See:

https://github.com/samvera/hydra-works/blob/v0.17.0/lib/hydra/works/characterization/fits_document.rb#L62
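To illustrate what filtering on `toolname` would select, here is a rough sketch outside of OM, using REXML on a fragment modelled on the FITS snippet from the first comment. The `value_from_tool` helper and the `<image>` wrapper are hypothetical, the FITS namespace is left out, and this is not necessarily how the terminology in `fits_document.rb` resolves the path.

```ruby
require 'rexml/document'

# Fragment modelled on the FITS output quoted earlier; the wrapper element
# is assumed and the FITS namespace is omitted.
SAMPLE = <<~XML
  <image>
    <exifVersion toolname="Exiftool" toolversion="10.00" status="CONFLICT">0230</exifVersion>
    <exifVersion toolname="NLNZ Metadata Extractor" toolversion="3.6GA" status="CONFLICT">0320</exifVersion>
  </image>
XML

# Hypothetical helper: returns the text of the first element with the given
# name reported by the given tool, or nil when that tool reported nothing.
def value_from_tool(xml, element_name, toolname)
  doc = REXML::Document.new(xml)
  el = REXML::XPath.first(doc, "//#{element_name}[@toolname='#{toolname}']")
  el && el.text
end

exif_version = value_from_tool(SAMPLE, 'exifVersion', 'Exiftool')
puts exif_version.inspect
```

One trade-off with this approach is that files for which ExifTool reported no exifVersion would end up with no value at all, so a fallback may be worth considering.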

jrgriffiniii commented 5 years ago

Resolved with https://github.com/samvera/hydra-works/pull/338