photostructure / exiftool-vendored.js

Fast, cross-platform Node.js access to ExifTool
https://photostructure.github.io/exiftool-vendored.js/
MIT License

Read +00:00 offset from XMP #216

Closed. C-Otto closed this issue 3 days ago.

C-Otto commented 3 days ago

Describe the bug: When reading a photo's metadata from an XMP file with inferTimezoneFromDatestamps: true, a timezone offset of +00:00 is not returned in any way.

Example for offset +01:00:

<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>
<x:xmpmeta xmlns:x='adobe:ns:meta/' x:xmptk='Image::ExifTool 12.96'>
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>

 <rdf:Description rdf:about=''
  xmlns:exif='http://ns.adobe.com/exif/1.0/'>
  <exif:DateTimeOriginal>2023-01-14T15:56:00.000+01:00</exif:DateTimeOriginal>
 </rdf:Description>
</rdf:RDF>
</x:xmpmeta>
<?xpacket end='w'?>

For this, exiftool.read(path) returns an object that contains "tz": "UTC+1", as expected and desired (with "tzSource": "DateTimeOriginal"). The same behavior exists for other (valid) timezone offsets, aside from +00:00:

<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>
<x:xmpmeta xmlns:x='adobe:ns:meta/' x:xmptk='Image::ExifTool 12.96'>
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>

 <rdf:Description rdf:about=''
  xmlns:exif='http://ns.adobe.com/exif/1.0/'>
  <exif:DateTimeOriginal>2023-01-14T15:57:00.000+00:00</exif:DateTimeOriginal>
 </rdf:Description>
</rdf:RDF>
</x:xmpmeta>
<?xpacket end='w'?>

The resulting object does NOT contain tz (nor tzSource). It contains (a bit abridged):

"DateTimeOriginal": {
    "_ctor": "ExifDateTime",
    "tzoffsetMinutes": 0,
    "rawValue": "2023:01:14 15:57:00.000+00:00",
    "zoneName": "UTC",
    "inferredZone": false
}
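Note that the offset is still present on the field itself, so a caller can recover it as a stopgap. A minimal sketch, assuming only the field shape shown in the abridged output above; `DateTimeLike`, `TagsLike` and `explicitOffsetMinutes` are hypothetical names for illustration, not part of the library:

```typescript
// Hypothetical, pared-down shapes modelled on the abridged output above;
// they are not the library's own type definitions.
interface DateTimeLike {
  tzoffsetMinutes?: number;
  inferredZone?: boolean;
}

interface TagsLike {
  tz?: string;
  tzSource?: string;
  DateTimeOriginal?: DateTimeLike | string;
}

// Returns the offset (in minutes) stated explicitly on DateTimeOriginal,
// or undefined if the field is missing, unparsed, or has an inferred zone.
function explicitOffsetMinutes(tags: TagsLike): number | undefined {
  const dt = tags.DateTimeOriginal;
  if (dt === undefined || typeof dt === "string") return undefined;
  if (dt.inferredZone === true) return undefined;
  return dt.tzoffsetMinutes;
}

// The +00:00 case from this report: no tz, but the field still carries 0.
const reportedTags: TagsLike = {
  DateTimeOriginal: { tzoffsetMinutes: 0, inferredZone: false },
};
const recoveredOffset = explicitOffsetMinutes(reportedTags);
console.log(
  recoveredOffset === undefined
    ? "no explicit offset"
    : `explicit offset: ${recoveredOffset} min`,
);
```

This only reads what the parsed field already exposes; it cannot tell a genuine +00:00 from a spurious one.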

To Reproduce: Read the second XMP file above using exiftool.read(path). Example PR with test: https://github.com/photostructure/exiftool-vendored.js/pull/217

Expected behavior: Some valid tz property in the resulting object.

mceachen commented 3 days ago

OK, thanks for adding that test! If you look at the code that's running:

https://github.com/photostructure/exiftool-vendored.js/blob/669a4112f8de992c9d4509784c3a51110a292303/src/Timezones.ts#L431

You'll see my comment from a while back explaining that the current behavior is intentional, and is itself a bug fix.

There are a bunch of applications, like Google Takeout, that add spurious +00:00 and Z suffixes to timestamps, when that is not the correct timezone or no timezone is inferable.
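Roughly, the rule amounts to the following. This is a simplified sketch for illustration, not the code in Timezones.ts; `parseOffsetSuffix` and `offsetForInference` are made-up names:

```typescript
// Parse a trailing "Z" or "+HH:MM" / "-HH:MM" suffix into minutes.
// Returns undefined when the timestamp carries no such suffix.
function parseOffsetSuffix(raw: string): number | undefined {
  const s = raw.trim();
  if (s.endsWith("Z")) return 0;
  const m = /([+-])(\d{2}):(\d{2})$/.exec(s);
  if (m === null) return undefined;
  const minutes = Number(m[2]) * 60 + Number(m[3]);
  return m[1] === "-" ? -minutes : minutes;
}

// Treat a zero offset as "unknown", because some writers add a spurious
// "Z" or "+00:00". Non-zero offsets pass through unchanged.
function offsetForInference(raw: string): number | undefined {
  const offsetMinutes = parseOffsetSuffix(raw);
  return offsetMinutes === 0 ? undefined : offsetMinutes;
}

for (const raw of [
  "2023:01:14 15:56:00.000+01:00",
  "2023:01:14 15:57:00.000+00:00",
]) {
  console.log(raw, "->", offsetForInference(raw));
}
```

The cost of this rule is exactly what this issue reports: a genuine +00:00 looks the same as a spurious one and gets dropped too.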

Just to step back a bit, and to emphasize: timezone extraction from timestamps is a hack.

That's why inferTimezoneFromDatestamps defaults to false: this heuristic has proven unreliable in the field.

Yes, I could cook up Yet Another Option, and add even more complexity to the timezone heuristics to work around metadata bugs in Google Takeout and others.

I would argue that this is just papering over the underlying issue, though: timezones just aren't set reliably for large swaths of files.

After several years of fighting this, I made PhotoStructure store dates in "local" time. This approach allows timezones to be non-critical metadata, and has proven to be much less problematic. I'd recommend that instead.
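To make the "local" time idea concrete, here is a rough sketch under assumed names. It is an illustration only, not PhotoStructure's actual schema or code: the wall-clock time is the primary value, and any offset is kept as an optional hint.

```typescript
// Hypothetical record shape; not PhotoStructure's actual schema.
interface LocalCapture {
  localTime: string; // wall-clock time with any zone suffix removed
  offsetMinutes?: number; // advisory only; may be absent or wrong
}

// Split a timestamp into its wall-clock part and an optional offset hint.
function toLocalCapture(timestamp: string): LocalCapture {
  const s = timestamp.trim();
  const m = /(Z|[+-]\d{2}:\d{2})$/.exec(s);
  if (m === null) return { localTime: s };
  const localTime = s.slice(0, m.index);
  const suffix = s.slice(m.index);
  if (suffix === "Z") return { localTime, offsetMinutes: 0 };
  const minutes = Number(suffix.slice(1, 3)) * 60 + Number(suffix.slice(4, 6));
  const sign = suffix.startsWith("-") ? -1 : 1;
  return { localTime, offsetMinutes: minutes === 0 ? 0 : sign * minutes };
}

console.log(toLocalCapture("2023-01-14T15:57:00.000+00:00"));
console.log(toLocalCapture("2023-01-14T15:57:00.000"));
```

With this shape, sorting and display rely only on localTime, so a missing or spurious offset does not change what the user sees.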

Thanks again for making sure we're talking about the same issue!

C-Otto commented 3 days ago

Timezones, or rather the offsets, are important in the context of Immich. I understand that there are bugs in external tools, but please be aware that +00:00 is used in a bunch of countries (many of them in Africa), and currently photos taken there cannot be used as expected, which is another bug, this time in Immich. Furthermore, I'm not aware of any other way to provide a timezone (offset) in XMP, so this issue makes it impossible to fix a wrong timezone offset using XMP.

Please re-consider this.

C-Otto commented 1 day ago

@mceachen do you think it's worthwhile to limit this fix to XMP files? In other words, are you aware of bugs (Google Takeout and the like) that create misleading "+00:00" (or "Z") offsets in XMP files? If not, parsing those would help a lot, as users would be able to manually provide the offset.

mceachen commented 15 hours ago

Hmm! That’s an interesting workaround, but Samsung phones (not in XMP or JSON sidecars) also use Z or +00:00 spuriously. I’ll add another predicate so people can tweak timezone extraction behavior to suit them.
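As a sketch of what such a predicate could look like from the caller's side: the name and signature of the real option are not given in this thread, so `OffsetContext`, `TrustZeroOffset` and `usableOffsetMinutes` below are purely hypothetical. The example trusts a zero offset only when it comes from an XMP sidecar, as suggested above.

```typescript
// Hypothetical context and predicate types; the real option's name and
// signature are not specified in this thread.
interface OffsetContext {
  sourceFile: string; // path of the file the tag was read from
  tagName: string; // e.g. "DateTimeOriginal"
}
type TrustZeroOffset = (ctx: OffsetContext) => boolean;

// Default mirrors the behavior described above: never trust a zero offset.
const neverTrustZero: TrustZeroOffset = () => false;

// Trust zero offsets only when they come from an XMP sidecar file.
const trustZeroInXmp: TrustZeroOffset = (ctx) =>
  ctx.sourceFile.toLowerCase().endsWith(".xmp");

// Decide whether a parsed offset may be used for timezone inference.
function usableOffsetMinutes(
  offsetMinutes: number | undefined,
  ctx: OffsetContext,
  trustZero: TrustZeroOffset = neverTrustZero,
): number | undefined {
  if (offsetMinutes === undefined) return undefined;
  if (offsetMinutes === 0 && !trustZero(ctx)) return undefined;
  return offsetMinutes;
}

const sidecar = { sourceFile: "IMG_0001.jpg.xmp", tagName: "DateTimeOriginal" };
const embedded = { sourceFile: "IMG_0001.jpg", tagName: "DateTimeOriginal" };
console.log(usableOffsetMinutes(0, sidecar, trustZeroInXmp));
console.log(usableOffsetMinutes(0, embedded, trustZeroInXmp));
```

Keeping the default at "never trust" preserves the existing behavior for everyone who does not opt in.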