photostructure / exiftool-vendored.js

Fast, cross-platform Node.js access to ExifTool
https://photostructure.github.io/exiftool-vendored.js/
MIT License
445 stars 47 forks source link

Zero time zone offset +00:00 is parsed as UTC #203

Closed C-Otto closed 1 month ago

C-Otto commented 2 months ago

Describe the bug For date strings that end with, for example, "+03:30" the zone is correctly returned as "UTC+3:30":

https://github.com/photostructure/exiftool-vendored.js/blob/d6fa63469aebb911a7ac929ab287e7d6fe503bec/src/ReadTask.spec.ts#L326-L338

The same does not hold for the "+00:00" offset suffix. In this case, just "UTC" is returned. While "UTC" is the expected response for the "Z" suffix, it should not be used for more specific suffixes like "+00:00".

According to https://github.com/photostructure/exiftool-vendored.js/issues/179#issuecomment-2052427101 this should already work. However, the code is quite explicitly throwing away information about the "+00:00" suffix:

https://github.com/photostructure/exiftool-vendored.js/blob/d6fa63469aebb911a7ac929ab287e7d6fe503bec/src/Timezones.ts#L216

https://github.com/photostructure/exiftool-vendored.js/blob/d6fa63469aebb911a7ac929ab287e7d6fe503bec/src/Timezones.ts#L107-L127

The following countries (partially) use a time zone offset of +00:00:

To Reproduce

    it("respects zero HH:MM OffsetTime", () => {
          // this is not UTC, but it is used for "Atlantic/Reykjavik" and other zones
          const t = parse({
              tags: {
                  OffsetTime: "+00:00",
                  DateTimeOriginal: "2016:08:12 13:28:50",
              },
          })
          expect(t.DateTimeOriginal).to.containSubset({
              tzoffsetMinutes: 0,
              zone: "UTC+00:00",
              inferredZone: true,
          })
      })

Expected behavior The resulting time zone is any of "UTC+0", "UTC+00", "UTC+00:00", or a similar variant - but not "UTC".

Environment (please complete the following information):

C-Otto commented 2 months ago

It seems the timezone derived from the GPS coordinates is always returned if the time itself uses the +00:00 offset.

mceachen commented 1 month ago

Are you setting backfillTimezones: true? Your test passes for me.

C-Otto commented 1 month ago

I did not fiddle with the defaults. The test you added in d5d27aeac1bddd8858d21f7182330600a8081cce asserts the wrong thing. The result should be +00:00, not UTC.

mceachen commented 1 month ago

https://github.com/photostructure/exiftool-vendored.js/blob/ceef9d4861b2022021242da757ccfcdba78ebacd/src/ReadTask.spec.ts#L285 is the current code, and validates that the extracted tz is UTC-equivalent.

We're talking at odds here: please provide a failing test, so I know exactly what your issue is.

C-Otto commented 1 month ago

The failing test is included in my initial issue, in "To Reproduce". As stated above:

Expected behavior The resulting time zone is any of "UTC+0", "UTC+00", "UTC+00:00", or a similar variant - but not "UTC".

Your test expects UTC, which is the current and (I think) wrong behavior.

C-Otto commented 1 month ago

Your test passes for me.

It does not for me:

 AssertionError: expected { tzoffsetMinutes: +0, …(2) } to eql { tzoffsetMinutes: +0, …(2) }: {"key":"zone","act":"UTC","exp":"UTC+00:00"}
      + expected - actual

       {
         "inferredZone": true
         "tzoffsetMinutes": 0
      -  "zone": "UTC"
      +  "zone": "UTC+00:00"
       }
mceachen commented 1 month ago

UTC and UTC+00:00 are functionally equivalent, which is why I updated the test.

C-Otto commented 1 month ago

Please reopen. For photos with non-0 offset, the timezone is parsed and returned. For UTC+00:00 this is not the case. As an example, for the following XMP file the timezone related output differs depending on the offset used:

<?xpacket begin='<feff>' id='W5M0MpCehiHzreSzNTczkc9d'?>
<x:xmpmeta xmlns:x='adobe:ns:meta/' x:xmptk='Image::ExifTool 12.96'>
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>

 <rdf:Description rdf:about=''
  xmlns:exif='http://ns.adobe.com/exif/1.0/'>
  <exif:DateTimeOriginal>2018-08-18T00:00:00.000+00:00</exif:DateTimeOriginal>
 </rdf:Description>

 <rdf:Description rdf:about=''
  xmlns:photoshop='http://ns.adobe.com/photoshop/1.0/'>
  <photoshop:DateCreated>2018-08-18T00:00:00.000+00:00</photoshop:DateCreated>
 </rdf:Description>
</rdf:RDF>
</x:xmpmeta>
<?xpacket end='w'?>

With +02:00 instead of +00:00 exiftool-vendored returns:

"tz": "UTC+2",
"tzSource": "DateTimeOriginal",

These properties are missing with +00:00.

This causes issues for photos taken in, for example, Iceland. There is a difference between "no timezone" and "+00:00 offset", but this difference is not communicated by exiftool-vendored.

C-Otto commented 1 month ago

Furthermore, as hinted at above, if the +00:00 offset is used, other TimeZone related tags override this offset. If, for example, GPS coordinates are set, there is no way to override the derived timezone with information stored in a XMP file. This is only an issue for +00:00.

mceachen commented 1 month ago

Please offer a PR with tests if you'd like your issue addressed: I feel like I'm missing the full context of the issue, and a solution might be enlightening.

Keep in mind that I had to disregard some values of TimeZone offsets due to a nontrivial number of cameras (particularly from Samsung) use a default value instead of leaving the field unset.

mceachen commented 1 month ago

FWIW I've just added a new option that may skirt this issue:


  /**
   * Timezone parsing requires a bunch of heuristics due to hardware and
   * software companies not following metadata specifications similarly.
   *
   * If GPS metadata is trustworthy, set this to `true` to override explicit
   * values assigned to {@link TimezoneOffsetTagnames}.
   *
   * Note that there **are** regions that have had their IANA timezone change
   * over time--this will result incorrect timezones.
   */
  preferTimezoneInferenceFromGps: boolean

You'll want to set it to true (it defaults to false to retain prior behavior).

If this suffices, feel free to close this issue.

C-Otto commented 1 month ago

It's messy, yes. In general I believe that reading TZ related information via GPS tags is fine. I mentioned GPS just to highlight the issue that certain changes are not possible for the +00:00 offset, while they are possible for any other offset. I'll open separate issues (with tests) so that we can discuss those issues individually. Thanks :)

C-Otto commented 1 month ago

FYI: I believe all related issues I'm aware of will be fixed once #216 is fixed (so that +00:00 is read and put into "tz"). Sorry for the confusion, the integration into Immich makes this already complicated topic even harder to debug.