planetarypy / pvl

Python implementation of PVL (Parameter Value Language)
BSD 3-Clause "New" or "Revised" License
19 stars 19 forks source link

PDS3 can have "local" times. #80

Closed rbeyer closed 3 years ago

rbeyer commented 3 years ago

Describe the bug TLDR:

So, remember the long discussion we had last spring about Python datetime objects(#57 ) and whether they were "naive" (no sense of the time zone they belonged to) or "aware" (they knew their offset from UTC)? That discussion was in the context of the PVL specification, which absolutely requires that all times be interpreted as relative to UTC. On the plus side, that's still true.

However, the ODL/PDS3 spec ... differs. ODL says

A time that is not followed by either the Zulu indicator or a time zone offset is assumed to be a local time.

In this context a "local time" would be a "naive" datetime object. So we need to adjust pvl.decoder.ODLDecoder() so that it will return "naive" and "aware" datetime objects as appropriate.

Then the PDS3 specification about times says:

Alternate time zones (e.g., YYYY-MM-DDTHH:MM:SS.SSS + HH:MM) may not be used in a PDS label. The only allowed time formats are (1) YYYY-MM-DDTHH:MM:SS.SSS. (2) YYYY-DDDTHH:MM:SS.SSS.

Note, this doesn't allow a trailing "Z" like ODL does, so we need to fix that.

I guess I had interpreted that to mean that PVL only allowed "aware", and mis-interpreted that to mean that ODL and PDS3 followed suit. But that's not what that says. It says that ODL allows "naive" times, and then only that PDS3 doesn't allow you to specify an offset with "+HH:MM" or "Z". Since I had thought that ODL was like PVL, I had made the mental leap and thought that PDS3 was only UTC, but you could also mentally load the "ODL" rule, and then read the "PDS3" rule to mean that a strict PDS3 encoder should only return "naive" datetimes since ODL allows it, and you can't specify a time offset.

So, what to do about PDS3? On the one hand, since the ODL spec allows "local times" and the PDS3 part doesn't contradict that then maybe that means that PDS3 "Time" values should be interpreted as "naive" datetime objects, and since "correct" PDS3 labels will never have a "+HH:MM" or "Z" suffix, you would always return "naive" datetime objects if parsing. On the other hand, since PDS3 "Time" values can never specify their timezone, does that mean they should always be interpreted as UTC, and thus should always be "aware" as they are now (via the pvl.decoder.ODLDecoder())?

Not sure. Thoughts? Help?

To Reproduce

ODL decoders always return "aware" datetime objects (tzinfo is not None), even when there is no offset specifier:

>>> import pvl
>>> odl = pvl.loads("time = 2000-02-25T10:54:12.129", decoder=pvl.decoder.ODLDecoder())
>>> print(odl)
PVLModule([
  ('time',
   datetime.datetime(2000, 2, 25, 10, 54, 12, 129000, tzinfo=datetime.timezone.utc))
])

And then, the PDS3Encoder mistakenly adds a "Z" (assume odl from above):

>>> print(pvl.dumps(odl, encoder=pvl.encoder.PDSLabelEncoder()))
TIME = 2000-02-25T10:54:12.129000Z
END

Expected behavior

Your Environment (please complete the following information):

rbeyer commented 3 years ago

I spent some time writing some code to address the above, and then realized that I may not have read the spec correctly, and that some of the above was incorrect. I should have spent time with the specifications and then written code, and now regret my life choices. Anyway, what follows is my "scholarship" on the PVL specification documents regarding time, and what that means for how we should handle them in the pvl library. Apologies, this is mostly to keep my head clear (hopefully) as I go write the code.

The upshot is that the first "Expected Behavior" item is still correct, the second isn't, and there are others. Here's the roundup of changes to pvl 1.1.0 that need to happen to properly implement all of these time specifications (assuming I read them correctly this time):

That should be it for the changes, but to arrive at them, I had to work through the very long text below. I'll just be using the term time to refer to datetime.time, and since the datetime.datetime class descends from the datetime.time class, all of this applies to them equally.

PVL Spec (CCSDS Blue Book)

Section 2.3.2.1.3:

This logcial set of statements, nicely contained in a single section allows a straightforward implementation:

Decoding, pvl.load()

Encoding, pvl.dump()

ODL Spec (Chapter 12 of the PDS3 Standards Reference)

Section 12.3.2.1, Date and Time Values

Section 12.3.2.2, Implementation of Dates and Times

Section 12.3.2.5, Times

Only the differences from PVL handling are specified below, and assume inheritance from PVL handling above.

Decoding, pvl.load()

Encoding, pvl.dump()

PDS3 Spec (Chapter 12 of the PDS3 Standards Reference)

Section 12.3.2.3, PDS Implementation of Dates and Times

Section 12.7.3, ODL/PVL Usage, Sub-para 14:

Alternate time zones (e.g., YYYY-MM-DDTHH:MM:SS.SSS + HH:MM) may not be used in a PDS label. The only allowed time formats are (1) YYYY-MM-DDTHH:MM:SS.SSS. (2) YYYY-DDDTHH:MM:SS.SSS.

See Section 7.3.2(6) for a more detailed description.

The section 7.3.2(6) does not appear in this document that I could find.

If one reads this para in isolation, one might interpret this as indicating that no timezone specifier is allowed, however, this combined with Section 12.3.2.3 indicates that a trailing "Z" is allowed on read, and required on write.

Section 12.7.3, ODL/PVL Usage, Sub-para 14:

Only the differences from ODL handling are specified below, and assume inheritance from ODL and PVL handling above.

Decoding, pvl.load()

Encoding, pvl.dump()