IJMacD / rfc3339-iso8601

https://ijmacd.github.io/rfc3339-iso8601

Odd allowances in RFC3339 format? #16

Closed jsumners closed 1 year ago

jsumners commented 1 year ago

https://github.com/IJMacD/rfc3339-iso8601/blob/f1e240fd9a4aacdea4bbd588a47a562fe4d59f4f/src/formats/rfc.js#L46-L47

The above formats replace the `T` with ` ` (a space) and `_`. I'm not clear that this is actually allowed in RFC3339. I assume you're following this note in the spec:

https://www.rfc-editor.org/rfc/rfc3339#section-5.6

NOTE: ISO 8601 defines date and time separated by "T". Applications using this syntax may choose, for the sake of readability, to specify a full-date and full-time separated by (say) a space character.

However, the actual ABNF states:

date-time = full-date "T" full-time

With a note stating that the T may be represented by t.

In my understanding, the above note about 8601 allowing any character to separate the date and time parts is specifically talking about 8601. Am I just missing something in the RFC that states any character may be used in place of T?
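
For reference, the strict `date-time` production can be sketched as a regular expression. This is only a minimal illustration in Python, not code from the repository: the names `DATE_TIME` and `is_date_time` are made up here, and the check is purely syntactic (it does not reject impossible values such as month 13).

```python
import re

# RFC 3339 section 5.6 ABNF, transcribed as a regex (syntax only).
# ABNF string literals are case-insensitive, so "T" and "Z" also match "t" and "z".
# DATE_TIME and is_date_time are hypothetical names used for illustration.
DATE_TIME = re.compile(
    r"[0-9]{4}-[0-9]{2}-[0-9]{2}"        # full-date
    r"[Tt]"                              # the only separator the ABNF allows
    r"[0-9]{2}:[0-9]{2}:[0-9]{2}"        # time-hour ":" time-minute ":" time-second
    r"(?:\.[0-9]+)?"                     # optional time-secfrac
    r"(?:[Zz]|[+-][0-9]{2}:[0-9]{2})"    # time-offset: "Z" or a numeric offset
)


def is_date_time(text):
    """Return True if the whole string matches the date-time production."""
    return DATE_TIME.fullmatch(text) is not None
```

Under this reading, `2023-04-05T06:07:08Z` is accepted, while the same value with a space or `_` in place of the `T` is not.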

jsumners commented 1 year ago

🤔 it seems I'm not the only one confused by "this syntax" https://www.rfc-editor.org/errata/eid5783

jsumners commented 1 year ago

For those who come along and read this issue, I reached out to Graham Klyne (one of the RFC3339 authors) for clarification. He helped me understand the intention of the RFC with the following:

RFC3339 was written as a specification to be referenced by other specifications, and as such it provides syntax productions that might be used in different ways. The intention was that if you want an interoperable timestamp then the production to use is ‘date-time’, which allows only ‘T’ as the date/time separator.

But there may be circumstances in which one wishes to convey just a date, or a time, in which case the other productions might be appropriate. Or if it’s a context not intended for widespread interoperability then the productions ‘full-date’ and ‘full-time’ may be specified, combined with some other separator.

So it comes down to how RFC3339 is referenced. A simple option would be for an application to specify a timestamp as “defined by the date-time syntax production in RFC3339”, in which case the only allowed separator is ‘T’. For other options, the application specification needs to do a little more work. Simply saying, say, “timestamp defined by RFC3339” is underspecified.

Basically, the formats I linked to in the opening of this issue are not RFC3339 date-time productions. They are localized (for lack of a better term) productions of full-date + full-time with an implementation-chosen separator.
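
To make the distinction concrete, here is a rough Python sketch that tells the two cases apart. It is an illustration only: the function name `classify`, the label strings, and the default separator set (space and underscore) are assumptions chosen for this example, not anything defined by RFC 3339 or by this repository.

```python
import re

# Syntax-only patterns for the two RFC 3339 productions being combined.
FULL_DATE = r"[0-9]{4}-[0-9]{2}-[0-9]{2}"
FULL_TIME = (
    r"[0-9]{2}:[0-9]{2}:[0-9]{2}(?:\.[0-9]+)?"
    r"(?:[Zz]|[+-][0-9]{2}:[0-9]{2})"
)


def classify(text, extra_separators=(" ", "_")):
    """Label a timestamp string (hypothetical helper for illustration).

    Returns "date-time" for the interoperable production, or
    "full-date + separator + full-time" when an application-chosen
    separator from extra_separators is used, or "no match".
    """
    if re.fullmatch(FULL_DATE + "[Tt]" + FULL_TIME, text):
        return "date-time"
    for sep in extra_separators:
        if re.fullmatch(FULL_DATE + re.escape(sep) + FULL_TIME, text):
            return "full-date + separator + full-time"
    return "no match"
```

A specification that wants interoperable timestamps would accept only the first label; one that allows other separators has to list them explicitly, which is what the `extra_separators` parameter stands in for.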

IJMacD commented 1 year ago

Hi James,

Thank you for raising, investigating, and closing this issue.

You're absolutely correct in everything you've said.

The strictest form of RFC 3339 is of course defined by the ABNF.

However, as you pointed out, the original intention of the RFC was to be a starting point to be integrated into other specs; in which case the written clause becomes relevant.

A newly defined API specification, for example, can choose full-date from RFC 3339 and be done with it; or choose to replace the T and still be compliant with the wording of RFC 3339.

This was my justification for inclusion of these formats in the RFC 3339 sets on the page.

Graham Klyne, it seems, has been consistent on this issue. Here is a comment from him on a StackOverflow question from 2019, stating an opinion consistent with the one he gave directly to you.
https://stackoverflow.com/questions/522251/whats-the-difference-between-iso-8601-and-rfc-3339-date-formats#comment95830950_522281

Just a final point since you specifically mentioned the case insensitivity (T/t). This is not specific to RFC 3339.

ABNF is defined in RFC 2234, which says in § 2.3 that terminal values (i.e. strings) are case insensitive.

Literal text strings are interpreted as a concatenated set of printable characters.

   NOTE:     ABNF strings are case-insensitive and
             the character set for these strings is us-ascii.

Hence:

   rulename = "abc"

and:

   rulename = "aBc"

will match "abc", "Abc", "aBc", "abC", "ABc", "aBC", "AbC" and "ABC".

           To specify a rule which IS case SENSITIVE,
              specify the characters individually.

For example:

   rulename    =  %d97 %d98 %d99
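
The difference between the two kinds of terminal in the excerpt above can be mimicked in a few lines of Python. This is a small sketch rather than an ABNF implementation; both function names are invented for illustration, and the input is assumed to be US-ASCII, as the quoted note requires.

```python
def matches_literal(literal, text):
    """ABNF quoted string such as "abc": compared case-insensitively.

    Assumes both arguments are US-ASCII (hypothetical helper).
    """
    return literal.lower() == text.lower()


def matches_decimal_terminals(codes, text):
    """ABNF numeric terminals such as %d97 %d98 %d99.

    Each value names exactly one character code, so case matters.
    """
    return [ord(ch) for ch in text] == list(codes)
```

With these helpers, `matches_literal("T", "t")` is true, which is why `t` is accepted wherever the RFC 3339 grammar writes `"T"`, while `matches_decimal_terminals([97, 98, 99], "Abc")` is false.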