time-rs / time

The most used Rust library for date and time handling.
https://time-rs.github.io
Apache License 2.0

Current RFC 3339 parsing implementation requires "T" #607

Closed by Auion 11 months ago

Auion commented 11 months ago

The gist: RFC 3339 timestamps are sometimes written without the letter "T". This crate's implementation requires "T" and otherwise returns an error (InvalidLiteral). RFC 3339 does not appear to require "T".

Using serde, I wrote a generic function that deserializes various CSV files within a zip, independent of their type. The dates appear to be in RFC 3339 format, but they failed to parse with an "invalid literal" error. Digging through the source code, I found that an RFC 3339 formatted date only parses successfully if it contains "T" (case insensitive) between the date and the time, like so: 2023-08-01T21:48:43Z. An example date from my situation is 2018-06-30 00:07:54.000000+00:00, which fails to parse.

RFC 3339 (pg. 8) suggests that other characters, such as a space, may be used. The relevant excerpt:

NOTE: ISO 8601 defines date and time separated by "T". Applications using this syntax may choose, for the sake of readability, to specify a full-date and full-time separated by (say) a space character.

Some programs do indeed omit "T". An easy example is the date command: running date --rfc-3339=seconds outputs 2023-08-01 21:36:22-06:00.

I find the wording particularly interesting. The phrase "(say) a space character" seems to suggest that any character may be used in this spot.

I've never really done anything with git or GitHub before, and I'm quite new to Rust, so bear with me on this next part.

I created two branches that fix this in different ways. The first branch simply allows a space character in addition to "T" or lowercase "t". The second branch is a bit different: it allows any character in place of "T", which aligns more closely with RFC 3339.
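To make the difference between the two approaches concrete, here is a minimal sketch of the separator check each one implies. It is not the code from either branch: SeparatorPolicy, accepts, and separator_is_accepted are hypothetical names, and the check works on bytes, so "any character" is simplified to "any single byte" at offset 10.

```rust
/// Byte offset of the date/time separator: `YYYY-MM-DD` fills bytes 0..10.
const SEPARATOR_INDEX: usize = 10;

/// Hypothetical policy for which byte may sit between the date and the time.
/// Illustration only; this type does not exist in the `time` crate.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SeparatorPolicy {
    /// Only "T" or "t" (the current behavior described in this issue).
    Strict,
    /// "T", "t", or a single space (the first approach).
    AllowSpace,
    /// Any single byte (the second approach, simplified to bytes).
    AllowAny,
}

impl SeparatorPolicy {
    /// Returns true if `separator` is allowed under this policy.
    pub fn accepts(self, separator: u8) -> bool {
        match self {
            SeparatorPolicy::Strict => separator.eq_ignore_ascii_case(&b'T'),
            SeparatorPolicy::AllowSpace => {
                separator.eq_ignore_ascii_case(&b'T') || separator == b' '
            }
            SeparatorPolicy::AllowAny => true,
        }
    }
}

/// Checks only the separator position of `input`; it does not validate
/// the date or time fields around it.
pub fn separator_is_accepted(input: &str, policy: SeparatorPolicy) -> bool {
    input
        .as_bytes()
        .get(SEPARATOR_INDEX)
        .map_or(false, |&byte| policy.accepts(byte))
}
```

Under this sketch, the example date from my CSV files is rejected by Strict and accepted by both AllowSpace and AllowAny; the two looser policies only differ for separators other than "T", "t", and a space.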

That is all I have for now. Let me know what y'all think. Thank you.

jhpratt commented 11 months ago

The note is unclear at best. There is nothing in any RFC I'm aware of that specifies how a "NOTE" is to be interpreted, unlike similar language such as "MUST" and "SHOULD". For that reason I am interpreting it as non-normative, such that it's not actually part of the specification. The implementation in time follows the ABNF strictly, which is the only formal declaration of syntax.
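For anyone who hits the same InvalidLiteral error while the parser stays strict, one possible caller-side workaround is to rewrite a space separator to "T" before parsing. The sketch below uses only the standard library; normalize_rfc3339_separator is a hypothetical helper, not part of time, and it assumes the input otherwise follows the RFC 3339 layout, with the separator at byte offset 10.

```rust
use std::borrow::Cow;

/// Byte offset of the date/time separator: `YYYY-MM-DD` fills bytes 0..10.
const SEPARATOR_INDEX: usize = 10;

/// Hypothetical helper: if byte 10 is a space, returns a copy of `input`
/// with that space replaced by "T". Any other input is returned unchanged,
/// so invalid strings still fail in the real parser rather than here.
pub fn normalize_rfc3339_separator(input: &str) -> Cow<'_, str> {
    match input.as_bytes().get(SEPARATOR_INDEX) {
        // A space is ASCII, so slicing on either side of it cannot
        // split a multi-byte character.
        Some(&b' ') => {
            let mut owned = String::with_capacity(input.len());
            owned.push_str(&input[..SEPARATOR_INDEX]);
            owned.push('T');
            owned.push_str(&input[SEPARATOR_INDEX + 1..]);
            Cow::Owned(owned)
        }
        _ => Cow::Borrowed(input),
    }
}
```

The normalized string can then be handed to the strict parser, for example OffsetDateTime::parse with the Rfc3339 well-known format. Returning a Cow avoids an allocation when the input already uses "T".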