inukshuk / edtf.js

Extended Date Time Format (ISO 8601-2 / EDTF) Parser for JavaScript
BSD 2-Clause "Simplified" License

Parse error on trailing zeros #48

Closed mielvds closed 10 months ago

mielvds commented 10 months ago

Hi there,

Thanks for this great library! I was trying to parse 2020-05-18T22:39:24.422000Z, but it throws an error because the parser doesn't expect the trailing zeros. However, I think this should be valid? At least new Date() accepts it.

Cheers,

Miel

inukshuk commented 10 months ago

I don't have the ISO 8601 standard at hand, but I've only ever seen strings with three-digit milliseconds. The parser is supposed to accept only valid ISO date strings, so I think that throwing an error in the example above is the expected result. The JS Date constructor allows some non-ISO formats which this parser has to reject.

inukshuk commented 10 months ago

Experimenting with this a little bit: (new Date('2024-01-10T14:17:32.1234Z')).toISOString() gives '2024-01-10T14:17:32.123Z', so the JS implementation also shortens the ISO format to three digits. .getMilliseconds() is also 123, so it's not only that the ISO output drops the extra digit but that the Date parser silently ignores it.
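
For anyone who wants to reproduce the experiment above, here is a minimal sketch. It assumes a V8-based runtime (Node.js or Chrome); strings with more than three fractional digits fall outside the ECMAScript date-time format, so other engines may behave differently. getUTCMilliseconds() is used instead of getMilliseconds() only to keep the result independent of the local time zone.

```javascript
// Behavior observed in V8 (Node.js, Chrome); engines may differ for
// strings outside the ECMAScript date-time string format.
const date = new Date('2024-01-10T14:17:32.1234Z')

// After parsing, only three fractional digits remain.
const iso = date.toISOString()
const millis = date.getUTCMilliseconds()

console.log(iso)
console.log(millis)
```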

mielvds commented 10 months ago

Yep, you're absolutely right! I guess you could consider adding a 'non-strict' mode, but I would also understand if you wouldn't ;)

inukshuk commented 10 months ago

Right, we sort of hijacked the extension levels for this already. There is a level 3 (the standard only goes to level 2) where we added some extra features. We could make the parser accept more sub-second precision there. However, even if the parser accepted the extra precision, the rest of the API is built on top of the standard Date object for storage, which doesn't store the extra digits. That is, if you used anything but zeroes, that information would be parsed and then lost. Therefore, to support this properly I think we'd have to store the fractional seconds separately, which I think we should only do if there's a strong reason for it.
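
To illustrate what storing the fraction separately could look like, here is a rough sketch in plain JavaScript. The helper parseWithFraction and its return shape are hypothetical and not part of edtf.js; it only handles full date-times with a Z or ±hh:mm offset, and it relies on the Date constructor for the actual parsing.

```javascript
// Hypothetical helper, NOT part of edtf.js: keeps the full fractional
// part as a string next to a Date that holds at most three digits.
function parseWithFraction(input) {
  const match =
    /^(.*T\d{2}:\d{2}:\d{2})(?:[.,](\d+))?(Z|[+-]\d{2}:\d{2})$/.exec(input)
  if (match === null) {
    throw new RangeError(`Unsupported date-time: ${input}`)
  }

  const [, head, fraction = '', zone] = match

  // Date only stores milliseconds, so pass it exactly three digits.
  const millis = fraction.slice(0, 3).padEnd(3, '0')
  const date = new Date(`${head}.${millis}${zone}`)
  if (Number.isNaN(date.getTime())) {
    throw new RangeError(`Invalid date-time: ${input}`)
  }

  // The untouched digits survive alongside the truncated Date.
  return { date, fraction }
}

const parsed = parseWithFraction('2024-01-10T14:17:32.1234Z')
console.log(parsed.date.toISOString(), parsed.fraction)
```

The trade-off is the one described above: every consumer of the value would need to know about the extra field, which is why it only seems worth doing if there is a strong reason.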

Where do you get these date strings from?

mielvds commented 10 months ago

> Where do you get these date strings from?

A Media Asset Management system that does not take standards compliance seriously; unfortunately, there are many of those. However, we also get a lot of xsd:dateTime values, and it's unclear whether the XSD spec disallows this.

This is from the ISO 8601:2004 spec btw (I don't have access, but I got it from https://stackoverflow.com/questions/25842840/representing-fraction-of-second-with-iso-86012004):

4.2.2.4 Representations with decimal fraction

The interchange parties, dependent upon the application, shall agree the number of digits in the decimal fraction. The format shall be [hhmmss,ss], [hhmm,mm] or [hh,hh] as appropriate (hour minute second, hour minute, and hour, respectively), with as many digits as necessary following the decimal sign. A decimal fraction shall have at least one digit.

So the three-digit limit might be a convention rather than a rule.
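
As a rough illustration of the difference, the two regular expressions below contrast a three-digit convention with the "at least one digit, as many as necessary" wording quoted above. Both patterns and their names are assumptions made for this sketch (extended hh:mm:ss format only, no range checks, no time zone); they are not taken from the edtf.js grammar.

```javascript
// Illustrative patterns only; not the edtf.js grammar.
// Convention: the fraction, if present, has exactly three digits.
const STRICT_TIME = /^\d{2}:\d{2}:\d{2}(?:\.\d{3})?$/

// Quoted ISO 8601:2004 wording: a decimal sign (comma or dot)
// followed by one or more digits.
const RELAXED_TIME = /^\d{2}:\d{2}:\d{2}(?:[.,]\d+)?$/

for (const time of ['22:39:24.422', '22:39:24.422000', '22:39:24.']) {
  console.log(time, STRICT_TIME.test(time), RELAXED_TIME.test(time))
}
```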

inukshuk commented 10 months ago

OK, in that case we can probably amend the grammar without doing any harm.

inukshuk commented 10 months ago

Though, as I said, the underlying JS Date will likely just drop any extra digits there.

mielvds commented 10 months ago

Sure, but then at least they can be parsed. Thanks!

inukshuk commented 10 months ago

OK, 4.6.0 should accept any number of fractional-second digits, but only the first three will be used.