tomaszkam / date

A date and time library based on the C++11/14/17 <chrono> header

[LWG3252] Parse locale's aware modifiers for commands are not consistent with POSIX spec #37

Closed tomaszkam closed 5 years ago

tomaszkam commented 5 years ago

The "[time.parse.spec] Meaning of parse flags" table does not support the modified commands '%OI', '%OU', and '%OW', although formatting supports them. POSIX strptime also supports them (http://pubs.opengroup.org/onlinepubs/9699919799/functions/strptime.html).

We also claim to support '%Ou' and '%OC', which POSIX strptime does not support.

tomaszkam commented 5 years ago

@HowardHinnant: Is this just an oversight or the intended design? I am lacking the knowledge about time_get to confirm it.

HowardHinnant commented 5 years ago

This looks like an oversight, both in the spec and the implementation. If POSIX strptime supports it, so should we:

https://pubs.opengroup.org/onlinepubs/009695399/functions/strptime.html

tomaszkam commented 5 years ago

Also added '%OV'.

HowardHinnant commented 5 years ago

Note that there are some modified flags that strftime supports and strptime doesn't (don't ask me why). %OV is one such flag. I think we should be consistent with POSIX's inconsistency. :-) Rationale, std::get_time is already specified in terms of strptime, and we can't afford to be inventive with locale-dependent things.
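The asymmetry described above can be written down as two small lookup sets. This is a minimal sketch: the flag lists are transcribed from the POSIX strftime and strptime pages linked in this thread, and the names `strftime_O_flags`, `strptime_O_flags`, `strftime_accepts_O`, `strptime_accepts_O`, and `O_is_format_only` are hypothetical, not part of any library.

```cpp
#include <string_view>

// Conversion characters that accept the O modifier, as transcribed from the
// POSIX strftime and strptime pages (an assumption of this sketch; check the
// pages themselves before relying on these lists).
constexpr std::string_view strftime_O_flags = "deHImMSuUVwWy";
constexpr std::string_view strptime_O_flags = "deHImMSUwWy";

// True if %O<c> is a valid modified conversion when formatting (strftime).
constexpr bool strftime_accepts_O(char c) {
    return strftime_O_flags.find(c) != std::string_view::npos;
}

// True if %O<c> is a valid modified conversion when parsing (strptime).
constexpr bool strptime_accepts_O(char c) {
    return strptime_O_flags.find(c) != std::string_view::npos;
}

// True for flags whose O-modified form exists for formatting only.
constexpr bool O_is_format_only(char c) {
    return strftime_accepts_O(c) && !strptime_accepts_O(c);
}

// %Ou and %OV are the two format-only cases discussed in this thread.
static_assert(O_is_format_only('u'));
static_assert(O_is_format_only('V'));
// %OC is accepted by neither function, so it is not "format-only" either.
static_assert(!strftime_accepts_O('C') && !strptime_accepts_O('C'));
```

With these sets, a parse table that follows POSIX would accept an O modifier exactly for the characters where `strptime_accepts_O` is true.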

tomaszkam commented 5 years ago

POSIX strptime does not support '%Ou', while we support it, at least in the spec. Should we remove it?

HowardHinnant commented 5 years ago

Ouch! Yes. That is definitely a bug on my part. Thanks!

I only invented in a couple places for these flags, and none of them involved locale-dependent stuff. If we are inconsistent with POSIX/C on this, it's a bug.

tomaszkam commented 5 years ago

There is also '%OC', which does not exist in strptime; I assume it should be removed as well.

HowardHinnant commented 5 years ago

Thanks for doing this review. I can't believe how many of these I got wrong. My eyes must've been crossed when I went through these...

tomaszkam commented 5 years ago

I had just reported the issue with missing modifiers in the format table for '%y'/'%Y' and decided it was worth double-checking against the POSIX spec. I then checked the parse table as well and found these inconsistencies.

tomaszkam commented 5 years ago

I also found https://github.com/tomaszkam/date/issues/34, but reported it as an editorial fix, since it is an obvious typo.

tomaszkam commented 5 years ago

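Discussion: The current specification of the locale modifiers for the parse flags in "[tab:time.parse.spec] Meaning of parse flags" is inconsistent with the POSIX strptime specification (https://pubs.opengroup.org/onlinepubs/009695399/functions/strptime.html): the modified commands %OC and %Ou are specified although strptime does not support them, and the modified commands %OI, %OU, and %OW, which strptime does support, are missing.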

Per Howard's comment:

I only invented in a couple places for these flags, and none of them involved locale-dependent stuff. If we are inconsistent with POSIX/C on this, it's a bug. Rationale, std::get_time is already specified in terms of strptime, and we can't afford to be inventive with locale-dependent things.

Note that, per the above, the inconsistency between the POSIX strftime specification (http://pubs.opengroup.org/onlinepubs/9699919799/functions/strftime.html), which supports %Ou and %OV, and the strptime specification (https://pubs.opengroup.org/onlinepubs/009695399/functions/strptime.html), which handles neither, should be (by design) reflected in the "[tab:time.format.spec] Meaning of conversion specifiers" and "[tab:time.parse.spec] Meaning of parse flags" tables.

The %d modifier was addressed by http://cplusplus.github.io/LWG/lwg-active.html#3218.

Proposed wording: Change the entries in the "[tab:time.parse.spec] Meaning of parse flags" table as follows:

Entry %C: The century as a decimal number. The modified command %NC specifies the maximum number of characters to read. If N is not specified, the default is 2. Leading zeroes are permitted but not required. The modified command %EC interprets the locale's alternative representation of the century. [removed: "and %OC"]

Entry %I: The hour (12-hour clock) as a decimal number. The modified command %NI specifies the maximum number of characters to read. If N is not specified, the default is 2. Leading zeroes are permitted but not required. The modified command %OI interprets the locale's alternative representation. [last sentence added]

Entry %u: The ISO weekday as a decimal number (1-7), where Monday is 1. The modified command %Nu specifies the maximum number of characters to read. If N is not specified, the default is 1. Leading zeroes are permitted but not required. [removed: "The modified command %Ou interprets the locale's alternative representation."]

Entry %U: The week number of the year as a decimal number. The first Sunday of the year is the first day of week 01. Days of the same year prior to that are in week 00. The modified command %NU specifies the maximum number of characters to read. If N is not specified, the default is 2. Leading zeroes are permitted but not required. The modified command %OU interprets the locale's alternative representation. [last sentence added]

Entry %W: The week number of the year as a decimal number. The first Monday of the year is the first day of week 01. Days of the same year prior to that are in week 00. The modified command %NW specifies the maximum number of characters to read. If N is not specified, the default is 2. Leading zeroes are permitted but not required. The modified command %OW interprets the locale's alternative representation. [last sentence added]
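The entries above share two mechanics: a numeric field reads at most N characters and accepts optional leading zeroes, and %U/%W count weeks from the first Sunday or Monday of the year. The following is a minimal sketch of both, not the implementation in <chrono> or in the date library. The names `field`, `read_field`, `week_number_sunday`, `week_number_monday`, and `iso_weekday` are hypothetical, the inputs follow the `tm_yday`/`tm_wday` conventions of `struct tm`, and the locale-dependent %O forms are not modelled.

```cpp
#include <cstddef>

// Result of reading a bounded decimal field: the parsed value and the number
// of characters consumed (0 means no digit was found).
struct field {
    int value;
    std::size_t consumed;
};

// Read at most max_width decimal digits from the null-terminated string s.
// Leading zeroes are permitted but not required: "7" and "07" both give 7.
constexpr field read_field(const char* s, std::size_t max_width) {
    int value = 0;
    std::size_t n = 0;
    while (n < max_width && s[n] >= '0' && s[n] <= '9') {
        value = value * 10 + (s[n] - '0');
        ++n;
    }
    return {value, n};
}

// %U: week number with Sunday as the first day of week 01.
// yday is days since January 1 (0-based); wday is days since Sunday (0-6).
constexpr int week_number_sunday(int yday, int wday) {
    return (yday + 7 - wday) / 7;
}

// %W: week number with Monday as the first day of week 01.
constexpr int week_number_monday(int yday, int wday) {
    return (yday + 7 - (wday + 6) % 7) / 7;
}

// %u: ISO weekday (1-7, Monday is 1) from days since Sunday (0-6).
constexpr int iso_weekday(int wday) {
    return wday == 0 ? 7 : wday;
}

// 2019 started on a Tuesday, so its first Sunday is January 6 (yday 5)
// and its first Monday is January 7 (yday 6).
static_assert(week_number_sunday(5, 0) == 1, "first Sunday starts week 01");
static_assert(week_number_sunday(4, 6) == 0, "days before it are in week 00");
static_assert(week_number_monday(6, 1) == 1, "first Monday starts week 01");
```

For example, `read_field("123", 2)` stops after two characters, which is the behaviour a default width of 2 implies for %C, %I, %U, and %W.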