Open nickmooney opened 5 years ago
This looks like a bug to me, Parse
shouldn't behave inconsistently with inputs that only differ based on the offset, I think it should definitely error in all cases where the MST
token matches an incompatible input value. Note that leaving off -0700
should work, as MST
will parse as zero based on the documentation:
Elements omitted from the value are assumed to be zero or, when zero is impossible, one, so parsing "3:04pm" returns the time corresponding to Jan 1, year 0, 15:04:00 UTC (note that because the year is 0, this time is before the zero Time). Years must be in the range 0000..9999. The day of the week is checked for syntax but it is otherwise ignored.
/cc @ianlancetaylor
Change https://golang.org/cl/167381 mentions this issue: time: fail when Parse is given a stdNumTZ as stdTZ
I would expect time.Parse to fail to parse the string, since it's suffixed with -0000 rather than an alphabetical time zone designator.
This expectation is not correct. Not every timezone has a three-letter name, so it was decided to allow numeric timezone names (like +07
) to match MST
. This was implemented in commit 9f2c611fa7cccfacd317b88eff14d3ba64e855ff. This allows Parse
to keep working when the format specifier ask for a timezone names (using MST
), but the function is given a correct time string for a timezone with no numeric name.
This expectation is not correct. Not every timezone has a three-letter name, so it was decided to allow numeric timezone names (like
+07
) to matchMST
.
Yes, timezones of +07
or GMT+3
should be parsed correctly as a token of type stdTZ
, but -0000
is a stdNumTZ
and shouldn't be parsed as a standin for the MST
token.
Yes, timezones of +07 or GMT+3 should be parsed as a token of type stdTZ, but -0000 is a stdNumTZ
But how do you differentiate between a numeric timezone name and an stdNumTZ
? What's the rule?
In the layout
string, a value of "MST"
encodes to a stdTZ
(which accepts values like "MST"
, "PDT"
, "GMT+3"
, "+07"
.
A value of "-0700"
in the layout string encodes to a stdNumTZ
, which should be in a form that looks like "+0000"
, "-0700"
, etc.
This behavior is already part of time.Parse
. This issue and the associated pull request just fixes the case where -0000
is parsed as a stdTZ
(since 0
is between -23 and 23), but -0700
is not (since 700
is greater than 23
). In fact, neither of those values are acceptable values for a stdTZ
. This issue and pull request do not change the behavior of string parsing where the layout string contains -0700
(signaling that a stdNumTZ
is expected).
but -0700 is not (since 700 is greater than 23)
My point is that this is wrong. It's also the current behaviour, but now your CL is aligning Parse, for 0000
, to a behaviour that is probably not correct. In fact, I argue that both -0000
and -0700
are valid numeric timezones names. Currently, Parse rejects the second, and after your patch it'll also reject the first. But the correct behaviour is to accept both, since the criteria "let's parse NNNN as an integer and compare that with 23" is not correct.
They are valid numeric timezone names, but not valid non-numeric timezone names. If you use -0700
in your layout string for Parse
, both -0000
and -0700
are accepted as you would expect. Parse
recognizes a difference between stdTZ
and stdNumTZ
in the layout string ("MST"
vs "-0700"
).
Are you suggesting that they should be interchangeable? Because that also solves this bug, but seems to me to be non-obvious behavior. My initial reaction is that I would be surprised if I was able to hand a non-numeric timezone designator to a function that uses a layout string specifying a 4-digit offset. Otherwise, why recognize a difference between these two types in the layout string?
Here's how I understand it:
MST
should cause Parse
to eat anything that could potentially be an abbreviated timezone name. This includes the following: EST
, EEST
, +07
, -05
, +00
, +0000
, +0500
, +0330
.
-0700
should cause Parse
to exclusively accept numeric timezones.
This is how I read the documentation, where it says:
stdTZ // "MST"
...
stdNumTZ // "-0700" // always numeric
so no, I don't think they are interchangeable, since with -0700
Parse
would reject e.g. EEST
.
Ah, I understand where you're coming from. That would be a fine solution as well, as long as behavior is consistent when parsing strings that are suffixed with both -0000
and -0700
.
/cc @robpike
// parseSignedOffset parses a signed timezone offset (e.g. "+03" or "-04"). // The function checks for a signed number in the range -23 through +23 excluding zero. // Returns length of the found offset string or 0 otherwise
This function is expecting "+03" and accepting "-0000" (unexpectedly!) but rejecting "-0700" (because 700 is well out of range). It's unclear what the right answer is - should parseSignedOffset allow 4-digit offsets in order to deal with fractional-hour time zones?
should parseSignedOffset allow 4-digit offsets in order to deal with fractional-hour time zones?
This will be necessary anyway to fix #26032. For example, if you zdump Asia/Kabul
you get:
Sun Aug 5 16:25:10 2018 +0430
which time.Parse
mistakenly rejects because it doesn't like the +0430
part.
What are the next steps required for this?
@nickmooney We need to decide on the right thing to do.
Gotcha, @ianlancetaylor. I haven't contributed to Go before, and I'm not super familiar with who is in charge of making those decisions. It seems like we have two options -- is there a group of people who are the "right people" to make that decision?
Yes, there is a group of people who periodically review NeedsDecision
issues.
In this case it would be interesting to hear whether @robpike has an opinion.
Perhaps parseSignedOffset should allow 4-digit offsets. I'm not sure.
What version of Go are you using (
go version
)?Local machine:
Docker container:
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
Go Playground example here.
I attempted to use the
time.Parse
format string"Mon, 2 Jan 2006 15:04:05 MST"
to parse the date"Tue, 12 Mar 2019 15:34:39 -0000"
.What did you expect to see?
I would expect
time.Parse
to fail to parse the string, since it's suffixed with-0000
rather than an alphabetical time zone designator.This behaves correctly when I try to parse a date with a non-zero offset, such as
"Tue, 12 Mar 2019 15:34:39 -0700"
(i.e. parsing fails).What did you see instead?
time.Parse
successfully parses date strings with TZ offsets of zero given a format string that ends withMST
, when it should likely fail if there is no alphabetical timezone designator.