golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
122.99k stars 17.54k forks source link

time: bugs in parsing time string with GMT as time zone. #40472

Open ghost opened 4 years ago

ghost commented 4 years ago

What version of Go are you using (go version)?

Go Playground, Go 1.14.6

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
Go Playground.

What did you do?

https://play.golang.org/p/UJyx6fMkbJW

var (
    tests = [][2]string{
        {"Tue Jun 11 2019 13:26:45 GMT+08", "Mon Jan 02 2006 15:04:05 MST-07"},
        {"Tue Jun 11 2019 13:26:45 GMT+0800", "Mon Jan 02 2006 15:04:05 MST-0700"},
        {"Tue Jun 11 2019 13:26:45 GMT+0000", "Mon Jan 02 2006 15:04:05 MST-0700"},
        {"Tue Jun 11 2019 13:26:45 MST+0000", "Mon Jan 02 2006 15:04:05 MST-0700"},
        {"Tue Jun 11 2019 13:26:45 MST+08", "Mon Jan 02 2006 15:04:05 MST-07"},
    }
)

func demo() {
    for _, vl := range tests {
        value, layout := vl[0], vl[1]
        _, err := time.Parse(layout, value)
        fmt.Printf("time.Parse: %q - error: %v\n", value, err)
    }
    fmt.Println("--------------")
}

What did you expect to see?

All tests passed without error.

What did you see instead?

The GMT ones except "GMT+0800" failed with cannot parse "" as "-0700" or cannot parse "" as "-07".

Details:

When parsing time zones, the Parse function calls parseTimeZone, which makes a special case for parsing GMT time zone (which I don't understand why), calling and returning results from parseGMT, as bytes to consume by Parse function.

https://github.com/golang/go/blob/85afa2eb190d5d1a06584803bde4b4ee9b0e79b0/src/time/format.go#L1047-L1059

https://github.com/golang/go/blob/85afa2eb190d5d1a06584803bde4b4ee9b0e79b0/src/time/format.go#L1209-L1212

The parseGMT function tries to return more than 3 bytes, which is the "GMT" string, as in other time zone format. You can see in the playground link, that it returns 8 bytes for "GMT+0000", and leaving nothing to match with "-0700" in the layout string. https://github.com/golang/go/blob/85afa2eb190d5d1a06584803bde4b4ee9b0e79b0/src/time/format.go#L1247-L1280

Interestingly (and confusingly) it does so by checking whether the following string is a signed integer which is in range from -23 to +23, excluding leading zeroes - but it considers "+0800" as 800, which makes it invalid, so parseGMT returns 3, which makes Parse functions without an error.

However, all other values showed in the tests ("+0000", "+08") are considered valid, and making the Parse function consumes too much.

ulikunitz commented 4 years ago

The layout MST matches GMT-3, GMT-03 or GMT-003. GMT-03 is the time zone with a difference in hours from GMT.

The layout MST-0700 requests two time zones and according to the documentation the numeric value has precedence. So your examples would be correct with GMT+08+0800 for layout MST-0700.

I fixed your examples in https://play.golang.org/p/ZxqWuhnEI28

ghost commented 4 years ago

@ulikunitz I was unaware of "MST" and "-0700" is redundant, that part of code makes much more sense now.

However, this behaviour of GMT format is really confusing and works in an unintended way most of the time, (as "+/-0X00" where X is not zero is dominant (mis-)use-case). And I am not sure if any user would expect it to work that way. (I am not sure about what to expect either, for I rarely touch time zones.)

I think at least we need a documentation of how GMT format is parsed.

EDIT:

And maybe make a vet warning about cases of "GMT+0800"? Although the parsed value is "correct", time.Time.Zone() will return "GMT" instead of "GMT+08", which is a difference.

ulikunitz commented 4 years ago

I believe the author of the time package must decide, whether he wants to document the GMT behavior.

It should also be noted that Go doesn't handle time zone abbreviations unless it is the abbreviation of the local time zone or the location provided to ParseInLocation. If that is not the case the time is always interpreted as UTC. Note that GMT+X is actually handled correctly because it gives a difference to UTC:

So numerical offsets should always be preferred over time zone abbreviations as in RFC3339 / ISO 8601, which is the official International Standard to write times.

In general the term UTC (Coordinated Universal Time) should be used instead of GMT, because GMT is ambiguous. GMT is used in navigation for UT0 and not UTC and astronomers calculated GMT starting at noon in the past. UTC is comparison is precisely defined.

agnivade commented 4 years ago

/cc @robpike @ianlancetaylor for time.