brendanhay / amazonka

A comprehensive Amazon Web Services SDK for Haskell.
https://amazonka.brendanhay.nz
Other
599 stars 227 forks source link

Serialize error: Failure parsing Time from value #866

Closed hasufell closed 1 year ago

hasufell commented 1 year ago
cabal-cache: SerializeError (SerializeError' {_serializeAbbrev = Abbrev "S3", _serializeStatus = Status {statusCode = 200, statusMessage = ""}, _serializeBody = Nothing, _serializeMessage = "Failed reading: Failure parsing Time from value: \"Thu, 15 Dec 2022 08:43:15 PST\""})

https://cirrus-ci.com/task/6316388051058688?logs=main#L1389

It seems it can't parse Thu, 15 Dec 2022 08:43:15 PST.

Code seems to be here:

https://github.com/brendanhay/amazonka/blob/cfe2584aef0b03c86650372d362c74f237925d8c/lib/amazonka-core/src/Amazonka/Data/Time.hs#L105-L128

First discovered in cabal-cache: https://github.com/haskell-works/cabal-cache/issues/207

hasufell commented 1 year ago

Haskell IRC said this format can be parsed by a RFC5322 compliant parser.

endgame commented 1 year ago

Can you revert https://github.com/brendanhay/amazonka/pull/855/commits/2b0f51cff95c05e60aaa7f2c6ab7b2456ebfaae3 and see if that works for you? Then we'll know whether it's a regression or another new format which needs support.

endgame commented 1 year ago

Disregard that request; looks like amazonka's RFC822 parser expects "GMT" and no other TZ. I'll see what I can do.

hasufell commented 1 year ago

The problem with country timezone is that it's not a static UTC offset and needs tzdata or ultimately IO to look up correctly.

endgame commented 1 year ago

That's not a problem here because RFC 822/2822/5322 only define a very short list of timezones, which is hardcoded into the time library.

hasufell commented 1 year ago

That's not a problem here because RFC 822/2822/5322 only define a very short list of timezones, which is hardcoded into the time library.

It's not about the list of timezones, but that the UTC offset is mutable. E.g. north korea changed their offset from 8.5 to 9.

endgame commented 1 year ago

Please spell it out for me, because I'm not following you. #868 does not go to tzdata or anything like that, because the set of time zones supported by RFC 822 are the only ones we care about, and they do not change. The S3 model, and apparently no others, specifically says that the ResponseExpires shape is a timestamp formatted according to RFC 822:

https://github.com/boto/botocore/blob/9e9d4831d2f28a2a338406a8b16608976e034761/botocore/data/s3/2006-03-01/service-2.json#L9424-L9427

PR #868 uses the %Z format specifier in the time library, which uses a list of zone names and offsets from the time locale. Because we use defaultTimeLocale, it is the following small fixed list of timezones:

https://github.com/haskell/time/blob/1a52b804d7089eca195003b1d2847b459dd112d8/lib/Data/Time/Format/Locale.hs#L63-L74

These are the only time zone names and offsets enumerated in RFC 822, and because they were moved to obs-zone back in RFC 2822, I do not expect them to change:

 zone        =  "UT"  / "GMT"                ; Universal Time
                                             ; North American : UT
             /  "EST" / "EDT"                ;  Eastern:  - 5/ - 4
             /  "CST" / "CDT"                ;  Central:  - 6/ - 5
             /  "MST" / "MDT"                ;  Mountain: - 7/ - 6
             /  "PST" / "PDT"                ;  Pacific:  - 8/ - 7
             /  1ALPHA                       ; Military: Z = UT;
                                             ;  A:-1; (J not used)
                                             ;  M:-12; N:+1; Y:+12
             / ( ("+" / "-") 4DIGIT )        ; Local differential
                                             ;  hours+min. (HHMM)
hasufell commented 1 year ago

Well, if the RFC defines the offset, we can call it a day, I guess!

endgame commented 1 year ago

Great. Would you be able to test #868 and report back?

endgame commented 1 year ago

I double-checked botocore, and the only field that uses rfc822 serialisation for a timestamp is the ResponseExpires field of a GetObjectRequest. Which means that the datum that's causing the parse failure probably came from calling code rather than an AWS response, and explains why TZ parsing hadn't been an issue for the past ~8 years.

868 is probably correct and I think I'll leave it in, but it means that if you need a workaround because advancing your git pin is hard for some other reason, you might be able to check the timezones on times you assemble into GetObjectRequest.