97jaz / gregor

Date and time library for Racket
45 stars 10 forks source link

parse-date fails to parse "191115" using pattern "yyMMdd" #35

Closed evdubs closed 4 years ago

evdubs commented 4 years ago

Here's what I see in a fresh REPL session:

> (parse-date "191115" "yyMMdd")

; Unable to match pattern [MM] against input "1115"
; Context:
;  /home/evdubs/.racket/7.4/pkgs/gregor-lib/gregor/private/exn.rkt:24:0 raise-parse-error
;  /home/evdubs/.racket/7.4/pkgs/gregor-lib/gregor/private/parse.rkt:56:0 parse-date
;  /usr/share/racket/collects/racket/repl.rkt:11:26

I think this should be parse-able?

97jaz commented 4 years ago

You're right, and I'm surprised this hasn't come up before. The problem is that the code treats MM as a pattern requiring two or more digits, when it should require exactly two. And it looks like there are other pattern parsers that have this same problem.

Thanks for the report!

evdubs commented 4 years ago

I take it the fix doesn't require too much work?

97jaz commented 4 years ago

Nope. I believe the fix is simply to add a #:max 2 to the argument list on this line.

97jaz commented 4 years ago

Well, okay, that fix would be correct according to the spec at the time that I wrote Gregor, depending on how you read it. I suppose the "one or two" text is ambiguous.

The current version is clear that MM should require two digits.

~So a better fix would be to split that into two cases. Otherwise, MM could still match a single digit.~

Er, no. The original fix is fine, since I'm already using #:min n, so M would require 1-2 digits, and MM would require exactly two, without splitting these into separate cases.

evdubs commented 4 years ago

I can confirm that adding #:max 2 to month-parse / num-parse allows "191115" to be parsed correctly. Thanks for the very quick responses.

evdubs commented 4 years ago

Do you want me to create a PR or are there other things you're checking?

97jaz commented 4 years ago

I'm on it. There are a few other patterns that have the same problem, so I want to make sure I get all of them.

evdubs commented 4 years ago

If it helps, here's another edge case I've found:

> (parse-moment "20191110 17:19:17 HST" "yyyyMMdd HH:mm:ss z")
; Unable to match pattern [MM] against input " 17:19:17 HST" [,bt for context]

So I tried doing:

> (parse-moment "20191110 17:19:17 HST" "20yyMMdd HH:mm:ss z")
; match: no matching clause for (Zone ... 'offset-name 'short) [,bt for
;   context]

Maybe HST is not being associated with Pacific/Honolulu? Is there a way for me to add that?

97jaz commented 4 years ago

The MM error in your first example is the same problem as the current one.

However, Gregor won't parse HST, and that's a much bigger issue. See https://github.com/97jaz/gregor/issues/25

97jaz commented 4 years ago

It may be that the z (short form, specific non-location) values are unique, in which case it should be possible to add support. The complication is that non all zones have a form like this and z in those cases falls back to the short localized GMT format (e.g., "GMT-8").