haskell-github-trust / thyme

A faster date and time library based on time
BSD 3-Clause "New" or "Revised" License

readTime and timeParser are too greedy #5

Closed tel closed 11 years ago

tel commented 11 years ago

When parsing a non-delimited time expression like YYYYMMDD, formatTime emits it properly but readTime fails to parse it.

>>> z <- getCurrentTime
>>> formatTime defaultTimeLocale "%Y%m%d" z
"20130919"
>>> readTime defaultTimeLocale "%Y%m%d" "20130919"
*** Exception: not enough bytes

I believe this is because the underlying attoparsec parser is consuming the entire string while reading a number instead of limiting itself to just the first 4 digits.
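A minimal sketch of the suspected failure mode, written against base only rather than attoparsec so that it is self-contained. It is not thyme's actual parser; the names greedyNumber, fixedNumber and parseYmdWith are hypothetical and exist only for illustration.

```haskell
import Data.Char (isDigit)

-- Greedy: consume every leading digit, as an unbounded decimal parser would.
greedyNumber :: String -> Maybe (Int, String)
greedyNumber s = case span isDigit s of
  ([], _)    -> Nothing
  (ds, rest) -> Just (read ds, rest)

-- Fixed width: consume exactly n digits, or fail.
fixedNumber :: Int -> String -> Maybe (Int, String)
fixedNumber n s =
  let (ds, rest) = splitAt n s
  in if length ds == n && all isDigit ds
       then Just (read ds, rest)
       else Nothing

-- Parse year, month, day with a caller-supplied year parser.
-- Month and day are always two digits; all input must be consumed.
parseYmdWith :: (String -> Maybe (Int, String)) -> String -> Maybe (Int, Int, Int)
parseYmdWith year s0 = do
  (y, s1) <- year s0
  (m, s2) <- fixedNumber 2 s1
  (d, s3) <- fixedNumber 2 s2
  if null s3 then Just (y, m, d) else Nothing

-- parseYmdWith greedyNumber    "20130919"  ==  Nothing
-- parseYmdWith (fixedNumber 4) "20130919"  ==  Just (2013, 9, 19)
```

With the greedy year parser, all eight digits are read as the year and nothing is left for the month, which matches the "not enough bytes" symptom above.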

liyang commented 11 years ago

I'm aware of this. If you have control over the input, please use delimiters. (ISO-8601's YYYY-mm-dd is nice... there's even an XKCD strip about it.)

It was a conscious decision on my part to avoid any assumptions about the number of digits for the year (consider the years outside of [1000,9999]...). I'll see if I can make the parser backtrack in this case, although that will probably affect performance.
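One way to picture the backtracking option is the sketch below, which uses ReadP from base as a stand-in for attoparsec (an assumption made for self-containment; it is not how thyme implements or would implement this). The year keeps an unbounded digit count, and the parser tries every possible year length until the remaining input fits a two-digit month and day. The names twoDigits, ymd and parseYmd are hypothetical.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

-- Exactly two digits, read as a number.
twoDigits :: ReadP Int
twoDigits = read <$> count 2 (satisfy isDigit)

-- many1 is nondeterministic in ReadP: it offers every non-empty prefix of
-- digits as a candidate year, so no year width is assumed up front.
ymd :: ReadP (Integer, Int, Int)
ymd = do
  y <- read <$> many1 (satisfy isDigit)
  m <- twoDigits
  d <- twoDigits
  eof
  pure (y, m, d)

-- Succeed only when exactly one split of the input is consistent.
parseYmd :: String -> Maybe (Integer, Int, Int)
parseYmd s = case readP_to_S ymd s of
  [(r, "")] -> Just r
  _         -> Nothing

-- parseYmd "20130919"  ==  Just (2013, 9, 19)
-- parseYmd "1230919"   ==  Just (123, 9, 19)
```

Exploring every candidate year length is extra work compared with a single greedy pass, which is the kind of performance cost mentioned above.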

tel commented 11 years ago

Unfortunately no, I'm trying to parse someone else's dates. I do think it's fairly common to write non-delimited dates, despite the speed impact. Perhaps it'd be best to have a compliant parser by default (since the String-based routes will be slow regardless) but provide a fast timeParserFast with commentary that it requires delimited dates?

Joseph

On Sat, Sep 21, 2013 at 3:28 AM, Liyang HU notifications@github.com wrote:

I'm aware of this. If you have control over the input, please use delimiters. (ISO-8601's YYYY-mm-dd is nice... there's even an XKCD strip about it.)

It was a conscious decision on my part to avoid any assumptions about the number of digits for the year (consider the years outside of [1000,9999]...). I'll see if I can make the parser backtrack in this case, although that will probably affect performance.

Reply to this email directly or view it on GitHub: https://github.com/liyang/thyme/issues/5#issuecomment-24857680

tel commented 11 years ago

I really do wish people would just use ISO though.

Joseph
