haskell / attoparsec

A fast Haskell library for parsing ByteStrings
http://hackage.haskell.org/package/attoparsec

`decimal` fails at end of input #121

Closed TomMD closed 8 years ago

TomMD commented 8 years ago

This seems so very likely to happen to anyone that I suspect I'm going to be told it is "not a bug". If that happens, I'm prepared to argue that such behavior should indeed be considered a bug or misfeature and really needs to be documented.

*Main> eitherResult (parse decimal "12")
Left "Result: incomplete input"
*Main> eitherResult (parse decimal "12 ")
Right 12

Ouch!

oh, and:

attoparsec-0.13.0.1

bgamari commented 8 years ago

Hmmm, this is quite unfortunate indeed. What do you propose we say in the documentation?

It should be noted that the result you observe is due to the fact that you used parse, which allows for further input to be provided, and then retrieved the Result with eitherResult, which treats partial parses as errors. If you had used parseOnly, or fed the parser the usual end-of-input empty chunk, you would have observed the result that you expect:

λ> parseOnly decimal "12"
Right 12
λ> eitherResult $ feed (parse decimal "12") mempty
Right 12
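
For anyone who wants to reproduce the three behaviours outside GHCi, here is a minimal sketch as a standalone program. The helper names (partialRun, oneShot, fedRun) are made up for illustration and are not part of attoparsec; the expected results in the comments are the ones reported in this thread for attoparsec-0.13.0.1.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Data.Attoparsec.ByteString.Char8
  (decimal, eitherResult, feed, parse, parseOnly)
import Data.ByteString (ByteString)

-- Incremental parse with no end-of-input signal. More digits could still
-- arrive in a later chunk, so "12" leaves the parser in a Partial state,
-- which eitherResult reports as an error.
partialRun :: ByteString -> Either String Int
partialRun = eitherResult . parse decimal

-- parseOnly treats the given input as complete, so no Partial can remain.
oneShot :: ByteString -> Either String Int
oneShot = parseOnly decimal

-- Incremental parse, then an empty chunk to signal end of input.
fedRun :: ByteString -> Either String Int
fedRun input = eitherResult (feed (parse decimal input) mempty)

main :: IO ()
main = do
  print (partialRun "12")  -- reported above: Left "Result: incomplete input"
  print (oneShot "12")     -- reported above: Right 12
  print (fedRun "12")      -- reported above: Right 12
```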

TomMD commented 8 years ago

@bgamari Ah, I missed that. While parse is sparsely documented, the documentation for {maybe,either}Result does include that statement.

So now I'm in the awkward situation where I simply disagree with the default choice that maybeResult/eitherResult should treat partial parses as failures instead of feeding the parser an empty chunk first. Is there some deeper reason behind preferring failure on the partial parse? It just seems awkward that there exists no simple (one or two function applications) solution to parse a value that might or might not consume the entire input.

bgamari commented 8 years ago

@TomMD, what is wrong with parseOnly? I believe it is intended for precisely this case.

I think the rationale here was that eitherResult and maybeResult should really just be projections of an existing Result. They should not do any further parsing that the user did not request. That being said, it's not entirely clear to me whether feeding the parser an empty chunk should actually be considered "further parsing".
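
To make the "projection" distinction concrete, here is a rough sketch of the two designs being compared. Both functions are hypothetical and written only for illustration: projectOnly is not attoparsec's actual eitherResult (its error strings are simplified), and projectAfterEnd is the alternative TomMD describes, not something the library provides.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Data.Attoparsec.ByteString.Char8
  (IResult (..), Result, decimal, feed, parse)

-- Pure projection: inspect the Result as it stands and run no more parsing.
-- A Partial can only be reported as "not finished".
projectOnly :: Result a -> Either String a
projectOnly (Done _ value)  = Right value
projectOnly (Fail _ _ msg)  = Left msg
projectOnly (Partial _)     = Left "incomplete input"

-- Alternative: signal end of input with an empty chunk first, then project.
-- This resumes the parser, which is the "further parsing" in question.
projectAfterEnd :: Result a -> Either String a
projectAfterEnd result = projectOnly (feed result mempty)

main :: IO ()
main = do
  let pending = parse decimal "12" :: Result Int
  print (projectOnly pending)      -- a Left: the parser is still waiting
  print (projectAfterEnd pending)  -- Right 12
```

Whether the second form belongs inside eitherResult is exactly the open design question; composing feed and eitherResult by hand keeps the end-of-input step explicit.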

TomMD commented 8 years ago

@bgamari Sorry about the noise. I had a misconception in my mind that parseOnly sequenced endOfInput - requiring the full string to be parsed. Not sure how that misconception came into being, but that drastically changed my view of its use and is responsible for this perceived API 'hole'.
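
For later readers with the same misconception, a small sketch of the difference: parseOnly on its own accepts trailing input, and requiring the whole string to be consumed means sequencing endOfInput explicitly. The names lenient and strict are made up for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Data.Attoparsec.ByteString.Char8 (decimal, endOfInput, parseOnly)
import Data.ByteString (ByteString)
import Data.Either (isLeft)

-- parseOnly alone: succeeds as soon as decimal succeeds and ignores
-- whatever input is left over.
lenient :: ByteString -> Either String Int
lenient = parseOnly decimal

-- Explicitly require that nothing follows the number.
strict :: ByteString -> Either String Int
strict = parseOnly (decimal <* endOfInput)

main :: IO ()
main = do
  print (lenient "12")               -- Right 12
  print (lenient "12 apples")        -- Right 12, trailing input ignored
  print (strict "12")                -- Right 12
  print (isLeft (strict "12 apples")) -- True, leftover input is rejected
```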