Open Lysxia opened 7 years ago
I keep needing to do this kind of thing in my attoparsec parsers, and only on discovering this bug am I shaking the feeling that I'm somehow using attoparsec wrong to need to use parseOnly within a parser so frequently.
With me, it often comes up while writing something like Parser a -> Parser b, which needs to pick out the delimited data and run the sub-parser over it.
Checking endOfInput seems like one thing that can easily be gotten wrong when doing this. The example in #95 perhaps forgot to do that. I wonder if a combinator for this should require the sub-parser to consume all the input?
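To make the `endOfInput` concern concrete, here is a minimal sketch of the `Parser a -> Parser b` shape using only functions attoparsec exports today. The names `subParse` and `bracketed`, and the bracket-delimited format, are hypothetical and chosen only for illustration; they are not part of attoparsec.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Attoparsec.ByteString.Char8
  (Parser, char, decimal, endOfInput, parseOnly, sepBy, takeTill)
import Data.ByteString (ByteString)

-- | Hypothetical helper (not an attoparsec function): run a sub-parser over
-- an already-extracted chunk, and fail the outer parser unless the
-- sub-parser consumes the whole chunk.
subParse :: Parser a -> ByteString -> Parser a
subParse p chunk = either fail pure (parseOnly (p <* endOfInput) chunk)

-- | Hypothetical example format: the data sits between '[' and ']'.
-- The delimited bytes are picked out first, then handed to the sub-parser.
bracketed :: Parser a -> Parser a
bracketed p = do
  _ <- char '['
  chunk <- takeTill (== ']')
  _ <- char ']'
  subParse p chunk

-- | Example sub-parser use: a comma-separated list of integers in brackets.
intList :: Parser [Int]
intList = bracketed (decimal `sepBy` char ',')
```

Without the `<* endOfInput`, an input such as `[1,2,x]` would succeed with `[1,2]` and silently drop the trailing `,x`, which is the kind of mistake described above.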
A common situation is to parse an encoding prefixed by its length. So you first parse the length as an integer `n`, and then you would like to run a (sub)parser `p :: Parser a` only on the next `n` bytes. I could think of two solutions for users today:

1. Use `take` to get the `ByteString` and apply `parseOnly p`. However, we lose source position information in case the subparser fails, and we have to keep the whole `ByteString` in memory.
2. Wrap `Parser` (e.g., with a few monad transformers) to track things like the number of bytes read; that would allow combinators like the ones I have in mind. However, this is rather heavyweight to implement. Does an existing library already offer this? I also suspect this approach would have more overhead than necessary.

It would be nice for attoparsec to have combinators to delimit the input that a subparser gets to see, like `span` and `splitAt` in pipes-parse.

What do you think of such an addition? Is there a better solution?
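For reference, the first workaround (`take` followed by `parseOnly`) might look roughly like the sketch below. The names `isolate` and `lengthPrefixed` are hypothetical, and the wire format (an ASCII decimal length, a colon, then the payload) is an assumption made only so the example is runnable. It shares the drawbacks already mentioned: the chunk is fully materialised, and a failure inside the sub-parser is reported without its position in the outer input.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Attoparsec.ByteString as A
import qualified Data.Attoparsec.ByteString.Char8 as AC

-- | Hypothetical helper (not an attoparsec function): run @p@ over exactly
-- the next @n@ bytes, requiring it to consume all of them.
isolate :: Int -> A.Parser a -> A.Parser a
isolate n p = do
  chunk <- A.take n  -- the whole chunk is held in memory here
  either fail pure (A.parseOnly (p <* A.endOfInput) chunk)

-- | Assumed format for illustration: "<decimal length>:<payload>".
lengthPrefixed :: A.Parser a -> A.Parser a
lengthPrefixed p = do
  n <- AC.decimal
  _ <- AC.char ':'
  isolate n p
```

With this sketch, `lengthPrefixed A.takeByteString` applied to `5:helloworld` hands only `hello` to the sub-parser and leaves `world` for whatever parser runs next.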