elm / parser

A parsing library, focused on simplicity and great error messages
https://package.elm-lang.org/packages/elm/parser/latest
BSD 3-Clause "New" or "Revised" License

Omissions in comparison with prior work #31

Closed. zenhack closed this issue 5 years ago

zenhack commented 5 years ago

The "comparison with prior work" section in the docs starts off with:

I have not seen the parser pipeline or the context stack ideas in other libraries, but backtracking relate to prior work.

But it seems like there's actually a lot of prior work. (|=) and (|.) seem to have exactly the same semantics as Haskell's (<*>) and (<*). To translate the example into Haskell and Parsec:

import Text.ParserCombinators.Parsec

data Point = Point
    { x :: Float
    , y :: Float
    }
    deriving(Show)

point :: Parser Point
point =
    pure Point
        <* string "("
        <* spaces
        <*> float
        <* spaces
        <* string ","
        <* spaces
        <*> float
        <* spaces
        <* string ")"

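-- Not part of the original example, but I wanted this to be runnable.
-- many1 (rather than many) requires at least one digit on each side of
-- the '.', so inputs such as "." or "1." are rejected by the parser
-- instead of crashing inside read.
float :: Parser Float
float =
    fmap read
        ( pure (++)
            <*> many1 digit
            <*> choice
                [ pure (:)
                    <*> char '.'
                    <*> many1 digit
                , pure ""
                ]
        )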
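
To make the claimed correspondence concrete, here is a minimal sketch that gives Haskell's (<*>) and (<*) Elm-style names and rewrites the point parser with them. The aliases (|=) and (|.), their fixity, and the helper name number are assumptions made for this illustration; none of them is part of Parsec, and the fixity is simply copied from (<*>) and (<*) rather than taken from elm/parser.

```haskell
import Text.ParserCombinators.Parsec

-- Hypothetical aliases (not provided by Parsec): Elm-style names for the
-- Applicative operators. Both get the fixity of (<*>) and (<*).
infixl 4 |=, |.

-- Keep the result of the parser on the right and feed it to the function.
(|=) :: Parser (a -> b) -> Parser a -> Parser b
(|=) = (<*>)

-- Run the parser on the right but discard its result.
(|.) :: Parser keep -> Parser ignore -> Parser keep
(|.) = (<*)

data Point = Point Float Float
    deriving (Show, Eq)

-- Illustrative number parser: digits, optionally followed by ".digits".
number :: Parser Float
number =
    read <$> ((++) <$> many1 digit <*> option "" ((:) <$> char '.' <*> many1 digit))

point :: Parser Point
point =
    pure Point
        |. string "("
        |. spaces
        |= number
        |. spaces
        |. string ","
        |. spaces
        |= number
        |. spaces
        |. string ")"
```

Running parse point "" "( 3 , 4.5 )" should yield a Point built from the two numbers; the parser is the same as the one above, with only the operator names changed.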

Parsec also defines an operator for adding context information:

https://hackage.haskell.org/package/parsec-3.1.13.0/docs/Text-ParserCombinators-Parsec-Prim.html#v:-60--63--62-

...it looks like Elm's version of this is much more sophisticated, but it might be worth discussing how it improves on things like <?>.

evancz commented 5 years ago

I have never seen that style used in any code written before this library was published. Nor have I seen that style promoted anywhere before this library was published. I think a link to parsec docs promoting this style would be an example of prior work.

zenhack commented 5 years ago

I would probably use f <$> ... instead of pure f ..., but this goes way back; Real World Haskell was talking about using applicatives like this in 2008 (the interesting bits start at "Applicative functors for parsing"):

http://book.realworldhaskell.org/read/using-parsec.html

That's probably pretty close to where the style originated; there's this at the end of the section:

As we write this book, applicative functors are still quite new to Haskell, and people are only beginning to explore the possible uses for them beyond the realm of parsing.

And of course RWH was a standard learning resource before it succumbed to bitrot.

There is of course the usual Elm/Haskell split on how to go about indenting code, and Haskell code tends to be much more irregular, mixing in plenty of other operators, but besides the code formatting choices, this is pretty mainstream. I write code like this all the time.
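
For readers unfamiliar with the f <$> ... spelling, here is a rough sketch of how the point parser might look in that form. The names pointFmap and number are made up for this example, and the grouping with (*>) is just one of several ways to write it.

```haskell
import Text.ParserCombinators.Parsec

data Point = Point Float Float
    deriving (Show, Eq)

-- Illustrative number parser: digits, optionally followed by ".digits".
number :: Parser Float
number =
    read <$> ((++) <$> many1 digit <*> option "" ((:) <$> char '.' <*> many1 digit))

-- "f <$> ..." spelling: each parenthesised group yields one argument of
-- Point; (*>) drops what is on its left, (<*) drops what is on its right.
pointFmap :: Parser Point
pointFmap =
    Point
        <$> (char '(' *> spaces *> number <* spaces)
        <*> (char ',' *> spaces *> number <* spaces <* char ')')
```

This accepts the same inputs as the pure Point version; the difference is purely one of layout, since the constructor is applied with (<$>) rather than lifted with pure.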

evancz commented 5 years ago

I agree that the types have been written down before, but I do not agree that means they invented the pipeline style. The resource you provide obviously does not use the pipeline style, and the point I'm trying to make is that "people did not write code in this style before." I think the fact that people wrote down the types before but still didn't do things in this style before is evidence in favor of that argument.

The <?> operator is a way to make the parser fail. Instead of saying "it seems like one of these N things happened" you can say "expecting X". This is not related to adding context at all. It is just about tweaking the specific message when you reach a failure.

I think it's fair to say that I thought of |= and |. having previously seen the Applicative class and such, but I don't think the document is claiming that I invented applicatives. It just says "I have not seen the parser pipeline ... in other libraries." Maybe adding the word "style" makes it less ambiguous, but I would like to wait and see if independent folks share the same concern as you before changing things.

zenhack commented 5 years ago

Re: <?>, see this paragraph from the doc:

Once you get comfortable with the Parser module, you can switch over to Parser.Advanced and use inContext to track exactly what your parser thinks it is doing at the moment. You can let the parser know “I am trying to parse a "list" right now” so if an error happens anywhere in that context, you get the hand annotation!

(Emphasis mine). The emphasized bit seems like exactly what <?> does, though it's obvious that this library's facilities are richer and extend beyond that. Re: "The <?> operator is a way to make the parser fail", I'm not sure what you mean? <?> only affects the messages, not whether or not the parser fails.
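
A small sketch of that last point, assuming Parsec 3: the same parser with and without a <?> label accepts and rejects the same inputs, and only the text of the error differs. The names plain, labelled and render are invented for this example.

```haskell
import Text.ParserCombinators.Parsec

-- One or more digits, with Parsec's default "expecting" text.
plain :: Parser String
plain = many1 digit

-- The same parser with a custom label attached via <?>.
labelled :: Parser String
labelled = many1 digit <?> "a number"

-- Render a parse result as text: the parsed value on success,
-- the shown ParseError on failure.
render :: Parser String -> String -> String
render p input = either show id (parse p "" input)
```

Both parsers succeed on "42" and both fail on "abc"; comparing render plain "abc" with render labelled "abc" shows that only the "expecting" part of the message changes.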

I brought up the bit about applicatives originally because I found the lack of mention genuinely confusing -- I spent a long time staring at it trying to understand what difference I was missing about the semantics; stating that the difference was a matter of style would have made it much clearer to me (especially since I tend to gloss over that when comparing Haskell vs. Elm libraries; norms around indentation are so different in general).

...but I won't belabor the point anymore.