djspiewak / parseback

A Scala implementation of parsing with derivatives
http://parseback.io
Apache License 2.0

Outputs of the Direct Grammar in the README are misleading #36

Open yetanotherion opened 6 years ago

yetanotherion commented 6 years ago

When I execute the Direct Grammar in a worksheet,

import parseback._              // Parser, Whitespace, LineStream
import parseback.compat.cats._
import cats.Eval

implicit val W = Whitespace("""\s+"""r)

lazy val expr: Parser[Int] = (
  expr ~ "+" ~ term ^^ { (_, e, _, t) => e + t }
    | expr ~ "-" ~ term ^^ { (_, e, _, t) => e - t }
    | term
  )

lazy val term: Parser[Int] = (
  term ~ "*" ~ factor ^^ { (_, e, _, f) => e * f }
    | term ~ "/" ~ factor ^^ { (_, e, _, f) => e / f }
    | factor
  )

lazy val factor: Parser[Int] = (
  "(" ~> expr <~ ")"
    | "-" ~ expr  ^^ { (_, _, e) => -e }
    | """\d+""".r ^^ { (_, str) => str.toInt }
  )

one of the README's example inputs (among others),

expr(LineStream[Eval]("1 + 2")).value

does not parse and generates

res0: parseback.util.EitherSyntax.\/[List[parseback.ParseError],parseback.util.Catenable[Int]] = Left(List(UnexpectedCharacter(Line(1 + 2,0,0),Set(\s+))))

instead. The expected output is produced only when the input has whitespace at both the beginning and the end:

expr(LineStream[Eval](" 1 + 2 ")).value
res1: parseback.util.EitherSyntax.\/[List[parseback.ParseError],parseback.util.Catenable[Int]] = Right(Single(3))
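Until the underlying bug is fixed, the observation above suggests a stopgap: pad the input with a space on each side before handing it to the parser. A minimal sketch follows; `padded` is a hypothetical helper, not part of parseback, and it assumes single-line input like the examples here.

```scala
object WhitespaceWorkaround {
  // Hypothetical helper (not a parseback API): surrounds the input with one
  // space on each side, mirroring the " 1 + 2 " input that parsed above.
  def padded(input: String): String = s" $input "
}

// Intended use with the grammar above (requires parseback on the classpath):
//   expr(LineStream[Eval](WhitespaceWorkaround.padded("1 + 2"))).value
```

This only hides the symptom; the README examples would still fail as written.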

(I suspect this may be related to the following note in the README: "Note that parseback's whitespace handling is currently extremely naive. The only whitespace regular expressions which will behave appropriately are of the form .+, where . is 'any single character class'. Thus, \s+ is valid, as is [ \t]+, but //[^\n]*|\s+ is not. We hope to lift this restriction soon, but it requires some work on the algorithm." Is that the cause?)

djspiewak commented 6 years ago

Interesting! Definitely a bug. It's been a while, but I thought that whitespace handling was optional by definition. I'll look into it.

ljleb commented 4 years ago

Why not simply replace

implicit val W = Whitespace("""\s+"""r)

with

implicit val W = Whitespace("""\s*"""r)

? (That is, change + to * in the regex.)
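For reference, the difference between the two patterns can be checked with the standard library alone. The sketch below does not touch parseback; `WhitespaceRegexCheck` and `matchesFully` are illustrative names. It only shows that `\s*` accepts the empty string while `\s+` does not, which is what would make leading and trailing whitespace optional. Whether parseback handles `\s*` correctly is a separate question, since the README note quoted above says only patterns of the form `.+` behave appropriately.

```scala
import scala.util.matching.Regex

object WhitespaceRegexCheck {
  val oneOrMore: Regex  = """\s+""".r // requires at least one whitespace character
  val zeroOrMore: Regex = """\s*""".r // also accepts no whitespace at all

  // True when the regex matches the entire string, including the empty string.
  def matchesFully(regex: Regex, s: String): Boolean =
    regex.pattern.matcher(s).matches()
}

// Usage:
//   WhitespaceRegexCheck.matchesFully(WhitespaceRegexCheck.oneOrMore, "")   // false
//   WhitespaceRegexCheck.matchesFully(WhitespaceRegexCheck.zeroOrMore, "")  // true
```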