typelevel / cats-parse

A parsing library for the cats ecosystem
https://typelevel.org/cats-parse/
MIT License
230 stars 53 forks source link

.string works incorrectly with some parsers #600

Closed p-pavel closed 1 month ago

p-pavel commented 1 month ago
  import cats.parse.*
  val p = (Parser.string("\"\"") *> Parser.pure('"')).rep0
  val str =  "\"\"\"\""
  assert(p.string.parseAll(str) == p.map(c => String(c.toArray)))
johnynek commented 1 month ago

I don't see the problem here.

breaking it down:

val p0 = Parser.string("\"\"")
val p1 = Parser.pure('"')
val p2 = p0 *> p1
val p = p2.rep0

okay, so p0 matches exactly two quotes. p1 does not consume any input and returns the character ". p2 matches exactly two quotes and returns 1 character which is ". p will match n = 0, 2, 4, ... quotes and return a list of List[Char] with length n and all entries equal to the character " (because recall pure returns that).

Now, you do p.string.parseAll(str) where str is 4 quotes, since p can match 4 quotes, you do parse everything, but recall what p.string does: it returns the input string that was matched by p, so in this case it will be the string of 4 quotes.

finally, you say:

assert(p.string.parseAll(str) == p.map(c => String(c.toArray)))

but this is ill typed. The left hand side returns an Either[Error, String] and the right hand side returns a Parser0[String] I think. So, I think this assertion should always fail.

Lastly, I want to point out this example is monstrously hard to follow without the aid of a REPL (as I have tried to do here and may have made mistakes) due to your choice of using \". For instance, I would expect if this failed, it would have also failed with the character a which would have been much easier to read and write.

Since the assertion is ill-typed maybe that was a typo on your part. Let's assume the right hand side was supposed to call parseAll as well (p.map(c => String(c.toArray)).parseAll(str)).

Now, by coincidence of the way you wrote this, that assertion should fail I think. Why? Because though p will parse the string of 4 quotes, it will return a List('"', '"') after having done so. When you convert that to an String via String(c.toArray) you will get a string of only two " symbols.

So, I think this assertion should fail for two reasons: 1. lack of parseAll on the right hand side, 2. even if it were there the left would be a string of 4 quote symbols but the right only 2.

I may be wrong, as I said I did this all without the benefit of the repl (I don't have the code checked out on this machine). If you think I'm wrong, can you be a bit more verbose to help me understand and make it clearer where you see an assertion that is failing, and why it should pass, and also possibly printing out the left and right sides of any failed equality?

p-pavel commented 1 month ago

I don't see the problem here.

You're right. I made to mistakes indeed:

So that's entirely my fault and I'm closing the issue.

johnynek commented 1 month ago

it your defense, the name .string isn't great. Maybe p.matchedString would have been a better name for it.

p-pavel commented 1 month ago

it your defense, the name .string isn't great. Maybe p.matchedString would have been a better name for it.

Probably yes, but scaladoc is there, waiting for somebody to read it :)

Thanks for your patience.

Actually, this is my first encounter with scala-parse (not monadic parser combinators in general), I was in hurry to parse some text, so I asked chatGPT to generate something, it made this mistake and I didn't check it.

Welcome to the brave new world powered by AI produced mistakes :)

The library seems great anyway.