Strange issue related to the use of substring in ParserInput

kyouko-taiga commented 6 years ago

I discovered a strange issue that seems to be linked to the use of substring in ParserInput.

My apologies, I wasn't able to create a minimal example. But the issue can be observed by running the tests of my LogicKit library, with the latest changes merged in master. At line 44 of Parser.swift, I parse conjunction of terms, using infixedLeft(by:). Everything works fine with two terms, but if I attempt to parse three, the parser produces completely invalid results (In that particular example, an invalid value of the enum) that may even trigger runtime errors.

Using a String rather than a Substring for the property ParserInput.stream seems to solve the problem. Therefore, my guess is that something wrong is happening with the pointers of the substring when the parser backtracks. Unfortunately I don't have any evidence to support that claim, and I don't know how the current implementation differs from the previous one (which was based on character views).

Any idea on the source of the problem? Also, I understand some concerns were raised about the use of strings in ParserInput, as they may involve a lot of unnecessary copies. But maybe there is an intermediate solution that could rely on Swift's default copy-on-write behavior since strings are value-types? Maybe some benchmarks could shed some light on the issue.

rxwei commented 6 years ago

It's highly unlikely that it's a problem with the usage of Substring. What's your string input when you are parsing two terms? And what is it when you are parsing three?

kyouko-taiga commented 6 years ago

Copied as is directly from my tests (actually the syntax is identical to Swift's):

For two terms, the input is .var("x") && .var("y").
For three, the input is .var("x") && .var("y") && .var("z").

Parsing a single variable expression (i.e. .var("some string")) is tested successfully.

rxwei / Parsey

Strange issue related to the use of substring in ParserInput #9