igordejanovic / parglare

A pure Python LR/GLR parser - http://www.igordejanovic.net/parglare/
MIT License

possible regression in regexp handling #139

Closed tomga closed 2 years ago

tomga commented 2 years ago

Description

I have a grammar that was written for an earlier parglare, worked properly in older versions, and contains:

LINE_END: /[ ]*(#.*)?\n[ ]*/;

After upgrading to the current version (this also happens with 0.14), this line started to cause an error:

GrammarError: Error at None => Regex compile error in /[ ]*(#.*)?\n[ ]*/ (report: "missing ), unterminated subpattern at position 4"):

Escaping # in the regular expression fixed the problem for me. So now I have:

LINE_END: /[ ]*(\#.*)?\n[ ]*/;

that works on older and current versions.

I couldn't find any mention of this change in the release notes, so I can't tell whether it is an unintentional regression that you'd like to know about. It works for me now, so if this behavior is intended, please just close this issue. In any case, thank you for this great project.

igordejanovic commented 2 years ago

Thanks for reporting. Indeed, in version 0.14 I made the re.VERBOSE flag the default for regex matches and forgot to mention it in the CHANGELOG. The VERBOSE flag allows complex regexes to be written more readably, which improves maintainability. Basically, this means that whitespace is ignored and # is treated as the start of a line comment unless escaped, which explains the error above.
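
For readers who want to reproduce the effect outside parglare, here is a minimal sketch using only Python's standard re module. It assumes parglare hands the regex body to re with the VERBOSE flag, as described above; the names UNESCAPED, ESCAPED, and compiles are made up for this illustration. Note that in verbose mode whitespace inside a character class such as [ ] is still significant, which is why only the # needs escaping.

```python
import re

# The two regex bodies from this issue, as raw strings.
UNESCAPED = r"[ ]*(#.*)?\n[ ]*"
ESCAPED = r"[ ]*(\#.*)?\n[ ]*"


def compiles(pattern, flags=0):
    """Return True if the pattern compiles with the given flags (hypothetical helper)."""
    try:
        re.compile(pattern, flags)
        return True
    except re.error:
        return False


# Without VERBOSE the unescaped '#' is an ordinary character.
print(compiles(UNESCAPED))
# With VERBOSE everything from '#' onward is a comment, so the
# opening parenthesis is never closed and compilation fails.
print(compiles(UNESCAPED, re.VERBOSE))
# Escaping '#' keeps it literal in both modes.
print(compiles(ESCAPED, re.VERBOSE))

# The escaped pattern still matches trailing spaces, an optional
# comment, the newline, and leading spaces on the next line.
line_end = re.compile(ESCAPED, re.VERBOSE)
print(line_end.fullmatch("  # a comment\n  ") is not None)
```

Because the escaped form is valid both with and without VERBOSE, it is the safer spelling for grammars that need to work across parglare versions.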

I'll fix the CHANGELOG, add a regression test, and add a note to the docs.

Thanks again for contributing this report.

tomga commented 2 years ago

Great that it is not a regression - it's much easier to handle on your side.

On a side note, you might consider regenerating the documentation at https://www.igordejanovic.net/parglare/ as it has some broken links that I found were already fixed on the master branch on GitHub in March 2021.