weltling / parle

Parser and lexer for PHP

PCRE Compatibility #9

Closed BenHanson closed 6 years ago

BenHanson commented 6 years ago

lexertl does not support all of the PCRE syntax. The syntax it supports beyond flex is Unicode Character Classes and non-greedy repeats.
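To illustrate what a non-greedy repeat changes in a token rule, here is a minimal sketch. It uses Python's `re` module, not lexertl or Parle, and the sample input is made up for the example; lexertl's exact notation should be checked against its own documentation.

```python
import re

# Hypothetical input: two C-style comments with code between them.
text = "/* a */ x /* b */"

# Greedy repeat: `.*` extends to the LAST `*/`, swallowing everything between.
greedy = re.findall(r"/\*.*\*/", text)

# Non-greedy repeat: `.*?` stops at the FIRST `*/`, giving one match per comment.
lazy = re.findall(r"/\*.*?\*/", text)

print(greedy)
print(lazy)
```

The greedy rule yields a single match spanning both comments, while the non-greedy rule yields each comment separately, which is usually what a lexer rule for comments wants.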

weltling commented 6 years ago

Yeah, thanks for the note. Of course it's only a subset. I realized that already when trying to use a non-capturing match, but of course many other things are not supported either.

Unicode character classes seem not to be supported in 8-bit mode, but UTF-32 was fine. I've used only the char-based functionality so far, but supporting UTF-32 is of course not off the table. It would probably make sense to map another set of PHP classes to make use of the corresponding templates.
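As a rough analogy for the 8-bit versus UTF-32 difference, the sketch below uses Python's `re` module, not lexertl. It only shows that a non-ASCII character is several units when matched byte by byte but a single unit when matched as a code point; whether this is the actual reason for lexertl's behaviour is an assumption.

```python
import re

word = "é"  # U+00E9, a single code point

utf8 = word.encode("utf-8")        # 8-bit code units
utf32 = word.encode("utf-32-le")   # one 32-bit unit per code point

utf8_units = len(utf8)
utf32_units = len(utf32) // 4

# Matching on text: \w is Unicode-aware and sees one word character.
text_match = re.fullmatch(r"\w", word) is not None

# Matching on bytes: \w covers ASCII only, and the input is two bytes long.
byte_match = re.fullmatch(rb"\w", utf8) is not None

print(utf8_units, utf32_units, text_match, byte_match)
```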

I've mapped the options from http://www.benhanson.net/lexertl.html for the docs in the regex matching chapter here https://svn.php.net/viewvc/phpdoc/en/trunk/reference/parle/pattern.matching.xml?view=markup, excluding only the UTF-32 dependent cases. The manual is not online yet, but will be sometime next week. What I meant was rather "PCRE compatible syntax", but it's probably better not to mention PCRE at all. I've pushed a correction in c2fc9be5c3bd9966f1a56dca72cd0c03770adc24 to avoid the confusion for now and will mention the exact manual link later when it's online, so there will be one place with complete information.

Thanks.

remicollet commented 6 years ago

> will mention the exact manual link later when it's online

http://php.net/parle ;)

weltling commented 6 years ago

Oh, seems it appeared fast. It's probably built more frequently now; earlier it was only once a week. Anyway, added the link now, so it seems we're done :)

Thanks.