doctrine / lexer

Base library for a lexer that can be used in Top-Down, Recursive Descent Parsers.
https://www.doctrine-project.org/projects/lexer.html
MIT License

Improve parsing of multi-byte strings #36

Closed Defender32 closed 4 years ago

Defender32 commented 4 years ago

Issue: https://github.com/doctrine/annotations/issues/265
Pull request: https://github.com/doctrine/annotations/pull/290

The Lexer class does not parse multi-byte strings correctly.

alcaeus commented 4 years ago

@Defender32 to me this looks good - is anyone in @doctrine/doctrinecore aware of any BC breaks that this can lead to?

Not aware of any that we care about. With the u modifier, \w will also match Unicode characters, which it previously didn't. People who were relying on \w not matching these will have their code broken, but on the other hand, whoever is creative enough to exploit this behaviour should be creative enough to deal with the breakage induced by fixing it.
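To make the behaviour change concrete, here is a rough sketch in Python rather than PHP, so it is an analogy and not the Lexer's actual code. It assumes that matching a bytes pattern against UTF-8 bytes behaves roughly like a PCRE pattern without u (only ASCII word characters match \w), and that matching a str pattern behaves roughly like a pattern with u. The function names are hypothetical, and PHP's byte-mode result can also depend on the active locale.

```python
import re


def tokenize_bytes(text):
    """Byte-oriented matching (assumed analogue of a pattern without u).

    In a bytes pattern, \\w matches only ASCII letters, digits and underscore,
    so every byte of a multi-byte UTF-8 character acts as a separator.
    """
    raw = re.findall(rb"\w+", text.encode("utf-8"))
    # Every match is pure ASCII here, so decoding as ASCII cannot fail.
    return [chunk.decode("ascii") for chunk in raw]


def tokenize_unicode(text):
    """Code-point-oriented matching (assumed analogue of a pattern with u).

    In a str pattern, \\w matches Unicode word characters by default.
    """
    return re.findall(r"\w+", text)


# "héllo wörld", written with escapes so the accents are single code points.
sample = "h\u00e9llo w\u00f6rld"

print(tokenize_bytes(sample))    # words are split at each accented letter
print(tokenize_unicode(sample))  # words stay whole
```

Under those assumptions, the byte-oriented version breaks each word apart at the accented letter, while the Unicode-aware version keeps it as one token. That is the kind of difference a caller relying on the old \w behaviour would notice.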

guilhermeblanco commented 4 years ago

This seems like an OK change to me too. It should not impact any of our known derivatives, since we use "u" by default in all of them.

Merging.