Genivia / RE-flex

A high-performance C++ regex library and lexical analyzer generator with Unicode support. Extends Flex++ with Unicode support, indent/dedent anchors, lazy quantifiers, functions for lex and syntax error reporting and more. Seamlessly integrates with Bison and other parsers.
https://www.genivia.com/doc/reflex/html
BSD 3-Clause "New" or "Revised" License

\s matches \n ? #171

Closed tlemo closed 1 year ago

tlemo commented 1 year ago

Contrary to the documentation, it seems that \s matches \n (at least when used in a lexer specification). Is this intentional?

genivia-inc commented 1 year ago

In a lexer specification \s matches \n. This is the usual regex interpretation of \s.

There is a typo in the table in section "Character categories" (will fix that), but the other table in section "Unicode mode" is correct.

Ugrep makes an exception so that it behaves like grep. The RE/flex regex converter has a flag, reflex::convert_flag::notnewline, that overrides \s so that it does not match \n; ugrep uses this flag.
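
To illustrate the distinction, here is a minimal sketch that uses the standard library's std::regex (ECMAScript syntax) as a stand-in. It does not use the RE/flex API. The helper names matches_whole and space_without_newline are hypothetical, and the explicit character class only approximates the effect described for notnewline; it is an assumption, not how RE/flex implements the flag.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if `pattern` (std::regex ECMAScript syntax)
// matches the whole of `text`.
bool matches_whole(const std::string& pattern, const std::string& text) {
    return std::regex_match(text, std::regex(pattern));
}

// Hypothetical helper: a whitespace class that deliberately leaves out \n.
// This is an assumed approximation of a "\s without newline" behavior,
// not something taken from RE/flex.
const char* space_without_newline() {
    return "[ \\t\\r\\f\\v]";
}
```

With these helpers, matches_whole("\\s", "\n") is true, which is the usual interpretation of \s, while matches_whole(space_without_newline(), "\n") is false.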

genivia-inc commented 1 year ago

The online manual is updated.