Genivia / RE-flex

A high-performance C++ regex library and lexical analyzer generator with Unicode support. Extends Flex++ with Unicode support, indent/dedent anchors, lazy quantifiers, functions for lex and syntax error reporting and more. Seamlessly integrates with Bison and other parsers.
https://www.genivia.com/doc/reflex/html
BSD 3-Clause "New" or "Revised" License

unicode character classes not recognized #172

GorillaSapiens closed this issue 1 year ago

GorillaSapiens commented 1 year ago

../reflex/RE-flex/src/reflex -yy --unicode --header-file=lex.yy.hpp polybasic.l
polybasic.l:77: error: malformed regular expression or unsupported syntax error at position 13
[\p{Letter}\p{MiscellaneousSymbols}\p{Emoticons}\p{SupplementalSymbolsandPictog
___invalid character class

Similar problems occur with \p{Emoticons}, \p{SupplementalSymbolsandPictographs}, etc. I specified %option unicode as well as --unicode on the command line. What am I doing wrong?

GorillaSapiens commented 1 year ago

Looks like the forum mangled the error. \p{MiscellaneousSymbols} is what's causing the problem here.

GorillaSapiens commented 1 year ago

Aha, it seems you need \p{IsMiscellaneousSymbols} to make it work. The documentation should probably be updated to make this clearer. It wasn't very intuitive for me as an experienced programmer but first-time user of this software.

genivia-inc commented 1 year ago

Ah, I think you're right about the unnecessary confusion. A Unicode block has to be written as \p{IsBlockName}, but the table lists the block names without the Is prefix. Will fix that.

genivia-inc commented 1 year ago

Online manual is updated.
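
For later readers, the fix found in this thread can be sketched as a small lexer specification. This is only an illustration, not the original polybasic.l: the rule layout and the empty actions are assumptions. Only \p{IsMiscellaneousSymbols} was confirmed to work above. The other block names from the original pattern presumably need the same Is prefix (for example \p{IsEmoticons}), but that was not verified in this thread.

```
%option unicode

%%

[\p{Letter}\p{IsMiscellaneousSymbols}]+    { /* hypothetical action: handle the matched token */ }
.|\n                                       { /* ignore everything else in this sketch */ }

%%
```

General categories such as \p{Letter} are written without a prefix, while Unicode blocks take the Is prefix. That is why the error was reported at position 13, where the second class in the bracket list begins, and not at \p{Letter}.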