Genivia / RE-flex

A high-performance C++ regex library and lexical analyzer generator with Unicode support. Extends Flex++ with Unicode support, indent/dedent anchors, lazy quantifiers, functions for lex and syntax error reporting and more. Seamlessly integrates with Bison and other parsers.
https://www.genivia.com/doc/reflex/html
BSD 3-Clause "New" or "Revised" License

Subscripts not working correctly? #194

Closed RuudRietvink closed 10 months ago

RuudRietvink commented 10 months ago

rational [0-9]*([⅒⅑⅛⅐⅙⅕¼⅓½⅖⅔⅜⅗¾⅘⅝⅚⅞]|([⁰¹²³⁴⁵⁶⁷⁸⁹]+\/[₀₁₂₃₄₅₆₇₈₉]+))

This tokenizes ³³/₄₅₆ as ³³/₄ instead of the full ³³/₄₅₆.

[EDIT] It works when I replace \/ with [/], so this likely has nothing to do with Unicode but with the / operator?

genivia-inc commented 10 months ago

The / operator in Lex, Flex, and RE/flex is called "trailing context" and is essentially a lookahead. In a rule of the form r/s, the pattern r matches only if the text that follows it matches s, but the text matched by s is not included in the match.
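
To illustrate the difference between a lookahead and a literal slash, here is a minimal sketch that uses `std::regex` from the C++ standard library rather than RE/flex itself. It is only an analogy: the ECMAScript lookahead `(?=...)` stands in for the lex trailing-context operator, plain ASCII digits stand in for the superscript and subscript digits, and the helper names `match_prefix`, `with_lookahead`, and `with_literal_slash` are hypothetical.

```cpp
#include <regex>
#include <string>

// Returns the text matched at the very start of `input`,
// or an empty string if the pattern does not match there.
std::string match_prefix(const std::string& pattern, const std::string& input) {
    std::regex re(pattern);  // ECMAScript grammar is the default
    std::smatch m;
    if (std::regex_search(input, m, re, std::regex_constants::match_continuous)) {
        return m.str(0);
    }
    return "";
}

// Lookahead: the digits must be followed by "/digits", but that
// following text is not part of the match. This plays the role of
// trailing context r/s in a lex specification.
std::string with_lookahead(const std::string& input) {
    return match_prefix("[0-9]+(?=/[0-9]+)", input);
}

// Literal slash: the slash and the digits after it are consumed
// as part of the match.
std::string with_literal_slash(const std::string& input) {
    return match_prefix("[0-9]+/[0-9]+", input);
}
```

For the input `33/456`, `with_lookahead` yields `33` because the lookahead text is left unconsumed, while `with_literal_slash` yields the whole `33/456`. If nothing follows the digits, as in `33`, the lookahead version does not match at all.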