Genivia / RE-flex

A high-performance C++ regex library and lexical analyzer generator with Unicode support. Extends Flex++ with Unicode support, indent/dedent anchors, lazy quantifiers, functions for lex and syntax error reporting and more. Seamlessly integrates with Bison and other parsers.
https://www.genivia.com/doc/reflex/html
BSD 3-Clause "New" or "Revised" License

How to use the split method? #157

Closed dangdkhanh closed 1 year ago

dangdkhanh commented 1 year ago

Hi, is there any way to use matcher.split() to split on groups instead of the whole match?

Thank you.

genivia-inc commented 1 year ago

Grouping support in the Matcher class is limited to outer closing (...). You want to use the PerlMatcher to perform grouping. Or what is your question, exactly?

dangdkhanh commented 1 year ago

Hi, can I make matcher.split() work in this case? The code I'm using is as follows:

    reflex::PCRE2Matcher matcher("brown(\\d+)", "How now45 brown cow now brown123 cow now brown378.");
    while (matcher.split() != 0)
        std::cout << "Found " << matcher.text() << std::endl;

My expected result is: "How now45 brown cow now brown" and " cow now brown378."

genivia-inc commented 1 year ago

You will get:

Found How now45 brown cow now 
Found  cow now 
Found .

because both brown123 and brown378 are the "split points" that match the pattern. Split returns text between the pattern matches.

dangdkhanh commented 1 year ago

Hi @genivia-inc, sorry for my omission. My expected result is: How now45 brown cow now brown cow now brown. I mean there could be one more option that splits on a group within the match, which here is (\d+).

genivia-inc commented 1 year ago

That's not how splitting works. Remember that split returns the text between the pattern matches. The pattern is brown(\\d+) which matches brown123 and brown378. To split on numbers only, use the pattern \\d+. To split on numbers only after the word brown, use perl matching with a so-called lookbehind pattern to also match brown but not consume it as part of the pattern.

dangdkhanh commented 1 year ago

Hi @genivia, please! That would make split more flexible.