godotengine / godot-proposals

Godot Improvement Proposals (GIPs)

Add an optional flags parameter to `RegEx.compile()` #2927

Open MikeSchulze opened 3 years ago

MikeSchulze commented 3 years ago

Describe the project you are working on

Markdown to BBCode converter

Describe the problem or limitation you are having in your project

No particular limitation for now, but it was hard to achieve this because of the lack of documentation. I was trying to parse multiline text with a regex and I was struggling. The full story can be found here

Describe the feature / enhancement and how it helps to overcome the problem or limitation

It would be nice to extend the `compile` function of the RegEx class with optional flags, e.g.

RegEx flags to control the parsing

| Flag | value | Description |
| -- | -- | -- |
| IGNORECASE | 1 | ignore case. |
| MULTILINE | 2 | make begin/end {^, $} consider each line. |
| DOTALL | 4 | make . match newline too. |
| UNICODE | 8 | make {\w, \W, \b, \B} follow Unicode rules. |
| LOCALE | 16 | make {\w, \W, \b, \B} follow locale. |
| VERBOSE | 32 | allow comment in regex. |

function signature

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

Use the flags to configure the regex so that `^` and `$` match on every line of a multiline text

    var regex := RegEx.new()
    var err = regex.compile("^### (.*)$", RegEx.MULTILINE)
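
Until such a parameter exists, the proposed call can be approximated in a script by translating flag bits into PCRE2 inline options placed at the start of the pattern. The following is only a minimal sketch under stated assumptions: the constants reuse the values from the table above, and `flags_to_prefix` and `compile_with_flags` are hypothetical helper names, not part of Godot's RegEx class. UNICODE and LOCALE are left out because this sketch maps only flags that have a single-letter inline option.

```gdscript
# Hypothetical constants mirroring the proposed flag values.
# They are NOT defined by Godot's RegEx class.
const IGNORECASE := 1
const MULTILINE := 2
const DOTALL := 4
const VERBOSE := 32

# Hypothetical helper: builds a PCRE2 inline option prefix such as "(?im)".
# Returns an empty string when no supported flag is set.
func flags_to_prefix(flags: int) -> String:
    var letters := ""
    if flags & IGNORECASE != 0:
        letters += "i"
    if flags & MULTILINE != 0:
        letters += "m"
    if flags & DOTALL != 0:
        letters += "s"
    if flags & VERBOSE != 0:
        letters += "x"
    if letters == "":
        return ""
    return "(?" + letters + ")"

# Hypothetical wrapper: compiles `pattern` with the options prepended.
# Returns null if compilation fails.
func compile_with_flags(pattern: String, flags: int = 0) -> RegEx:
    var regex := RegEx.new()
    if regex.compile(flags_to_prefix(flags) + pattern) != OK:
        return null
    return regex
```

With these helpers, `compile_with_flags("^### (.*)$", MULTILINE)` compiles the pattern `(?m)^### (.*)$`, which is the closest script-level equivalent of the call proposed above.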

Is there a reason why this should be core and not an add-on in the asset library?

This is part of the built-in regex module.

nathanfranke commented 3 years ago

Workaround: set the options inline in the pattern, using the PCRE2 option-setting syntax:

https://www.pcre.org/current/doc/html/pcre2syntax.html#SEC16

    (?i)            caseless
    (?J)            allow duplicate named groups
    (?m)            multiline
    (?n)            no auto capture
    (?s)            single line (dotall)
    (?U)            default ungreedy (lazy)
    (?x)            extended: ignore white space except in classes
    (?xx)           as (?x) but also ignore space and tab in classes
    (?-...)         unset option(s)
    (?^)            unset imnsx options

MikeSchulze commented 3 years ago

@nathanfranke Yes, that works. I currently use this approach, e.g. `(?m)^##### (.*)`
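
For readers landing here, a minimal sketch of how that pattern might be used to collect every matching heading from a multiline text. `extract_headings` is a hypothetical helper name chosen for illustration; only `RegEx.compile()`, `RegEx.search_all()` and `RegExMatch.get_string()` are existing API.

```gdscript
# Hypothetical helper: returns the text of every "##### " heading in `text`.
func extract_headings(text: String) -> Array:
    var regex := RegEx.new()
    # (?m) makes ^ match at the start of every line, not only at the
    # start of the whole text.
    if regex.compile("(?m)^##### (.*)") != OK:
        push_error("Failed to compile heading pattern")
        return []
    var headings := []
    for result in regex.search_all(text):
        # Group 1 is the heading text after the "##### " prefix.
        headings.append(result.get_string(1))
    return headings
```

Without the `(?m)` prefix, `^` only matches at the very start of the text, so at most a heading on the first line would be found.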