nitely / nim-regex

Pure Nim regex engine. Guarantees linear time matching
https://nitely.github.io/nim-regex/
MIT License

Error: Empty group is not allowed #101

Closed artemklevtsov closed 3 years ago

artemklevtsov commented 3 years ago

PCRE seems to allow it.

To reproduce:

import regex

const s = r"(TXT1)/TXT2()\.(\d+)"
let rgs = re(s)

The same pattern compiles fine with the re module.

Can I skip or ignore that?

nitely commented 3 years ago

I can remove that restriction to keep compatibility with PCRE.

That said, I think empty groups make no sense. Are you generating that regex or something? If it's hand-written, why would you need an empty group?

artemklevtsov commented 3 years ago

Thank you for the quick reply. I tried to compile the ua-parser/uap-core/regexes.yaml list. All of it compiles with the re module.

artemklevtsov commented 3 years ago

uap matches group positions against replacements, so we can't simply skip the group.

nitely commented 3 years ago

Oh, that sounds like a valid use case. I'll fix it then.

artemklevtsov commented 3 years ago

It seems it doesn't work yet.

Code to reproduce:

import os
import regex

# nimble install -y regex@#head
# curl -s https://raw.githubusercontent.com/ua-parser/uap-core/master/regexes.yaml | grep 'regex:' | cut -d ':' -f 2 | tr -d ' ' > /tmp/regexes.txt

let dst = "/tmp/regexes.txt"

if dst.fileExists():
  for l in dst.lines():
    let rgx = re(l)

Result:

nim compile --verbosity:0 --hints:off --run "/tmp/tre.nim"  
/tmp/tre.nim(10, 16) Warning: use readLines with two arguments; readLines is deprecated [Deprecated]
/tmp/tre.nim(11) tre
/home/unikum/.nimble/pkgs/regex-0.19.0/regex.nim(283) re
/home/unikum/.nimble/pkgs/regex-0.19.0/regex/compiler.nim(13) reImpl
/home/unikum/.nimble/pkgs/regex-0.19.0/regex/parser.nim(752) parse
/home/unikum/.nimble/pkgs/regex-0.19.0/regex/parser.nim(666) subParse
/home/unikum/.nimble/pkgs/regex-0.19.0/regex/parser.nim(652) parseGroupTag
/home/unikum/.nimble/pkgs/regex-0.19.0/regex/parser.nim(49) check
Error: unhandled exception: Invalid group. Unknown group type
~8 chars~ntServer)(\d+)(?
                       ^ [RegexError]
Error: execution of an external program failed: '/tmp/tre '

nitely commented 3 years ago

I see you no longer trust nim-regex :P, but the issue is in the cut command. I tried curl -s https://raw.githubusercontent.com/ua-parser/uap-core/master/regexes.yaml | grep -oP "(?<=regex: ').+(?=')" > /tmp/regexes.txt and your snippet compiles.
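
For reference, a small sketch of how the two extraction commands differ. The sample line is hypothetical (written in the style of regexes.yaml, not copied from it), and `grep -P` assumes GNU grep.

```shell
#!/bin/sh
# Hypothetical sample line in the style of regexes.yaml.
line="  - regex: '(Foo)/(\d+)(?:\.(\d+))?'"

# Original extraction: field 2 of the ':'-separated line, spaces removed.
# Any ':' inside the pattern (such as in "(?:") ends the field early,
# and the opening quote is kept.
broken=$(printf '%s\n' "$line" | grep 'regex:' | cut -d ':' -f 2 | tr -d ' ')

# Suggested extraction: keep only the text between the single quotes.
fixed=$(printf '%s\n' "$line" | grep -oP "(?<=regex: ').+(?=')")

printf 'cut:  %s\n' "$broken"
printf 'grep: %s\n' "$fixed"
```

The cut variant yields `'(Foo)/(\d+)(?`, which ends in the same dangling `(?` shown in the error message above, while the grep variant yields the full pattern `(Foo)/(\d+)(?:\.(\d+))?`.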

artemklevtsov commented 3 years ago

My bad, sorry. I forgot about the surrounding quotes. Now the list compiles successfully. Thanks again.