SOF3 / pharynx

A tool to recompile PHP sources into a phar in PSR-0
Apache License 2.0
17 stars 4 forks source link

Shorten VirionProcessor's regex check #14

Closed NTT1906 closed 1 year ago

NTT1906 commented 1 year ago

Old original: https://rubular.com/r/rbFHtoKeXfTOIb ^[a-zA-Z0-9_]+(\\\\[a-zA-Z0-9_]+)+$ Group-based: https://rubular.com/r/n97pophwFTVzlf ^(?:[\w]+(?:\\\\)?)+$, this allows tes_1t\\dir\\ Currently using: https://rubular.com/r/u2chZTVA19hZKj ^\w+(?:\\\\\w+)+$, this allows tes_1t\\dir

if (!preg_match('#^\w+(?:\\\\\w+)+$#', $antigen, $matches)) {
          echo "\"$antigen\" is not a valid class name";
          exit(1);
      }
      if (str_ends_with($epitope, "\\")) { // would this not be executed in both original and current regex?
          throw new RuntimeException("epitope must not end with a backslash");
      }
      ...
}
SOF3 commented 1 year ago

I am not really a fan of using character classes. They differ across implementations and actually reduce readability since it is not always obvious to remember what they mean (are they just alphabets? does w mean "whitespace" instead of "word"?). I would rather not use them unless they significantly reduce the regex complexity like \b.

SOF3 commented 1 year ago

and the ?: doesn't look necessary anyway

NTT1906 commented 1 year ago

then ^([a-zA-Z0-9_]+(\\\\)?)+$?

SOF3 commented 1 year ago

Trailing backslashes are explicitly disallowed. That's the main point of this regex check.

NTT1906 commented 1 year ago

Trailing backslashes are explicitly disallowed. That's the main point of this regex check.

if (str_ends_with($epitope, "\\")) {
throw new RuntimeException("epitope must not end with a backslash");
}

if so, this code won't be executed?

SOF3 commented 1 year ago

Right, those lines are redundant indeed, but let's just leave it there