mohd-akram closed this 8 months ago
It turns out these regular expressions result in undefined behavior according to POSIX:
`*+?{`
The `<asterisk>`, `<plus-sign>`, `<question-mark>`, and `<left-brace>` shall be special except when used in a bracket expression (see RE Bracket Expression). Any of the following uses produce undefined results:
- If these characters appear first in an ERE, or immediately following an unescaped `<vertical-line>`, `<circumflex>`, `<dollar-sign>`, or `<left-parenthesis>`
- If a `<left-brace>` is not part of a valid interval expression (see EREs Matching Multiple Characters)
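To see how one engine handles the cases POSIX leaves undefined, here is a minimal sketch using Python's standard `re` module. Python is used only for illustration: its `re` is not a POSIX ERE implementation, so this shows one engine's choices, not what POSIX or ugrep does. The helper name `probe` is hypothetical.

```python
import re

def probe(pattern: str, text: str) -> str:
    """Return 'error' if the pattern is rejected, else 'match' or 'no match'."""
    try:
        compiled = re.compile(pattern)
    except re.error:
        return "error"
    return "match" if compiled.fullmatch(text) else "no match"

# A quantifier with nothing to repeat is rejected outright.
print(probe("*a", "a"))       # error
print(probe("a|+b", "b"))     # error

# A '{' that does not begin a valid interval is kept as a literal brace.
print(probe("a{", "a{"))      # match
print(probe("a{x}", "a{x}"))  # match

# A valid interval expression is a repetition count, not literal text.
print(probe("a{2}", "aa"))    # match
print(probe("a{2}", "a{2}"))  # no match
```

In this engine a leading `*` or `+` is an error, while a stray `{` silently becomes a literal. That is the kind of divergence "undefined results" allows.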
Note that JavaScript's Unicode regular expression mode (the `u` flag) does not support these regular expressions either.
Thank you for your feedback. Indeed, some forms have undefined behavior. I try to allow a few more special cases in "GNU grep" mode, i.e. when ugrep is renamed to grep (or fgrep, egrep), which auto-enables some options to be a bit more permissive with regex forms. I would still prefer ugrep to produce an error message rather than accept undefined behavior and then decide what to make of it.
The parsing of braces is opportunistic in grep and other regular expression engines (I tried Node.js and Python). If the engine cannot parse a count after the opening brace, it treats the brace as a literal: