Open cjdb opened 10 months ago
The regex implementation available in `lib/Support` seems to support only POSIX-style regexes, so one could use `[:digit:]` instead of `\d`.
Here is the list of supported classes: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Support/regcomp.c#L58
`[:digit:]` is longer than `[0-9]`, which already interrupts readability when compared with `\d`. Further, these types of escapes are accepted by a large variety of regex engines, and it was surprising to learn that FileCheck doesn't support this (I spent a couple of hours debugging before swapping out `\d` with `[0-9]`).
If POSIX regex doesn't support this, then we should consider expanding to a style that supports both `[:digit:]` and `\d`.
`clang\d` is a pattern that's recognised by many regex engines to mean `clang[0-9]`, but FileCheck doesn't seem to recognise it. It would be good to have FileCheck recognise the following patterns:

- `\f`, `\n`, `\r`, `\t`, `\v`: usual escape sequences
- `\b`: matches a word boundary
- `\d`: equivalent to `[0-9]`
- `\s`: equivalent to `[ \f\n\r\t\v]`
- `\w`: equivalent to `[A-Za-z0-9_]`
- `\B`: inverse of `\b`
- `\S`: inverse of `\s`
- `\D`: inverse of `\d`
- `\W`: inverse of `\w`
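
To make the ASCII equivalences listed above concrete, here is a rough sketch in Python (not LLVM code) of a rewrite pass that expands the shorthand escapes into bracket expressions a POSIX-style engine already understands. The names `SHORTHAND`, `CONTROL`, and `expand_shorthand` are hypothetical and nothing here reflects how FileCheck is actually implemented. The sketch assumes the escapes appear outside bracket expressions (something like `[\d_]` would need extra handling), and it leaves `\b` and `\B` untouched because a word boundary cannot be expressed as a character class.

```python
import re

# Hypothetical ASCII-only expansions, taken from the list in this issue.
SHORTHAND = {
    "d": "[0-9]",
    "D": "[^0-9]",
    "s": "[ \f\n\r\t\v]",
    "S": "[^ \f\n\r\t\v]",
    "w": "[A-Za-z0-9_]",
    "W": "[^A-Za-z0-9_]",
}
# Usual escape sequences, mapped to the literal control characters.
CONTROL = {"f": "\f", "n": "\n", "r": "\r", "t": "\t", "v": "\v"}


def expand_shorthand(pattern: str) -> str:
    """Rewrite shorthand escapes; assumes none occur inside [...]."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt in SHORTHAND:
                out.append(SHORTHAND[nxt])
            elif nxt in CONTROL:
                out.append(CONTROL[nxt])
            else:
                # Any other escape (e.g. \\ or \. or \b) is passed through as-is.
                out.append(ch + nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


expanded = expand_shorthand(r"clang\d+")
print(expanded)
# Python's re is used here only to check that the expanded pattern still matches.
print(bool(re.fullmatch(expanded, "clang15")))
```

A pass like this would keep the underlying engine unchanged, at the cost of not covering `\b`/`\B`; supporting those would presumably require changes in the engine itself.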
The above are good for matching ASCII characters, but don't scale for anything that's outside of ASCII. If we're to add this feature, I think it would be good to produce a design that incorporates Unicode code points as well.
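
As an illustration of the design question, Python's `re` module (used here only as an example of an existing engine, not as a proposal for FileCheck) makes `\d` Unicode-aware by default for text patterns and offers an opt-in flag to restrict it to ASCII. The variable names below are made up for the example.

```python
import re

ARABIC_INDIC_THREE = "\u0663"  # a decimal digit outside ASCII

unicode_digit = re.compile(r"\d")          # str patterns are Unicode-aware by default
ascii_digit = re.compile(r"\d", re.ASCII)  # restricts \d to [0-9]

print(bool(unicode_digit.fullmatch(ARABIC_INDIC_THREE)))  # True
print(bool(ascii_digit.fullmatch(ARABIC_INDIC_THREE)))    # False
print(bool(ascii_digit.fullmatch("3")))                   # True
```

Whichever default is chosen, the example suggests the ASCII equivalences listed above only hold in an ASCII mode, so a design would need to say which mode each escape uses.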