onetrueawk / awk

One true awk

Using the \w operator fails when \w is placed in the middle of an expression #174

Closed dylan-bartos-tanium closed 1 year ago

dylan-bartos-tanium commented 1 year ago

Here are a couple of working expression tests which use \w.

echo -e "word" | awk '/\w.*/{print $0}' $1
word
echo -e "word" | awk '/\word/{print $0}' $1
word

Once you put \w anywhere else in the expression, matches no longer occur:

echo -e "word" | awk '/wor\w/{print $0}' $1
echo -e "word" | awk '/w\w\w\w/{print $0}' $1
echo -e "word" | awk '/w\w../{print $0}' $1
echo -e "word" | awk '/.\w../{print $0}' $1
echo -e "word" | awk '/\w\w\w\w/{print $0}' $1

Regex testers such as regex101.com and regextester.com successfully match all of the 'failures' I listed above, regardless of implementation flavor (PCRE2, PCRE, Python, Golang, Java 8, .NET).

millert commented 1 year ago

awk implements POSIX extended regular expressions, which don't support Perl-style regular expression escapes like \w. Your examples of working expressions only match because \w matches a plain w. For the same reason, /wor\w/ is treated as /worw/, which does not match "word".
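
For anyone landing here, a minimal sketch of a portable alternative. It assumes your awk supports POSIX character classes (not every older implementation does). Perl's \w is roughly a letter, digit, or underscore, which can be written as the bracket expression [[:alnum:]_]. The expected output shown in the comments applies only under that assumption.

```shell
#!/bin/sh
# Stand-in for Perl's \w in a POSIX ERE: [[:alnum:]_]
# Assumption: the awk in use supports POSIX character classes.

# Equivalent of the failing /wor\w/
printf 'word\n' | awk '/wor[[:alnum:]_]/ { print $0 }'
# prints: word

# Equivalent of the failing /w\w\w\w/
printf 'word\n' | awk '/w[[:alnum:]_][[:alnum:]_][[:alnum:]_]/ { print $0 }'
# prints: word

# A non-word character in that position does not match, so nothing is printed
printf 'wor-\n' | awk '/wor[[:alnum:]_]/ { print $0 }'
```

If character classes are not available, spelling out the ranges as [A-Za-z0-9_] is a common fallback, though range behavior can depend on the locale.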