beyondgrep / website

The source code for the beyondgrep.com website
https://beyondgrep.com/

Add "can it match using a regex which contains \0" #76

Open avar opened 6 years ago

avar commented 6 years ago

Since there's no way to pass a NUL character on the command line (arguments are NUL-terminated strings), this has to be done via the tool's ability to read a pattern from a file. Not all matchers can do this.

GNU grep can do it:

$ perl -wE 'say "foo\0bar"' >file; perl -wE 'say "f.*\0[b].*a"' >p; grep -a -f p file; echo $?
foobar
0
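For anyone reproducing this without perl, here is a rough sketch of building the same two fixture files with printf and checking that the NUL byte really made it into the pattern file. The scratch directory and the variable names are just placeholders, not anything the tools require.

```shell
# Scratch directory for the fixtures (hypothetical location).
tmp=$(mktemp -d)

# Corpus: "foo", a NUL byte, "bar", newline (same bytes as the perl one-liner).
printf 'foo\0bar\n' > "$tmp/file"

# Pattern: the \0 in the printf format becomes a literal NUL byte in the file.
printf 'f.*\0[b].*a\n' > "$tmp/p"

# Sanity checks: total sizes, and how many NUL bytes the pattern file holds.
corpus_bytes=$(wc -c < "$tmp/file" | tr -d ' ')
pattern_bytes=$(wc -c < "$tmp/p" | tr -d ' ')
nul_count=$(tr -dc '\000' < "$tmp/p" | wc -c | tr -d ' ')

echo "corpus=$corpus_bytes pattern=$pattern_bytes nuls=$nul_count"
```

The files can then be fed to the matcher under test in the same way as above, e.g. with -f "$tmp/p" "$tmp/file".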

Note that some matchers look like they can do it, but they just discard everything in the pattern after the \0, e.g. pcre2grep (even though pcre2 itself supports this):

$ perl -wE 'say "foo\0bar"' >file; perl -wE 'say "f.*\0[x].*a"' >p; ./pcre2grep -a -f p file; echo $?
foobar
0

This match should fail, since the pattern requires an x right after the NUL and the file has a b there, but it doesn't. GNU grep is not fooled by this.
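The two checks above (one pattern that should match, one that should not) could be combined into a single test, roughly like the sketch below. The function name check_nul_support and its three result strings are made up for illustration, and it assumes the matcher accepts -f PATTERNFILE FILE and signals a match via exit status 0, as grep does.

```shell
# Hypothetical helper: classify a matcher by running it against two pattern files.
# Usage: check_nul_support CMD [ARGS...]
check_nul_support() {
    dir=$(mktemp -d) || return 2
    printf 'foo\0bar\n'    > "$dir/file"
    printf 'f.*\0[b].*a\n' > "$dir/good"   # should match
    printf 'f.*\0[x].*a\n' > "$dir/bad"    # should NOT match

    if ! "$@" -f "$dir/good" "$dir/file" >/dev/null 2>&1; then
        result="no-match"         # cannot match across the NUL at all
    elif "$@" -f "$dir/bad" "$dir/file" >/dev/null 2>&1; then
        result="false-positive"   # likely ignoring the pattern after the NUL
    else
        result="ok"               # matches the good pattern, rejects the bad one
    fi

    rm -rf "$dir"
    echo "$result"
}
```

Going by the results reported in this thread, check_nul_support grep -a would be expected to report ok with GNU grep, and check_nul_support ./pcre2grep -a would be expected to report false-positive.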

BurntSushi commented 6 years ago

ripgrep can do it as well:

$ xxd /tmp/corpus
00000000: 666f 6f00 6261 720a                      foo.bar.
$ cat /tmp/pattern
\x00
$ rg -a -f /tmp/pattern /tmp/corpus
1:foobar

You actually don't even need to use a separate file for ripgrep, since you can utter a NUL byte in ripgrep's native regex syntax, e.g.,

$ rg -a '\u{0}' /tmp/corpus
1:foobar