Closed harendra-kumar closed 3 years ago
No need to escape slashes:
> MYBENCH=`cabal list-bin streamly-benchmarks:Unicode.Stream`
> $MYBENCH -p '$0 == "All.Unicode.Stream/o-1-space.ungroup-group.unlines . splitOnSuffix ([Word8]) (1/10)"'
Using small input file: benchmark-tmp/in-10MB.txt
Using big input file: benchmark-tmp/in-100MB.txt
Using output file: benchmark-tmp/out.txt
All
Unicode.Stream/o-1-space
ungroup-group
unlines . splitOnSuffix ([Word8]) (1/10): OK (2.13s)
299 ms ± 26 ms
In more complex cases a sed incantation for correct escaping could be useful:
$MYBENCH -l | sed -e 's/[\"]/\\\\\\&/g' | while read name; do $MYBENCH -p '$0 == "'$name'"'; done
Parsing patterns is completely up to tasty. It would indeed be nice to get a more specific parsing error.
It works if the benchmark name is specified directly on the command line as you did. But it does not work if the benchmark name is expanded from a shell variable. For example:
$ MYBENCH=`cabal list-bin streamly-benchmarks:Unicode.Stream`
$ BENCH_NAME="All.Unicode.Stream/o-1-space.ungroup-group.unlines . splitOnSuffix ([Word8]) (1/10)"
$ $MYBENCH -p '$0 == "'$BENCH_NAME'"'
Using small input file: benchmark-tmp/in-10MB.txt
Using big input file: benchmark-tmp/in-100MB.txt
Using output file: benchmark-tmp/out.txt
option -p: Could not parse pattern
Usage: Unicode.Stream [-p|--pattern PATTERN] [-t|--timeout DURATION]
[-l|--list-tests] [-j|--num-threads NUMBER] [-q|--quiet]
[--hide-successes] [--color never|always|auto]
[--ansi-tricks ARG] [--baseline ARG] [--csv ARG]
[--svg ARG] [--stdev ARG] [--fail-if-slower ARG]
[--fail-if-faster ARG]
I tried escaping as well, but it does not work. By trial and error I found that the "Could not parse pattern" error occurs whenever the benchmark name contains a space character.
If I remove spaces from the benchmark names then it seems to work fine:
$ BENCH_NAME='All.Unicode.Stream/o-1-space.ungroup-group.US.unlines.S.splitOnSuffix([Word8])(1/10)'
$ $MYBENCH -p '$0 == "'$BENCH_NAME'"'
Using small input file: benchmark-tmp/in-10MB.txt
Using big input file: benchmark-tmp/in-100MB.txt
Using output file: benchmark-tmp/out.txt
All
Unicode.Stream/o-1-space
ungroup-group
US.unlines.S.splitOnSuffix([Word8])(1/10): OK (0.61s)
192 ms ± 6.5 ms
All 1 tests passed (0.61s)
But this means we would have to change the names of hundreds of benchmarks. Is this a bug? Is there a workaround for this?
Ok, shell escaping is really hard to understand. This one worked even with spaces in the benchmark name:
$ $MYBENCH -p '$0 == "'"$BENCH_NAME"'"'
Using small input file: benchmark-tmp/in-10MB.txt
Using big input file: benchmark-tmp/in-100MB.txt
Using output file: benchmark-tmp/out.txt
All
Unicode.Stream/o-1-space
ungroup-group
US.unlines.S.splitOnSuffix([Word8]) (1/10): OK (0.60s)
191 ms ± 4.6 ms
All 1 tests passed (0.60s)
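The difference between the two invocations comes down to word splitting in bash: an unquoted $BENCH_NAME is split on spaces into several arguments, so -p receives only a fragment of the pattern. A minimal sketch that counts the arguments each form produces; count_args is a hypothetical helper and the benchmark name is shortened for illustration:

```shell
# Hypothetical helper: print how many arguments it received.
count_args() { printf '%s\n' "$#"; }

# Shortened, illustrative benchmark name containing spaces.
BENCH_NAME='unlines . splitOnSuffix (1/10)'

# Unquoted expansion: bash splits the name on whitespace,
# so the pattern arrives as four separate arguments.
count_args '$0 == "'$BENCH_NAME'"'      # prints 4 in bash

# Quoted expansion: the whole pattern stays one argument.
count_args '$0 == "'"$BENCH_NAME"'"'    # prints 1
```

With the unquoted form, -p sees only the first fragment, $0 == "unlines, which is an unterminated string and would explain the parse error.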
@Bodigrim you may want to change your sed incantation example so that it works even with spaces in the name.
Ah, interesting, https://github.com/Bodigrim/tasty-bench/issues/18#issuecomment-841743779 works in my shell as is, because I happened to use zsh. If I switch to bash, it fails with a parse error. Shell escaping is hard indeed.
Thanks, updated escaping in ab49d86e06c26aa417fe58067d43f748f8bdc0c2.
Benchmark names containing double quotes are still giving me trouble. I get the same error, "Could not parse pattern". I tried escaping double quotes with a backslash, but it did not work.
Ok, escaping double quotes actually worked. Earlier I did not use the escaped shell value, my bad.
The last issue was with benchmark names using backslashes, e.g. "splitOn \n". I had to use the shell read -r instead of read (otherwise it eats the backslashes) and then escape the backslashes using another backslash in the pattern.
You may want to use read -r in your example to preserve the backslashes from the input.
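The two fixes described here (read -r and doubling the backslashes) can be combined with the double-quote escaping into one step. A minimal sketch, assuming the pattern syntax accepts backslash-escaped backslashes and double quotes as reported above; escape_name and make_pattern are hypothetical helpers, not part of tasty or tasty-bench:

```shell
# Hypothetical helper: put a backslash in front of every
# backslash and double quote in a benchmark name.
escape_name() {
  printf '%s\n' "$1" | sed -e 's/[\\"]/\\&/g'
}

# Hypothetical helper: build an exact-match pattern for one name.
make_pattern() {
  printf '$0 == "%s"\n' "$(escape_name "$1")"
}

# read -r keeps the backslash from the input; plain read would drop it.
printf '%s\n' 'splitOn \n' | while IFS= read -r name; do
  make_pattern "$name"
done
```

The result can then be passed as a single, quoted argument, for example $MYBENCH -p "$(make_pattern "$name")".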
Ha, running ShellCheck on my original snippet spots both issues:
In shell.sh line 4:
$MYBENCH -l | sed -e 's/[\"]/\\\\\\&/g' | while read name; do $MYBENCH -p '$0 == "'$name'"'; done
^--^ SC2162: read without -r will mangle backslashes.
^-------^ SC2016: Expressions don't expand in single quotes, use double quotes for that.
^---^ SC2086: Double quote to prevent globbing and word splitting.
Did you mean:
$MYBENCH -l | sed -e 's/[\"]/\\\\\\&/g' | while read name; do $MYBENCH -p '$0 == "'"$name"'"'; done
Nice! I forgot about shellcheck, could have saved me some time.
I have a benchmark named All.Unicode.Stream/o-1-space.ungroup-group.US.unlines . S.splitOnSuffix ([Word8]) (1/10). I am able to substring-match the benchmark as follows:
But I need an exact match instead of a substring match, so I use an awk pattern like this:
It fails with option -p: Could not parse pattern. Can someone tell me what's wrong with this? Is there any way to print debug information that explains why the pattern could not be parsed?