Closed tenderlove closed 10 months ago
The benchmark looks good 👍
One thing that's a bit confusing to me is that based on stats:
./run_once.sh --yjit-stats benchmarks/blurhash/benchmark.rb
...
total_exits: 0
Top-20 most frequent C calls (97.2% of C calls):
Float#/: 3,498,099 (19.4%)
Float#*: 3,001,055 (16.7%)
Integer#*: 2,996,283 (16.7%)
Float#+: 2,996,251 (16.7%)
Float#**: 1,498,162 ( 8.3%)
Integer#to_f: 1,498,155 ( 8.3%)
Float#<=: 1,498,147 ( 8.3%)
Module#cos: 501,782 ( 2.8%)
Symbol#to_s: 445 ( 0.0%)
Symbol#start_with?: 402 ( 0.0%)
Array#fetch: 40 ( 0.0%)
Symbol#match: 33 ( 0.0%)
String#setbyte: 28 ( 0.0%)
The setbyte function is only called 28 times. I guess it's only used to encode a short hash at the end, and not actually the whole image?
If anything, this benchmark makes me think we ought to pay more attention to floating-point support 😅 Still a good benchmark and inclined to merge it 👍
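For context on where those Float calls could come from: the mix above (Integer#to_f, Float#/, Float#<=, Float#**, cos) is what you'd expect from sRGB-to-linear conversion plus a cosine basis per pixel. A rough sketch, assuming the standard sRGB transfer function; the method names `srgb_to_linear` and `basis` are hypothetical and not taken from the benchmark source:

```ruby
# Hypothetical helpers for illustration only; not copied from the benchmark.

# Convert one 8-bit sRGB channel value (0..255) to linear light (0.0..1.0).
def srgb_to_linear(value)
  v = value.to_f / 255.0            # Integer#to_f, Float#/
  if v <= 0.04045                   # Float#<=
    v / 12.92                       # Float#/
  else
    ((v + 0.055) / 1.055)**2.4      # Float#+, Float#/, Float#**
  end
end

# Cosine basis term for pixel position x in a row that is `width` pixels wide.
def basis(component, x, width)
  Math.cos(Math::PI * component * x / width)  # Float#*, Float#/, Math.cos
end
```

Running something like this once per pixel and per component would be dominated by Float arithmetic, with byte writes happening only when the final short hash is encoded.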
> The setbyte function is only called 28 times. I guess it's only used to encode a short hash at the end, and not actually the whole image?
Ya, as I said in the description I think this is probably not a good benchmark for setbyte but rather floating point math. We should probably do some work on floating point math, but I'll look for a better setbyte benchmark.
No stress, probably best to look for relevant benchmarks, pure-Ruby code rather than benchmarks specifically using setbyte.
> I think we're trying to find a benchmark that would measure the performance of setbyte,
FWIW I recall https://github.com/oracle/truffleruby/issues/2336#issuecomment-826722788.
It actually uses String#[]= and not setbyte (on a BINARY or US-ASCII String from a quick look).
But as mentioned in that comment I think using a String like a byte buffer is rather inefficient (notably it can cause needless coderange computations).
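To make the distinction concrete, here is a minimal sketch of the two ways of writing a byte into a binary String. The buffer `buf` and the values written are made up for illustration and are not taken from the linked issue:

```ruby
# Illustrative only: a 4-byte binary buffer used like a byte array.
buf = String.new("\x00" * 4, encoding: Encoding::BINARY)

# String#setbyte takes a byte index and an Integer byte value.
buf.setbyte(0, 0x41)

# String#[]= takes a character index and a String. On a BINARY string every
# character is exactly one byte, so index 1 is also byte offset 1.
buf[1] = "B".b

p buf.bytes
```

Both calls mutate the String in place; the difference is that setbyte always works at the byte level, while String#[]= works in terms of characters and therefore depends on the String's encoding.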
This is a benchmark for a pure Ruby version of the Blurhash gem:
https://github.com/Gargron/blurhash
I think we're trying to find a benchmark that would measure the performance of setbyte, but I'm not 100% convinced this is the right benchmark. It only calls setbyte a handful of times per iteration. I think this benchmark might be more suitable for floating point math benchmarks?
Regardless, I think we should add this benchmark and try to speed it up. The C extension version of this code is 10x faster than the pure Ruby version with YJIT enabled: