Open cjdb opened 4 years ago
This is what I see on my machine:
2019-12-05 11:49:35
Running perf/range_conversion
Run on (8 X 2900 MHz CPU s)
CPU Caches:
L1 Data 32 KiB (x4)
L1 Instruction 32 KiB (x4)
L2 Unified 256 KiB (x4)
L3 Unified 8192 KiB (x1)
Load Average: 1.78, 1.95, 2.22
------------------------------------------------------------
Benchmark                  Time             CPU   Iterations
------------------------------------------------------------
Words/RangeCommon  201502562 ns    200626714 ns            7
Words/RangeTo      241609633 ns    240499000 ns            8
This is at -O3. The effect is not as pronounced for me, but it is most definitely real. Interesting.
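To put a number on the gap in that run, here is a quick sketch of the arithmetic. The helper name `slowdown_percent` and the constant names are mine, purely for illustration; only the four timings come from the output above.

```cpp
// Hypothetical helper (not part of the benchmark): relative slowdown of
// `slower_ns` versus `baseline_ns`, expressed in percent.
constexpr double slowdown_percent(double baseline_ns, double slower_ns) {
    return 100.0 * (slower_ns / baseline_ns - 1.0);
}

// Timings copied from the benchmark output above (nanoseconds per iteration).
constexpr double common_time_ns = 201502562.0;  // Words/RangeCommon, Time
constexpr double to_time_ns     = 241609633.0;  // Words/RangeTo, Time
constexpr double common_cpu_ns  = 200626714.0;  // Words/RangeCommon, CPU
constexpr double to_cpu_ns      = 240499000.0;  // Words/RangeTo, CPU
```

`slowdown_percent(common_time_ns, to_time_ns)` comes out at roughly 19.9, and the CPU column gives about the same, so RangeTo is about 20% slower than RangeCommon on this machine.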
I expected these two implementations to have equivalent or relatively comparable performance, but Google Benchmark tells a different story when run over 476k words. Is this a perf bug or have I done something horribly wrong in making these two algorithms equivalent?