Closed lazebny closed 7 years ago
While I appreciate you taking a look at all the various data types, it does make each of the reported comparison values (except the first one) invalid. The speed of both Array#index
and Array#include?
is highly dependent on the underlying datatype and the efficiency of its equality check. Boolean values can be checked faster than arbitrary objects for equality -- no real surprise there. So saying Array#include?(objs)
is 100x slower than Array#include?(bool)
is interesting, but not what you want to show here, I don't think.
My numbers for comparison:
Warming up --------------------------------------
Array#include?(all) 107.082k i/100ms
Array#index(all) 105.568k i/100ms
Array#include?(bool) 235.240k i/100ms
Array#index(bool) 231.445k i/100ms
Array#include?(float)
140.949k i/100ms
Array#index(float) 145.307k i/100ms
Array#include?(int) 162.811k i/100ms
Array#index(int) 162.075k i/100ms
Array#include?(objs) 4.567k i/100ms
Array#index(objs) 4.535k i/100ms
Array#include?(str) 78.251k i/100ms
Array#index(str) 78.044k i/100ms
Array#include?(syms) 172.013k i/100ms
Array#index(syms) 166.146k i/100ms
Array#include?(arrs) 8.129k i/100ms
Array#index(arrs) 8.032k i/100ms
Calculating -------------------------------------
Array#include?(all) 1.653M (± 1.7%) i/s - 8.352M in 5.053061s
Array#index(all) 1.566M (± 1.3%) i/s - 7.918M in 5.057832s
Array#include?(bool) 6.268M (± 0.7%) i/s - 31.522M in 5.029491s
Array#index(bool) 5.491M (± 1.2%) i/s - 27.542M in 5.016865s
Array#include?(float)
2.480M (± 1.2%) i/s - 12.404M in 5.001233s
Array#index(float) 2.365M (± 1.3%) i/s - 11.915M in 5.038781s
Array#include?(int) 2.993M (± 0.8%) i/s - 14.979M in 5.004225s
Array#index(int) 2.836M (± 1.5%) i/s - 14.263M in 5.030406s
Array#include?(objs) 49.270k (± 1.2%) i/s - 246.618k in 5.006118s
Array#index(objs) 49.304k (± 0.5%) i/s - 249.425k in 5.059017s
Array#include?(str) 1.050M (± 0.8%) i/s - 5.321M in 5.068903s
Array#index(str) 1.032M (± 0.5%) i/s - 5.229M in 5.067560s
Array#include?(syms) 3.362M (± 0.8%) i/s - 16.857M in 5.014537s
Array#index(syms) 3.132M (± 0.7%) i/s - 15.784M in 5.039705s
Array#include?(arrs) 89.635k (± 0.9%) i/s - 455.224k in 5.079003s
Array#index(arrs) 86.310k (± 4.8%) i/s - 433.728k in 5.039427s
Comparison:
Array#include?(bool): 6267783.4 i/s
Array#index(bool): 5490677.5 i/s - 1.14x slower
Array#include?(syms): 3361886.2 i/s - 1.86x slower
Array#index(syms): 3132048.7 i/s - 2.00x slower
Array#include?(int): 2993402.6 i/s - 2.09x slower
Array#index(int): 2835948.2 i/s - 2.21x slower
Array#include?(float): 2480431.2 i/s - 2.53x slower
Array#index(float): 2365087.3 i/s - 2.65x slower
Array#include?(all): 1653463.3 i/s - 3.79x slower
Array#index(all): 1565691.0 i/s - 4.00x slower
Array#include?(str): 1049813.4 i/s - 5.97x slower
Array#index(str): 1031871.5 i/s - 6.07x slower
Array#include?(arrs): 89635.4 i/s - 69.93x slower
Array#index(arrs): 86309.9 i/s - 72.62x slower
Array#index(objs): 49304.3 i/s - 127.12x slower
Array#include?(objs): 49270.1 i/s - 127.21x slower
I think you want to compare pair-wise between the two Array
methods with the same datatype. When doing this, there's not much of a difference. The values for each pair are incredibly close to one another.
Unfortunately, the other problem here is the use of Array#sample
. I think I understand why you're doing it, but it means you're not doing an exact comparison between the two. I think you probably want to just generate the sample outside the benchmark loop and use that to ensure both methods being tested are subject to the same data locality and the random function doesn't add random latency to the call.
Think i do something wrong with running this benchmark because i have significant difference:
+ Array#include?(int) 605.837k (± 2.0%) i/s - 3.064M in 5.060286s
+ Array#index(int) 2.825M (± 3.0%) i/s - 14.177M in 5.023919s
It was the main reason of this comparison. I use Ubuntu 16.04:
sudo lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 58
Model name: Intel(R) Core(TM) i5-3340M CPU @ 2.70GHz
Stepping: 9
CPU MHz: 1492.488
CPU max MHz: 3400,0000
CPU min MHz: 1200,0000
BogoMIPS: 5382.20
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 3072K
NUMA node0 CPU(s): 0-3
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts
Feel free to close this PR if i am wrong.
Interesting. Can you try running just this one? I've isolated to just the int case and pulled the random call out of the hot loop. But to ensure we do use those random values, I've added iteration over the shuffled array:
require 'benchmark/ips'
ints = [*1..100]
int_samples = ints.shuffle
Benchmark.ips do |x|
x.report('Array#include?(int)') { int_samples.each { |i| ints.include?(i) } }
x.report('Array#index(int)') { int_samples.each { |i| ints.index(i) } }
x.compare!
end
I'm really curious as to what your results look like with that.
Here they are:
ruby 2.1.9p490 (2016-03-30 revision 54437) [x86_64-linux]
Warming up --------------------------------------
Array#include?(int) 621.000 i/100ms
Array#index(int) 3.760k i/100ms
Calculating -------------------------------------
Array#include?(int) 6.318k (± 4.2%) i/s - 31.671k in 5.022748s
Array#index(int) 38.544k (± 3.0%) i/s - 195.520k in 5.077354s
Comparison:
Array#index(int): 38543.6 i/s
Array#include?(int): 6318.3 i/s - 6.10x slower
Looks like it depends on ruby version:
ruby 2.3.0p0 (2015-12-25 revision 53290) [x86_64-linux]
Warming up --------------------------------------
Array#include?(int) 3.088k i/100ms
Array#index(int) 3.025k i/100ms
Calculating -------------------------------------
Array#include?(int) 30.804k (± 5.0%) i/s - 154.400k in 5.029237s
Array#index(int) 30.384k (± 1.4%) i/s - 154.275k in 5.078497s
Comparison:
Array#include?(int): 30803.7 i/s
Array#index(int): 30384.3 i/s - same-ish: difference falls within error
Ahh. That would do it. Array#include?
was improved for Ruby 2.2.0. In particular, you'll see a difference with anything that has special-handling in rb_equal_opt
, such as ints & floats.
Cool, good to know. I'll close this PR.
No problem. Thanks for submitting. And I'm glad we were able to track down the issue.
Cool findings - thanks guys!
Array#index
method logically can replaceArray#include?
in most cases. Benchmarks show that sometimes it has significantly better performance.