fastruby / fast-ruby

:dash: Writing Fast Ruby :heart_eyes: -- Collect Common Ruby idioms.
https://github.com/fastruby/fast-ruby
5.67k stars 376 forks source link

Add Array.include? vs Array.index #128

Closed lazebny closed 7 years ago

lazebny commented 7 years ago

Array#index method logically can replace Array#include? in most cases. Benchmarks show that sometimes it has significantly better performance.

nirvdrum commented 7 years ago

While I appreciate you taking a look at all the various data types, it does make each of the reported comparison values (except the first one) invalid. The speed of both Array#index and Array#include? is highly dependent on the underlying datatype and the efficiency of its equality check. Boolean values can be checked faster than arbitrary objects for equality -- no real surprise there. So saying Array#include?(objs) is 100x slower than Array#include?(bool) is interesting, but not what you want to show here, I don't think.

My numbers for comparison:

Warming up --------------------------------------
 Array#include?(all)   107.082k i/100ms
    Array#index(all)   105.568k i/100ms
Array#include?(bool)   235.240k i/100ms
   Array#index(bool)   231.445k i/100ms
Array#include?(float)
                       140.949k i/100ms
  Array#index(float)   145.307k i/100ms
 Array#include?(int)   162.811k i/100ms
    Array#index(int)   162.075k i/100ms
Array#include?(objs)     4.567k i/100ms
   Array#index(objs)     4.535k i/100ms
 Array#include?(str)    78.251k i/100ms
    Array#index(str)    78.044k i/100ms
Array#include?(syms)   172.013k i/100ms
   Array#index(syms)   166.146k i/100ms
Array#include?(arrs)     8.129k i/100ms
   Array#index(arrs)     8.032k i/100ms
Calculating -------------------------------------
 Array#include?(all)      1.653M (± 1.7%) i/s -      8.352M in   5.053061s
    Array#index(all)      1.566M (± 1.3%) i/s -      7.918M in   5.057832s
Array#include?(bool)      6.268M (± 0.7%) i/s -     31.522M in   5.029491s
   Array#index(bool)      5.491M (± 1.2%) i/s -     27.542M in   5.016865s
Array#include?(float)
                          2.480M (± 1.2%) i/s -     12.404M in   5.001233s
  Array#index(float)      2.365M (± 1.3%) i/s -     11.915M in   5.038781s
 Array#include?(int)      2.993M (± 0.8%) i/s -     14.979M in   5.004225s
    Array#index(int)      2.836M (± 1.5%) i/s -     14.263M in   5.030406s
Array#include?(objs)     49.270k (± 1.2%) i/s -    246.618k in   5.006118s
   Array#index(objs)     49.304k (± 0.5%) i/s -    249.425k in   5.059017s
 Array#include?(str)      1.050M (± 0.8%) i/s -      5.321M in   5.068903s
    Array#index(str)      1.032M (± 0.5%) i/s -      5.229M in   5.067560s
Array#include?(syms)      3.362M (± 0.8%) i/s -     16.857M in   5.014537s
   Array#index(syms)      3.132M (± 0.7%) i/s -     15.784M in   5.039705s
Array#include?(arrs)     89.635k (± 0.9%) i/s -    455.224k in   5.079003s
   Array#index(arrs)     86.310k (± 4.8%) i/s -    433.728k in   5.039427s

Comparison:
Array#include?(bool):  6267783.4 i/s
   Array#index(bool):  5490677.5 i/s - 1.14x  slower
Array#include?(syms):  3361886.2 i/s - 1.86x  slower
   Array#index(syms):  3132048.7 i/s - 2.00x  slower
 Array#include?(int):  2993402.6 i/s - 2.09x  slower
    Array#index(int):  2835948.2 i/s - 2.21x  slower
Array#include?(float):  2480431.2 i/s - 2.53x  slower
  Array#index(float):  2365087.3 i/s - 2.65x  slower
 Array#include?(all):  1653463.3 i/s - 3.79x  slower
    Array#index(all):  1565691.0 i/s - 4.00x  slower
 Array#include?(str):  1049813.4 i/s - 5.97x  slower
    Array#index(str):  1031871.5 i/s - 6.07x  slower
Array#include?(arrs):    89635.4 i/s - 69.93x  slower
   Array#index(arrs):    86309.9 i/s - 72.62x  slower
   Array#index(objs):    49304.3 i/s - 127.12x  slower
Array#include?(objs):    49270.1 i/s - 127.21x  slower

I think you want to compare pair-wise between the two Array methods with the same datatype. When doing this, there's not much of a difference. The values for each pair are incredibly close to one another.

Unfortunately, the other problem here is the use of Array#sample. I think I understand why you're doing it, but it means you're not doing an exact comparison between the two. I think you probably want to just generate the sample outside the benchmark loop and use that to ensure both methods being tested are subject to the same data locality and the random function doesn't add random latency to the call.

lazebny commented 7 years ago

Think i do something wrong with running this benchmark because i have significant difference:

+ Array#include?(int)    605.837k (± 2.0%) i/s -      3.064M in   5.060286s
+    Array#index(int)      2.825M (± 3.0%) i/s -     14.177M in   5.023919s

It was the main reason of this comparison. I use Ubuntu 16.04:

sudo lscpu

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 58
Model name:            Intel(R) Core(TM) i5-3340M CPU @ 2.70GHz
Stepping:              9
CPU MHz:               1492.488
CPU max MHz:           3400,0000
CPU min MHz:           1200,0000
BogoMIPS:              5382.20
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              3072K
NUMA node0 CPU(s):     0-3
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts

Feel free to close this PR if i am wrong.

nirvdrum commented 7 years ago

Interesting. Can you try running just this one? I've isolated to just the int case and pulled the random call out of the hot loop. But to ensure we do use those random values, I've added iteration over the shuffled array:

require 'benchmark/ips'

ints = [*1..100]
int_samples = ints.shuffle

Benchmark.ips do |x|
  x.report('Array#include?(int)') { int_samples.each { |i| ints.include?(i) } }
  x.report('Array#index(int)') { int_samples.each { |i| ints.index(i) } }

  x.compare!
end

I'm really curious as to what your results look like with that.

lazebny commented 7 years ago

Here they are:

ruby 2.1.9p490 (2016-03-30 revision 54437) [x86_64-linux]
Warming up --------------------------------------
 Array#include?(int)   621.000  i/100ms
    Array#index(int)     3.760k i/100ms
Calculating -------------------------------------
 Array#include?(int)      6.318k (± 4.2%) i/s -     31.671k in   5.022748s
    Array#index(int)     38.544k (± 3.0%) i/s -    195.520k in   5.077354s

Comparison:
    Array#index(int):    38543.6 i/s
 Array#include?(int):     6318.3 i/s - 6.10x  slower
lazebny commented 7 years ago

Looks like it depends on ruby version:

ruby 2.3.0p0 (2015-12-25 revision 53290) [x86_64-linux]
Warming up --------------------------------------
 Array#include?(int)     3.088k i/100ms
    Array#index(int)     3.025k i/100ms
Calculating -------------------------------------
 Array#include?(int)     30.804k (± 5.0%) i/s -    154.400k in   5.029237s
    Array#index(int)     30.384k (± 1.4%) i/s -    154.275k in   5.078497s

Comparison:
 Array#include?(int):    30803.7 i/s
    Array#index(int):    30384.3 i/s - same-ish: difference falls within error
nirvdrum commented 7 years ago

Ahh. That would do it. Array#include? was improved for Ruby 2.2.0. In particular, you'll see a difference with anything that has special-handling in rb_equal_opt, such as ints & floats.

lazebny commented 7 years ago

Cool, good to know. I'll close this PR.

nirvdrum commented 7 years ago

No problem. Thanks for submitting. And I'm glad we were able to track down the issue.

mblumtritt commented 7 years ago

Cool findings - thanks guys!