Open monfresh opened 7 years ago
In your sample you sort the ARRAY
twice - it's already sorted when the second sample runs. The results may be therefore wrong…
I don't follow. ARRAY is not mutated. If you call ARRAY.sort_by(&:length)
, and then call ARRAY
again, you get the original ARRAY. You can verify this by adding .freeze
to the end of the ARRAY assignment (ARRAY = %w(apple pear fig).freeze
)
I get the same results when I change the order of the samples:
Warming up --------------------------------------
sort 121.953k i/100ms
sort_by 59.477k i/100ms
Calculating -------------------------------------
sort 1.898M (± 3.8%) i/s - 9.512M in 5.018772s
sort_by 704.695k (±11.5%) i/s - 3.450M in 5.002361s
Comparison:
sort: 1898186.9 i/s
sort_by: 704694.9 i/s - 2.69x slower
The code example in this repo also does the same thing: it defines the array to be used in the samples as a constant: https://github.com/JuanitoFatas/fast-ruby/blob/master/code/enumerable/sort-vs-sort_by.rb
@monfresh Thanks for pointing this out. There are two ways to resolve into the current example,
@JuanitoFatas Any comments about above?
@monfresh Sorry - I used in my local sample sort!
and sort_by!
and did ignore your code details. Sorry …
I just ran into a situation where sort_by
is not faster, per se, in fact 2.95x slower. But it has something to do with the number of items in the array.
require 'benchmark/ips'
records = SomeModel.group(:some_column).sum(:value)
Benchmark.ips do |x|
x.report("with #sort") do |n|
n.times { records.sort { |x, y| y[1] <=> x[1] } }
end
x.report("with #sort_by") do |n|
n.times { records.sort_by(&:last) }
end
# Compare the iterations per second of the various reports!
x.compare!
end
Yields
Warming up --------------------------------------
with #sort 95.145k i/100ms
with #sort_by 41.329k i/100ms
Calculating -------------------------------------
with #sort 1.350M (±17.9%) i/s - 6.280M in 5.040178s
with #sort_by 457.236k (±27.8%) i/s - 2.025M in 5.034366s
Comparison:
with #sort: 1350318.7 i/s
with #sort_by: 457235.7 i/s - 2.95x slower
This is with 2 items in the result. When I bump this to 100:
Warming up --------------------------------------
with #sort 1.029k i/100ms
with #sort_by 1.365k i/100ms
Calculating -------------------------------------
with #sort 11.344k (±14.4%) i/s - 55.566k in 5.046672s
with #sort_by 14.750k (±14.4%) i/s - 72.345k in 5.023649s
Comparison:
with #sort_by: 14750.3 i/s
with #sort: 11344.5 i/s - same-ish: difference falls within error
Apparently there is a correlation between the number of items and the performance difference, which makes sense of course. Adding items to the array brings us closer to the 'true' performance difference between sort
and sort_by
.
So lets bump to 1000:
Calculating -------------------------------------
with #sort 667.572 (±16.6%) i/s - 3.245k in 5.017742s
with #sort_by 953.283 (±15.1%) i/s - 4.692k in 5.043402s
Comparison:
with #sort_by: 953.3 i/s
with #sort: 667.6 i/s - 1.43x slower
So at 1000 items sort_by
becomes faster.
I don't know how we'd go about this, but adding a note about the expected number of items in the array would make sense.
🙏
Per the Ruby 2.4.0 docs:
Here's a concrete example showing
sort
to be 2.70x faster in Ruby 1.9.3, 2.3.3 and 2.4.0: