Open corsonknowles opened 2 months ago
Some of the gain here comes from simply not allocating the block, but I think most of it comes from the lower-level conversion of the array into an array of arrays.
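For context, a minimal demonstration (not from the PR itself) that zip called without arguments yields the same array of arrays as the map form:

```ruby
array = [1, 2, 3]
mapped = array.map { |id| [id] }  # wraps each element in its own array
zipped = array.zip                # zip with no arguments does the same
p mapped # => [[1], [2], [3]]
p zipped # => [[1], [2], [3]]
```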
Dear @koic and @Earlopain, Are there any questions I can answer for you or modifications I can provide?
Thank you so much for your consideration!
I really appreciate the thoughtful comments @Earlopain
Fully revised and ready for review.
I had run the original version against 50,000 files. I then added 2 more tests and diversified the input slightly to build confidence in the new on_send approach. I expect this is safe to merge now.
Thanks for the benchmark-ips tip. I added some data using it for large and small arrays.
First off, this will be a nice improvement for our code base, as we often use and recommend map { |id| [id] } for Sidekiq.perform_bulk. Disclaimer: I work with @corsonknowles 😁
I ran some of my own benchmarks, based on @corsonknowles's from the PR body, against small (4), medium (1000) and big (50k) arrays using benchmark-ips, and also added benchmark-memory for good measure. At a high level, we have:
I'm kinda surprised, but memory usage ends up the same, which I'm not quite sure how to explain yet 😅
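One way to see why the memory numbers match (a rough, MRI-specific sketch, not from the thread): count heap allocations with GC.stat. Both forms allocate one outer result array plus one single-element inner array per item, so the totals come out the same.

```ruby
array = (1..1000).to_a

GC.start
GC.disable

before = GC.stat(:total_allocated_objects)
array.map { |id| [id] }
map_allocs = GC.stat(:total_allocated_objects) - before

before = GC.stat(:total_allocated_objects)
array.zip
zip_allocs = GC.stat(:total_allocated_objects) - before

GC.enable

puts map_allocs # roughly 1001: the outer array plus 1000 inner arrays
puts zip_allocs # the same count, matching the benchmark-memory output
```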
Here is the benchmark:
require 'benchmark/ips'
require 'benchmark/memory'

arrays = {
  small: (1..4).to_a,
  medium: (1..1000).to_a,
  big: (1..50000).to_a,
}

arrays.each do |size, array|
  puts "=== #{size} ==="

  Benchmark.ips do |x|
    x.report(".map { |id| [id] }:") do
      array.map { |id| [id] }
    end

    x.report(".zip:") do
      array.zip
    end

    x.compare! order: :baseline
  end

  Benchmark.memory do |x|
    x.report(".map { |id| [id] }:") do
      array.map { |id| [id] }
    end

    x.report(".zip:") do
      array.zip
    end
  end
end
And the results:
=== small ===
ruby 3.3.2 (2024-05-30 revision e5a195edf6) [arm64-darwin23]
Warming up --------------------------------------
.map { |id| [id] }: 405.112k i/100ms
.zip: 590.068k i/100ms
Calculating -------------------------------------
.map { |id| [id] }: 4.029M (± 1.1%) i/s (248.17 ns/i) - 20.256M in 5.027463s
.zip: 5.846M (± 6.7%) i/s (171.06 ns/i) - 29.503M in 5.085950s
Comparison:
.map { |id| [id] }:: 4029499.2 i/s
.zip:: 5845845.8 i/s - 1.45x faster
Calculating -------------------------------------
.map { |id| [id] }: 240.000 memsize ( 0.000 retained)
5.000 objects ( 0.000 retained)
0.000 strings ( 0.000 retained)
.zip: 240.000 memsize ( 0.000 retained)
5.000 objects ( 0.000 retained)
0.000 strings ( 0.000 retained)
=== medium ===
ruby 3.3.2 (2024-05-30 revision e5a195edf6) [arm64-darwin23]
Warming up --------------------------------------
.map { |id| [id] }: 2.042k i/100ms
.zip: 3.533k i/100ms
Calculating -------------------------------------
.map { |id| [id] }: 20.169k (± 1.7%) i/s (49.58 μs/i) - 102.100k in 5.063731s
.zip: 35.106k (± 1.6%) i/s (28.48 μs/i) - 176.650k in 5.033169s
Comparison:
.map { |id| [id] }:: 20169.1 i/s
.zip:: 35106.3 i/s - 1.74x faster
Calculating -------------------------------------
.map { |id| [id] }: 48.040k memsize ( 0.000 retained)
1.001k objects ( 0.000 retained)
0.000 strings ( 0.000 retained)
.zip: 48.040k memsize ( 0.000 retained)
1.001k objects ( 0.000 retained)
0.000 strings ( 0.000 retained)
=== big ===
ruby 3.3.2 (2024-05-30 revision e5a195edf6) [arm64-darwin23]
Warming up --------------------------------------
.map { |id| [id] }: 44.000 i/100ms
.zip: 64.000 i/100ms
Calculating -------------------------------------
.map { |id| [id] }: 425.048 (± 0.7%) i/s (2.35 ms/i) - 2.156k in 5.072641s
.zip: 818.393 (± 2.3%) i/s (1.22 ms/i) - 4.096k in 5.007817s
Comparison:
.map { |id| [id] }:: 425.0 i/s
.zip:: 818.4 i/s - 1.93x faster
Calculating -------------------------------------
.map { |id| [id] }: 2.400M memsize ( 0.000 retained)
50.001k objects ( 0.000 retained)
0.000 strings ( 0.000 retained)
.zip: 2.400M memsize ( 0.000 retained)
50.001k objects ( 0.000 retained)
0.000 strings ( 0.000 retained)
Thanks @Earlopain!
on_send now uses RangeHelp to get the corrector range to match both the method call and the block, but not map's receiver. I also updated the RESTRICT_ON_SEND placement, which seems to be the norm here for constant declaration order.
For the very curious, .zip is also faster than .map! { [_1] }.
.each_with_object([]) { |id, object| object << [id] } is among the slowest of the reasonable strategies.
I added it to my benchmark, and it is slower AND takes a lot more memory:
=== big ===
ruby 3.3.2 (2024-05-30 revision e5a195edf6) [arm64-darwin23]
Warming up --------------------------------------
.map { |id| [id] }: 43.000 i/100ms
.zip: 64.000 i/100ms
.each_with_object([]) {|id, object| object << [id]}:
32.000 i/100ms
Calculating -------------------------------------
.map { |id| [id] }: 458.271 (± 1.3%) i/s (2.18 ms/i) - 2.322k in 5.067609s
.zip: 622.430 (± 7.7%) i/s (1.61 ms/i) - 3.136k in 5.087861s
.each_with_object([]) {|id, object| object << [id]}:
327.959 (± 2.4%) i/s (3.05 ms/i) - 1.664k in 5.077224s
Comparison:
.map { |id| [id] }:: 458.3 i/s
.zip:: 622.4 i/s - 1.36x faster
.each_with_object([]) {|id, object| object << [id]}:: 328.0 i/s - 1.40x slower
Calculating -------------------------------------
.map { |id| [id] }: 2.400M memsize ( 0.000 retained)
50.001k objects ( 0.000 retained)
0.000 strings ( 0.000 retained)
.zip: 2.400M memsize ( 0.000 retained)
50.001k objects ( 0.000 retained)
0.000 strings ( 0.000 retained)
.each_with_object([]) {|id, object| object << [id]}:
2.454M memsize ( 0.000 retained)
50.001k objects ( 0.000 retained)
0.000 strings ( 0.000 retained)
@koic Is there anything else I can do to help get this ready to merge?
@bbatsov @rrosenblum I'm looking for committers. Are you able to take a look at this PR?
Thanks so much.
This is a pretty cool optimization for a specific use case.
I think it would be great to add at least 1 test for a multi-line example
[1, 2, 3].map do |id| [id] end
Done!
@koic @rrosenblum @Earlopain Small update here, I renamed this cop in line with:
Performance Cop for the more efficient way to generate an Array of Arrays.
.zip is faster than .map to iteratively wrap array contents, roughly 1.4x to 1.9x depending on size.
.zip is also faster than .map! and each_with_object.
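As a small sketch (not from the PR), all four strategies discussed in this thread produce the same array of arrays:

```ruby
array = [1, 2, 3]
a = array.map { |id| [id] }
b = array.zip
c = array.dup.map! { [_1] }                              # mutating variant, run on a copy
d = array.each_with_object([]) { |id, obj| obj << [id] } # slowest of the four
p [a, b, c, d].uniq # => [[[1], [2], [3]]] -- all four results are equal
```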
This optimization is particularly helpful in performance-sensitive paths or in large-scale applications that need to generate arrays of arrays at scale. For example, leveraging the bulk enqueuing feature in Sidekiq requires an array of arrays, and it is by definition used at scale.
A performance gain is present for all sizes of arrays and ranges. The gain is smallest with small ranges, but still significant for small arrays.
This is not a style cop, but it is poetic that the more performant approach also has the simpler and shorter syntax.
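To illustrate the Sidekiq shape mentioned above: perform_bulk expects an array of argument arrays, which is exactly what zip produces from a flat list of ids. The worker class here is hypothetical, and the perform_bulk call is commented out since it needs a running Sidekiq setup.

```ruby
# Each job gets one argument array, e.g. perform(101), perform(102), ...
ids = [101, 102, 103]
job_args = ids.zip
p job_args # => [[101], [102], [103]]
# HardWorker.perform_bulk(job_args)  # hypothetical worker class
```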
.zip has been intentionally optimized in Ruby. This has been discussed publicly since at least 2012:
Official .zip documentation
Source code for .zip
This performance cop isn't just an announcement to use .zip in the spirit of appreciating Ruby's great features; it is also a useful tool for leveraging RuboCop to clean up and add rigor to large Ruby code bases. RuboCop is a much better approach than pattern matching for cleanup at scale here, and it comes with the added benefit of proactive feedback to users as additions are made to the code base going forward.
For Arrays with 1000 entries, a common size for bulk operations, .zip performs about 70% faster, as measured with benchmark-ips:
Before submitting the PR make sure the following are checked:
Commit message starts with [Fix #issue-number] (if the related issue exists).
Feature branch is up-to-date with master (if not - rebase it).
Ran bundle exec rake default. It executes all tests and runs RuboCop on its own code.
Added an entry (file) to the changelog folder named {change_type}_{change_description}.md if the new code introduces user-observable changes. See changelog entry format for details.