Suggestion from Performance/BlockGivenWithExplicitBlock is often slower

Performance/BlockGivenWithExplicitBlock provides bad guidance and should be removed. The benchmark which was given when it was introduced didn't test the case that a block was given, when that happens block is significantly slower than the other option.

For Ruby 3.2+ I made if block faster, but prior to that the advice from this cop was always wrong. Even in 3.2+ I believe it's bad advice as there's only a very narrow usage for which block is (very slightly) faster, and even in those cases it's likely a future refactor the user will accidentally introduce the slow case (ex. moving from if block to if block && something_else)

require 'bundler/inline'

gemfile(true) do
  gem 'benchmark-ips'
end

def if_block(&block)
  !!block
end

def if_block_given(&block)
  !!block_given?
end

Benchmark.ips do |x|
  x.report('block')                  { if_block }
  x.report('block w/ block')         { if_block {} }
  x.report('block_given?')           { if_block_given }
  x.report('block_given? w/ block')  { if_block_given {} }
  x.compare!
end

Warming up --------------------------------------
            block     1.612M i/100ms
   block w/ block   508.391k i/100ms
     block_given?     1.309M i/100ms
block_given? w/ block
                         1.068M i/100ms
Calculating -------------------------------------
            block     15.804M (± 1.8%) i/s -     79.000M in   5.000346s
   block w/ block      5.017M (± 0.5%) i/s -     25.420M in   5.067106s
     block_given?     13.297M (± 1.7%) i/s -     66.746M in   5.021282s
block_given? w/ block
                         10.745M (± 0.6%) i/s -     54.450M in   5.067787s

Comparison:
            block: 15804072.8 i/s
     block_given?: 13296844.8 i/s - 1.19x  slower
block_given? w/ block: 10744860.7 i/s - 1.47x  slower
   block w/ block:  5016692.5 i/s - 3.15x  slower

To be fair, the assumption is that with block_given? you can often avoid declaring &block:

require 'bundler/inline'

gemfile(true) do
  gem 'benchmark-ips'
end

def if_block(&block)
  !!block
end

def if_block_given
  block_given?
end

Benchmark.ips do |x|
  x.report('block')                  { if_block }
  x.report('block w/ block')         { if_block {} }
  x.report('block_given?')           { if_block_given }
  x.report('block_given? w/ block')  { if_block_given {} }
  x.compare!(order: :baseline)
end

ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
Resolving dependencies...
Warming up --------------------------------------
               block     1.343M i/100ms
      block w/ block   513.029k i/100ms
        block_given?     1.465M i/100ms
block_given? w/ block
                         1.457M i/100ms
Calculating -------------------------------------
               block     13.177M (± 2.0%) i/s -     67.134M in   5.096854s
      block w/ block      5.471M (± 4.3%) i/s -     27.704M in   5.074617s
        block_given?     14.664M (± 0.8%) i/s -     74.736M in   5.096843s
block_given? w/ block
                         14.537M (± 0.8%) i/s -     72.851M in   5.011781s

Comparison:
               block: 13176873.3 i/s
        block_given?: 14664219.5 i/s - 1.11x  faster
block_given? w/ block: 14537095.2 i/s - 1.10x  faster
      block w/ block:  5470680.3 i/s - 2.41x  slower

But even then, the difference is too small to be worth in unless you enter the case where the Proc has to be instantiated.

require 'bundler/inline'

gemfile(true) do
  gem 'benchmark-ips'
end

def if_block(&block)
  block ? true : false
end

def if_block_given
  block_given?
end

Benchmark.ips do |x|
  x.report('block')                  { if_block }
  x.report('block w/ block')         { if_block {} }
  x.report('block_given?')           { if_block_given }
  x.report('block_given? w/ block')  { if_block_given {} }
  x.compare!(order: :baseline)
end

Warming up --------------------------------------
               block     1.428M i/100ms
      block w/ block     1.321M i/100ms
        block_given?     1.456M i/100ms
block_given? w/ block
                         1.451M i/100ms
Calculating -------------------------------------
               block     14.424M (± 1.6%) i/s -     72.851M in   5.052055s
      block w/ block     13.159M (± 1.4%) i/s -     66.037M in   5.019221s
        block_given?     14.549M (± 1.4%) i/s -     72.786M in   5.003860s
block_given? w/ block
                         14.404M (± 1.3%) i/s -     72.529M in   5.036180s

Comparison:
               block: 14424088.9 i/s
        block_given?: 14548868.5 i/s - same-ish: difference falls within error
block_given? w/ block: 14404275.0 i/s - same-ish: difference falls within error
      block w/ block: 13159479.3 i/s - 1.10x  slower

To be fair, the assumption is that with block_given? you can often avoid declaring &block:

That's another point in favour of removing the linter, which tells you to replace block_given? with block.

ROFL, I just assumed it was advocating for block_given?, I couldn't imagine &block would ever be faster. Same perf at best.

So yeah, definite 👍

Following this Cop's advice produces significantly worse results on Ruby < 3.2 if a method forwards checks whether a block was given and then just forwards it. Consider the following example where we may call one of the outer methods either with a block or without one:

# frozen_string_literal: true

require 'bundler/inline'
gemfile(true) do
  gem 'benchmark-ips'
end

def each
  3.times { |i| yield i }
end

def outer1(&block)
 return enum_for(__method__) unless block_given?
 each(&block)
end

def outer2(&block)
 return enum_for(__method__) unless block
 each(&block)
end

Benchmark.ips do |x|
  x.report('outer1') { outer1 { } }
  x.report('outer2') { outer2 { } }

  x.compare!
end

With Ruby < 2.6 (I tested down to Ruby 2.3), both outer methods are about equally slow since the forwarded block is always materialized as a Proc when calling the each method:

ruby 2.3.8p459 (2018-10-18 revision 65136) [x86_64-darwin17]
Warming up --------------------------------------
              outer1   174.019k i/100ms
              outer2   135.977k i/100ms
Calculating -------------------------------------
              outer1      1.398M (± 7.2%) i/s -      7.135M in   5.129078s
              outer2      1.547M (±13.5%) i/s -      7.615M in   5.022678s

Comparison:
              outer2:  1547464.0 i/s
              outer1:  1397649.7 i/s - same-ish: difference falls within error

With Ruby >= 2.6, < 3.2 however, the block is not strictly materialized as a Proc when forwarding anymore. Thus, when passing block, outer2 is roughly 2 times slower than outer1:

ruby 2.6.4p104 (2019-08-28 revision 67798) [x86_64-darwin17]
Warming up --------------------------------------
              outer1   391.625k i/100ms
              outer2   199.534k i/100ms
Calculating -------------------------------------
              outer1      3.861M (± 3.0%) i/s -     19.581M in   5.076225s
              outer2      1.947M (± 4.1%) i/s -      9.777M in   5.030709s

Comparison:
              outer1:  3861370.8 i/s
              outer2:  1947189.7 i/s - 1.98x  (± 0.00) slower

ruby 3.1.3p185 (2022-11-24 revision 1a6b16756e) [x86_64-darwin19]
Warming up --------------------------------------
              outer1   271.669k i/100ms
              outer2   155.330k i/100ms
Calculating -------------------------------------
              outer1      2.712M (± 1.3%) i/s -     13.583M in   5.009887s
              outer2      1.476M (± 9.8%) i/s -      7.301M in   5.006872s

Comparison:
              outer1:  2711812.2 i/s
              outer2:  1475957.7 i/s - 1.84x  slower

Only from Ruby 3.2 on, this improved and there is no actual difference between both options anymore — until you unwittingly introduce another pattern which again materializes the block as indicated by @jhawthorn above.

ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin23]
Warming up --------------------------------------
              outer1   296.872k i/100ms
              outer2   310.487k i/100ms
Calculating -------------------------------------
              outer1      2.737M (± 7.4%) i/s -     13.656M in   5.018464s
              outer2      2.893M (± 6.1%) i/s -     14.593M in   5.065345s

Comparison:
              outer2:  2892991.1 i/s
              outer1:  2737445.0 i/s - same-ish: difference falls within error

Thus, at the very least you should restrict the cop to be only active in Ruby >= 3.2.0. Here, it doesn't actively cause harm at least, although it doesn't provide much benefit too. As such, I'd also be happy if the cop is fully disabled by default.

I openend #466 to disable by default. I don't think there's value in trying to update it into something that may give worse results in the next ruby version yet again. There's been some great performance work in Ruby itself lately and I wouldn't be surprised if these difference versions continue to converge. With YJIT, benchmark-ips doesn't even show the difference anymore (except that the correction is even worse)

rubocop / rubocop-performance

Suggestion from Performance/BlockGivenWithExplicitBlock is often slower #385