ruby / set

This library provides the Set class, which deals with a collection of unordered values with no duplicates.
BSD 2-Clause "Simplified" License
23 stars 13 forks source link

Avoid the `block or return` pattern to save Proc allocations #29

Closed casperisfine closed 1 year ago

casperisfine commented 1 year ago

Using the block param in a boolean context like this cause it to be allocated.

Using it with an if or unless was optimized in 3.2 (https://github.com/ruby/ruby/pull/6286) but using it with or or and wasn't.

def foo(&block)
  block or return 1
end

puts RubyVM::InstructionSequence.of(method(:foo)).disasm
== disasm: #<ISeq:foo@(irb):11 (11,0)-(13,3)> (catch: false)
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: 0, kw: -1@-1, kwrest: -1])
[ 1] block@0<Block>
0000 getblockparam                          block@0, 0                (  12)[LiCa]
0003 dup
0004 branchif                               10
0006 pop
0007 putobject_INT2FIX_1_
0008 leave                                  [Re]
0009 putnil
0010 leave

versus

def foo(&block)
  return 1 if block
end

puts RubyVM::InstructionSequence.of(method(:foo)).disasm
== disasm: #<ISeq:foo@(irb):15 (15,0)-(17,3)> (catch: false)
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: 0, kw: -1@-1, kwrest: -1])
[ 1] block@0<Block>
0000 getblockparamproxy                     block@0, 0                (  16)[LiCa]
0003 branchunless                           7
0005 putobject_INT2FIX_1_
0006 leave                                                            (  17)[Re]
0007 putnil                                                           (  16)
0008 leave

That block_given? or pattern was used in most other methods, but not these 3 for some reason.

This patch has a fairly big impact on our app:

allocated objects by location
-----------------------------------
# ...
    257373  ruby/lib/ruby/3.1.0/set.rb:510

FYI @jhawthorn cc @knu @hsbt @marcandre

jhawthorn commented 1 year ago

(I'm late to reply 😅) There's on harm in this and it will be faster on older rubies but in Ruby 3.2 we do actually avoid the block allocation as long as it can't be returned.

$ ruby --dump=insns -e 'def foo(&block); block or return 1; end'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,39)> (catch: false)
0000 definemethod                           :foo, foo                 (   1)[Li]
0003 putobject                              :foo
0005 leave

== disasm: #<ISeq:foo@-e:1 (1,0)-(1,39)> (catch: false)
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: 0, kw: -1@-1, kwrest: -1])
[ 1] block@0<Block>
0000 getblockparam                          block@0, 0                (   1)[LiCa]
0003 dup
0004 branchif                               10
0006 pop
0007 putobject_INT2FIX_1_
0008 leave                                  [Re]
0009 putnil
0010 leave                                  [Re]

We use getblockparam because it will be returned if non-null. However if we don't implicitly return it, then the peephole optimizer will see that it's only used as a boolean and use getblockparamproxy instead

$ ruby --dump=insns -e 'def foo(&block); block or return 1; nil; end'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,44)> (catch: false)
0000 definemethod                           :foo, foo                 (   1)[Li]
0003 putobject                              :foo
0005 leave

== disasm: #<ISeq:foo@-e:1 (1,0)-(1,44)> (catch: false)
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: 0, kw: -1@-1, kwrest: -1])
[ 1] block@0<Block>
0000 getblockparamproxy                     block@0, 0                (   1)[LiCa]
0003 branchif                               7
0005 putobject_INT2FIX_1_
0006 leave                                  [Re]
0007 putnil
0008 leave                                  [Re]

None of that's to say we shouldn't have this change. I think it's more readable (and faster on 3.1 as demonstrated).

byroot commented 1 year ago

👍 good catch.