Return nil from assignment benchmarks

haileys commented 9 years ago

The parallel assignment benchmark only allocates an array because the assignment expression is the last expression in the method (and so its result is returned to the caller). If Ruby detects that an expression's result is unused (as would be the case with most assignment expressions), it will avoid allocating the array and just assign directly.
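To make the distinction concrete, here is a minimal sketch of the two method shapes being discussed. The method names are hypothetical and are not taken from the benchmark file; the only point is where the assignment sits relative to the end of the method.

```ruby
# Hypothetical method names, for illustration only.

# The assignment is the last expression, so its value (the right-hand
# side, as an array) becomes the method's return value.
def assignment_as_return_value
  a, b, c = 1, 2, 3
end

# The assignment's value is unused; the method returns nil instead.
def assignment_then_nil
  a, b, c = 1, 2, 3
  nil
end

p assignment_as_return_value
p assignment_then_nil
```

Only the first method hands the assignment's value back to the caller, which is why it is the one that needs an array.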

Inspecting the bytecode generated for parallel assignment in both cases shows off this optimisation:

λ ruby --dump=insns -e 'a, b, c = 1, 2, 3'
== disasm: <RubyVM::InstructionSequence:<main>@-e>======================
local table (size: 4, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, keyword: 0@5] s1)
[ 4] a          [ 3] b          [ 2] c
0000 trace            1                                               (   1)
0002 duparray         [1, 2, 3]
0004 dup
0005 expandarray      3, 0
0008 setlocal_OP__WC__0 4
0010 setlocal_OP__WC__0 3
0012 setlocal_OP__WC__0 2
0014 leave

λ ruby --dump=insns -e 'a, b, c = 1, 2, 3; nil'
== disasm: <RubyVM::InstructionSequence:<main>@-e>======================
local table (size: 4, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, keyword: 0@5] s1)
[ 4] a          [ 3] b          [ 2] c
0000 trace            1                                               (   1)
0002 putobject_OP_INT2FIX_O_1_C_
0003 putobject        2
0005 putobject        3
0007 setlocal_OP__WC__0 2
0009 setlocal_OP__WC__0 3
0011 setlocal_OP__WC__0 4
0013 trace            1
0015 putnil
0016 leave

In the first bytecode dump, Ruby has no option but to allocate the array, because the assignment is the last expression and its value is returned. Even so, the Ruby VM is clever enough to see that the elements of the array have no evaluation side effects, so it can just dup a pre-allocated array rather than building a new one from scratch.

In the second bytecode dump, because the assignment is not the last expression and does not need to return a value, the Ruby VM takes a shortcut and just assigns the variables directly rather than creating an array. In fact, this is even faster than splitting these assignments out over multiple lines because the compiler does not need to emit a per-line trace instruction for each assignment.
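The same comparison can be made from inside Ruby rather than on the command line. This is a rough sketch that assumes MRI, since `RubyVM::InstructionSequence` is not available on other implementations. The `disasm` helper is a hypothetical name, and the instructions emitted (and whether this optimisation applies at all) may differ between Ruby versions, so the script reports what it finds rather than assuming a result.

```ruby
# MRI-only: RubyVM::InstructionSequence is implementation-specific.
# Hypothetical helper that compiles a snippet and returns its disassembly.
def disasm(source)
  RubyVM::InstructionSequence.compile(source).disasm
end

value_used   = disasm("a, b, c = 1, 2, 3")
value_unused = disasm("a, b, c = 1, 2, 3; nil")

{ "value used" => value_used, "value unused" => value_unused }.each do |label, dump|
  # expandarray is the instruction that unpacks an array into its elements,
  # so its presence suggests an array was materialised for the assignment.
  puts "#{label}: expandarray present? #{dump.include?('expandarray')}"
end
```

Printing `value_used` and `value_unused` in full gives output comparable to the `--dump=insns` listings above.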

Here are the benchmark results before changing the benchmarked methods to return nil:

Calculating -------------------------------------
 Parallel Assignment    81.224k i/100ms
Sequential Assignment
                       101.963k i/100ms
-------------------------------------------------
 Parallel Assignment      2.610M (± 3.2%) i/s -     13.077M
Sequential Assignment
                          6.113M (± 3.7%) i/s -     30.589M

Comparison:
Sequential Assignment:  6113471.7 i/s
 Parallel Assignment:  2609542.4 i/s - 2.34x slower

The benchmark results after changing the benchmarked methods to return nil show a clear improvement in favour of parallel assignment:

Calculating -------------------------------------
 Parallel Assignment   103.520k i/100ms
Sequential Assignment
                       105.474k i/100ms
-------------------------------------------------
 Parallel Assignment      7.049M (± 2.7%) i/s -     35.300M
Sequential Assignment
                          6.159M (± 2.3%) i/s -     30.798M

Comparison:
 Parallel Assignment:  7048523.9 i/s
Sequential Assignment:  6159203.7 i/s - 1.14x slower
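For anyone wanting to try a comparison like this without extra gems, the following is a rough sketch using only the standard library's `Benchmark` module. It is not the benchmark from this repository: the method names, the number of variables, and the iteration count are all assumptions chosen for illustration, and absolute timings will vary by machine and Ruby version.

```ruby
require "benchmark"

# Hypothetical methods; each returns nil so that the assignment's
# value is never used by the caller.
def parallel_assignment
  a, b = 1, 2
  nil
end

def sequential_assignment
  a = 1
  b = 2
  nil
end

iterations = 200_000 # arbitrary; chosen only to keep the run short

# Benchmark.realtime returns the elapsed wall-clock time in seconds.
parallel_time   = Benchmark.realtime { iterations.times { parallel_assignment } }
sequential_time = Benchmark.realtime { iterations.times { sequential_assignment } }

puts format("parallel:   %.4fs", parallel_time)
puts format("sequential: %.4fs", sequential_time)
```

A single timed loop like this is much noisier than an iterations-per-second benchmark with warm-up, so treat small differences between the two numbers with caution.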

JuanitoFatas commented 9 years ago

@charliesome Thank you for your detailed explanations. :bow: I was so wrong :sweat:

etiennebarrie commented 9 years ago

So now the fast one is called slow and vice versa, right?

haileys commented 9 years ago

@etiennebarrie Oh heh, yeah I forgot to update that

JuanitoFatas commented 9 years ago

@etiennebarrie Fixed in https://github.com/JuanitoFatas/fast-ruby/commit/53894bf4e7c91af36a27a65cc0ae2decd368a743, thanks!

fastruby / fast-ruby

Return nil from assignment benchmarks #50