Closed xaviershay closed 9 years ago
https://github.com/hamstergem/hamster/commits/933781fa222e680995132b492803f6cbb740c6ae appears to be the first bad commit.
Maybe related to https://bugs.ruby-lang.org/issues/11071 ?
Just duplicated this with 2.3.0-dev. "Stack consistency error (sp: 47, bp: 48)". Intriguing!
I have a Ruby interpreter which I have built with full debug info, which makes the stack trace much better:
-- C level backtrace information -------------------------------------------
/home/alex/Software/ruby/ruby(rb_vm_bugreport+0x51f) [0x55e28ea69acf] vm_dump.c:695
/home/alex/Software/ruby/ruby(rb_bug+0xca) [0x55e28ead4f8a] error.c:414
/home/alex/Software/ruby/ruby(vm_exec_core+0x59a6) [0x55e28ea5b856] insns.def:1031
/home/alex/Software/ruby/ruby(vm_exec+0x72) [0x55e28ea5b902] vm.c:1472
/home/alex/Software/ruby/ruby(invoke_block_from_c+0x5d4) [0x55e28ea5c634] vm.c:850
/home/alex/Software/ruby/ruby(rb_yield+0x68) [0x55e28ea5c9e8] vm.c:888
/home/alex/Software/ruby/ruby(rb_ary_each+0x3d) [0x55e28ea8609d] array.c:1820
I've found what is causing this error. It would be nice if it was just a simple coding error, but unfortunately it seems to be more of a conceptual error in the design of the block-calling and argument-passing mechanism. If only I could wrap my mind around the right way to fix it up...
OK, sigh of relief. It seems that the fix wasn't as tricky as I had thought. Have a look: https://bugs.ruby-lang.org/issues/11451
@xaviershay, I seriously doubted that this had anything to do with https://bugs.ruby-lang.org/issues/11071, but it turns out you were right! My patch also fixes that one.
@xaviershay, I said it in the Ruby core bug tracker, but I'll say it here too: thanks for reporting this problem!
If and when you run across a similar problem in the future, try to eliminate all gems and get a repro script which uses only core Ruby. You can do this by repeatedly inlining gem method calls (by hand), and checking each time to see if the problem still occurs (and backing up if it doesn't).
After each inlining, you will usually be able to delete some unneeded code and simplify what remains. After each change, check again to see if the script still crashes or not. If inlining makes the crash "disappear" -- in other words, if the method call itself is part of the problem -- just copy-and-paste the gem code into the repro script. Then remove as much as you can, while checking that the crash still occurs.
Your repro script was already quite small, which was a big help. But I reduced it further down to this:
class Yielder
def each
yield [1,2]
end
end
class Getter1
include Enumerable
def each(&block)
Yielder.new.each(&block)
end
end
class Getter2
include Enumerable
def each
Yielder.new.each { |a,b| yield(a) }
end
end
Getter1.new.map { Getter2.new.map {} }
Yielder
was originally Hamster::Trie
, Getter1
was Hamster::Hash
, and Getter2
was Hamster::Set
. The core Ruby devs will appreciate it if they don't have to wade through 1000s of lines of gem code to track down a crash. (More realistically, they may actually look into it rather than ignoring it.)
So, having said all that, now we can look at fixing up Hamster. I would suggest that we apply @xaviershay's fix, with a comment stating that the modified code is for MRI 2.2.2 compatibility and that the change will be reverted when 2.2.2 is sufficiently old. (Because the changed code will almost certainly be slower.) Whenever that time comes, we can add a warning message if someone uses the gem on MRI 2.2.2.
Comments from other contributors? @dubek @krainboltgreene
Nice find!
Sold.
On Mon, Aug 17, 2015 at 9:41 AM, Xavier Shay notifications@github.com wrote:
Nice find!
— Reply to this email directly or view it on GitHub https://github.com/hamstergem/hamster/issues/189#issuecomment-131844954.
Kurtis Rainbolt-Greene, Hacker Software Developer 1631 8th St. New Orleans, LA 70115
Micro-benchmark suggesting expected size of performance hit. Would be the difference between the final two cases, though obviously actual benchmark of this code is needed:
require 'benchmark/ips'
def a
yield
yield
end
def b(&block)
yield
yield
end
def c(&block)
yield
block.call
end
def d(&block)
block.call
block.call
end
Benchmark.ips do |x|
y = 0
x.report("yield only") { a { y + 1 } }
x.report("yield with &block") { b { y + 1 } }
x.report("mixed with &block") { c { y + 1 } }
x.report("&block only") { d { y + 1 } }
end
Calculating -------------------------------------
yield only 61.039k i/100ms
yield with &block 34.155k i/100ms
mixed with &block 32.514k i/100ms
&block only 32.829k i/100ms
-------------------------------------------------
yield only 2.188M (± 6.9%) i/s - 10.926M
yield with &block 710.589k (± 4.8%) i/s - 3.552M
mixed with &block 653.130k (± 3.7%) i/s - 3.284M
&block only 611.241k (± 3.6%) i/s - 3.053M
Already merged @xaviershay's fix.
@alexdowad I'm not sure if you want to revisit this yet, but I was playing with this while working on some benchmarks in a downstream project and pushed up a branch.
I wasn't able to confirm that this has a real impact on performance (my benchmark/ips
results vary too much), but raw calls to block.call
do still seem as much as 4x slower than yield
on Ruby 2.3.1.
@no-reply Interesting. I remember reading (can't find the source) that actually adding a &blk
parameter to a method slows things down, because Ruby converts the block into a Proc object (which is later #call
-ed).
So, supposedly, the fastest would be to eliminate &blk
parameter and use yield
(and maybe block_given?
to check if we have a block at all. Of course this need benchmarking and might change in different Rubies. Here's a simple benchmark (which probably means wrong benchmark :-) ) on my machine:
require 'benchmark/ips'
def f1(&blk)
blk.call
end
def f2(&blk)
yield
end
def f3
yield
end
def f4
yield if block_given?
end
Benchmark.ips do |b|
b.report("blk.call") { f1 { 100 * 100 } }
b.report("blk and yield") { f2 { 100 * 100 } }
b.report("yield") { f3 { 100 * 100 } }
b.report("block_given? and yield (with block)") { f4 { 100 * 100 } }
b.report("block_given? and yield (without block)") { f4 }
end
Results:
blk.call 1.779M (± 2.5%) i/s - 8.955M
blk and yield 2.215M (± 3.0%) i/s - 11.081M
yield 6.639M (± 3.2%) i/s - 33.206M
block_given? and yield (with block)
4.570M (± 2.4%) i/s - 22.854M
block_given? and yield (without block)
6.677M (± 2.4%) i/s - 33.451M
So removing the &blk
parameters improved from 2.2M i/s to 6.6M i/s. Get rid of those &blk
! :-)
Oops just saw I'm repeating the benchmark already above on this thread.... Too late I guess. Sorry.
@dubek :+1: My branch linked above now has a much more complete (if somewhat slapdash) rework of &block
usage throughout.
I'll open a new PR so the discussion can happen somewhere other than the bottom of a closed ticket.
The following segfaults on master (not released 1.0.0 gem) 2.2.2 and above. Can be seen in travis CI.
I am investigating.