oracle / truffleruby

A high performance implementation of the Ruby programming language, built on GraalVM.
https://www.graalvm.org/ruby/
Other
3.02k stars 185 forks source link

block_given? issue when called via C in Nokogiri #2675

Closed chrisseaton closed 2 years ago

chrisseaton commented 2 years ago
Error:
Nokogiri::HTML4::DocumentFragment::parse options::HTML4::DocumentFragment.parse#test_0003_takes a config block:
NoMethodError: undefined method `strict?' for nil:NilClass
    /__w/nokogiri/nokogiri/test/html4/test_document_fragment.rb:376:in `test_0003_takes a config block'

This seems to be caused by a block_given? failing due to a method being very distant from where it's logically passed the block due C code in between.

BlockGivenNode here seems to be getting the frame of /Users/chrisseaton/Documents/nokogiri/lib/nokogiri/html4/document_fragment.rb:27, which looks right, but indeed it doesn't have a block. How should rb_obj_call_init be passing it, and how should xml_document_fragment.c#new be passing it?

/Users/chrisseaton/Documents/nokogiri/lib/nokogiri/html4/document_fragment.rb:27:in `initialize'
/Users/chrisseaton/Documents/graalvm/graal/sdk/mxbuild/darwin-aarch64/GRAALVM_B4632A0AE8_JAVA11/graalvm-b4632a0ae8-java11-22.2.0-dev/Contents/Home/languages/ruby/lib/truffle/truffle/cext.rb:1221:in `rb_obj_call_init'
object.c:128:in `rb_obj_call_init'
/Users/chrisseaton/Documents/nokogiri/ext/nokogiri/xml_document_fragment.c:29:in `new'
/Users/chrisseaton/Documents/graalvm/graal/sdk/mxbuild/darwin-aarch64/GRAALVM_B4632A0AE8_JAVA11/graalvm-b4632a0ae8-java11-22.2.0-dev/Contents/Home/languages/ruby/lib/truffle/truffle/cext_ruby.rb:41:in `new'
/Users/chrisseaton/Documents/nokogiri/lib/nokogiri/html4/document_fragment.rb:24:in `parse'
test.rb:7:in `<main>'

Reproduce with

require "nokogiri"

input = "<div>foo</div"

GC.start # then break on BlockGivenNode

Nokogiri::HTML4::DocumentFragment.parse(input) do |config|
  puts 'here'
end

@flavorjones

aardvark179 commented 2 years ago

CRuby has a macro PASS_PASSED_BLOCK_HANDLER() which it uses explicitly in a couple of places to pass the block. I'll add a spec and refactor our code to do the same sort of thing.

aardvark179 commented 2 years ago

A fix for this has now been pushed to master.

chrisseaton commented 2 years ago

https://github.com/oracle/truffleruby/commit/a4e867a2bd66baa7824e317a9d9ae4eb83f4261a