rubys / nokogumbo

A Nokogiri interface to the Gumbo HTML5 parser.
Apache License 2.0
186 stars 114 forks source link

Register all global VALUES as mark objects #150

Closed jhawthorn closed 3 years ago

jhawthorn commented 3 years ago

Previously it was possible for these objects to be garbage collected (the constants could be unassigned in Ruby) or moved as part of GC compaction (reproducible with GC.verify_compaction_references).

This commit marks all the global variables in nokogumbo.c using rb_gc_register_mark_object to ensure that they can't be moved.

To test this, I added GC.verify_compaction_references(toward: :empty, double_heap: true) after the requires in test/test_nokogumbo.rb. This guarantees that as many objects are moved as possible.

Before

/Users/jhawthorn/src/nokogumbo/lib/nokogumbo/html5/document.rb:47: [BUG] Segmentation fault at 0x0000000000000010
ruby 2.7.0p0 (2019-12-25 revision 647ee6f091) [x86_64-darwin19]

-- Crash Report log information --------------------------------------------
   See Crash Report log file under the one of following:
     * ~/Library/Logs/DiagnosticReports
     * /Library/Logs/DiagnosticReports
   for more details.
Don't forget to include the above Crash Report log file in bug reports.

-- Control frame information -----------------------------------------------
c:0027 p:---- s:0165 e:000164 CFUNC  :parse
c:0026 p:0123 s:0156 e:000155 METHOD /Users/jhawthorn/src/nokogumbo/lib/nokogumbo/html5/document.rb:47
c:0025 p:0135 s:0143 e:000142 METHOD /Users/jhawthorn/src/nokogumbo/lib/nokogumbo/html5/document.rb:18
c:0024 p:0035 s:0134 e:000133 METHOD /Users/jhawthorn/src/nokogumbo/lib/nokogumbo/html5.rb:22
c:0023 p:0019 s:0125 e:000124 METHOD /Users/jhawthorn/src/nokogumbo/test/test_null.rb:85
------------------------------------8<-------------------------------------

-- C level backtrace information -------------------------------------------
/Users/jhawthorn/.rubies/ruby-2.7.0/bin/ruby(rb_vm_bugreport+0x96) [0x1066f79b6]
/Users/jhawthorn/.rubies/ruby-2.7.0/bin/ruby(rb_bug_for_fatal_signal+0x1e2) [0x106522042]
/Users/jhawthorn/.rubies/ruby-2.7.0/bin/ruby(sigsegv+0x5b) [0x10665220b]
/usr/lib/system/libsystem_platform.dylib(_sigtramp+0x1d) [0x7fff6c03f5fd]
/Users/jhawthorn/.rubies/ruby-2.7.0/bin/ruby(method_entry_get+0xb4) [0x1066d4c84]
/Users/jhawthorn/.rubies/ruby-2.7.0/bin/ruby(rb_callable_method_entry+0x29) [0x1066c9ad9]
/Users/jhawthorn/.rubies/ruby-2.7.0/bin/ruby(vm_search_method+0x1b7) [0x1066db2a7]
/Users/jhawthorn/.rubies/ruby-2.7.0/bin/ruby(rb_funcallv_with_cc+0x3f) [0x1066d627f]
/Users/jhawthorn/.rubies/ruby-2.7.0/bin/ruby(rb_obj_as_string+0x46) [0x10666c026]
/Users/jhawthorn/.rubies/ruby-2.7.0/bin/ruby(ruby__sfvextra+0x126) [0x106656746]
/Users/jhawthorn/.rubies/ruby-2.7.0/bin/ruby(BSD_vfprintf+0x138e) [0x106657c8e]
/Users/jhawthorn/.rubies/ruby-2.7.0/bin/ruby(rb_enc_vsprintf+0xb9) [0x106656409]
/Users/jhawthorn/.rubies/ruby-2.7.0/bin/ruby(rb_sprintf+0xa0) [0x10665a4b0]
/Users/jhawthorn/.rubies/ruby-2.7.0/bin/ruby(unexpected_type+0xcf) [0x10652272f]
/Users/jhawthorn/.rubies/ruby-2.7.0/bin/ruby(rb_unexpected_type+0x2a) [0x1065213da]
/Users/jhawthorn/.rubies/ruby-2.7.0/bin/ruby(rb_data_object_wrap+0x1aa) [0x10653d7fa]
/Users/jhawthorn/.gem/ruby/2.7.0/gems/nokogiri-1.10.10/lib/nokogiri/nokogiri.bundle(Nokogiri_wrap_xml_document+0x3e) [0x107d9546e]
/Users/jhawthorn/src/nokogumbo/lib/nokogumbo/nokogumbo.bundle(parse_continue+0x9d) [0x10800678d]
/Users/jhawthorn/.rubies/ruby-2.7.0/bin/ruby(rb_ensure+0xf0) [0x10652e540]
/Users/jhawthorn/src/nokogumbo/lib/nokogumbo/nokogumbo.bundle(parse+0xce) [0x108005f3e]
------------------------------------8<-------------------------------------

After

Run options: --seed 37771

# Running:

.....S...........................................S....S......S..................S..................S...............S..........S..............................S.........S........................................................................................................................................................................................................................................................................................................................................................................................................

Finished in 0.613078s, 913.4237 runs/s, 1355.4556 assertions/s.

560 runs, 831 assertions, 0 failures, 0 errors, 10 skips

You have skipped tests. Run with --verbose for details.
stevecheckoway commented 3 years ago

Hi,

Thanks for this patch! I'm a little surprised that Ruby's garbage collector doesn't treat global variables as GC roots.

Does Nokogiri have the same issue? E.g., here.

Is this a security issue? E.g., if one of these gets garbage collected, can subsequent calls be exploited to do anything other than crash?

jhawthorn commented 3 years ago

Does Nokogiri have the same issue? E.g., here.

rb_define_module_under implicitly marks the object. So constants defined by C are usually GC roots, but the same isn't true for Ruby objects. I suspect only Nokogiri::HTML5::Document really needs this, but I figured we should mark any constants we got from rb_const_get (since this gem didn't create them we shouldn't assume).

stevecheckoway commented 3 years ago

Makes sense.

stevecheckoway commented 3 years ago

Thanks so much for this fix!

jhawthorn commented 3 years ago

Thank you!

stevecheckoway commented 3 years ago

I just released a new version with your fix. It's available on rubygems. Thanks again!