mmtk / mmtk-ruby

Ruby binding for MMTk

FL_SEEN_OBJ_ID when running Immix #101

Closed wks closed 2 months ago

wks commented 2 months ago

The error occurs when running test-all with a release build and the Immix plan. See:

It doesn't seem to be related to the ObjectReference API change.

  [BUG] rb_gc_impl_object_id: FL_SEEN_OBJ_ID flag set but not found in table
  ruby 3.4.0dev (2024-09-02T01:18:44Z :detached: cd5e3561b6) +MMTk(Immix) [x86_64-linux]

  -- C level backtrace information -------------------------------------------
  TestAst#"test_all_tokens:test/rubygems/test_gem_commands_setup_command.rb" = 0.56 s = .
  TestThread#test_thread_variables = 0.00 s = .
  TestThread#test_priority = 0.88 s = .
  TestThread#test_handle_interrupt = 0.01 s = .
  TestThread#test_list = 0.12 s = .
  TestThread#test_thread_variable_frozen = 0.01 s = .
  TestThread#test_thread_local_dynamic_symbol = 0.00 s = .
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/ruby/build/ruby(rb_print_backtrace+0x14) [0x5603145bfe11] ../vm_dump.c:824
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/ruby/build/ruby(rb_vm_bugreport) ../vm_dump.c:1155
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/ruby/build/ruby(bug_report_end+0x0) [0x560314578200] ../error.c:1095
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/ruby/build/ruby(rb_bug_without_die) ../error.c:1095
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/ruby/build/ruby(die+0x0) [0x5603141b5e13] ../error.c:1103
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/ruby/build/ruby(rb_bug) ../error.c:1105
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/ruby/build/ruby(gc_verify_heap_page+0x0) [0x5603141a67b5] ../gc/default.c:1762
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/ruby/build/ruby(gc_verify_heap_pages_) ../gc/default.c:5965
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/ruby/build/ruby(rb_find_object_id+0xf) [0x5603141e2770] ../gc.c:1695
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/ruby/build/ruby(rb_obj_id) ../gc.c:1750
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/ruby/build/ruby(rb_mmtk_make_finalize_job) ../gc/default.c:3039
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/ruby/build/ruby(rb_mmtk_on_finalizer_table_delete) ../gc/default.c:10528
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/ruby/build/ruby(rb_mmtk_update_weak_table_migrate_each+0x85) [0x56031423e975] ../mmtk_support.c:929
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/ruby/build/ruby(apply_functor+0x13) [0x5603142f0795] ../st.c:1638
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/ruby/build/ruby(st_general_foreach) ../st.c:1548
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/ruby/build/ruby(rb_st_foreach) ../st.c:1645
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/ruby/build/ruby(rb_mmtk_update_weak_table+0x98) [0x56031423fb68] ../mmtk_support.c:1033
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/mmtk-ruby/mmtk/target/release/libmmtk_ruby.so(_ZN4mmtk9scheduler4work6GCWork17do_work_with_stat17h04152c7aba2f0330E+0x3f) [0x7fb766142e8f]
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/mmtk-ruby/mmtk/target/release/libmmtk_ruby.so(_ZN3std10sys_common9backtrace28__rust_begin_short_backtrace17h0022c5c0bf90c7c7E+0x9a3) [0x7fb76606e233]
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/mmtk-ruby/mmtk/target/release/libmmtk_ruby.so(_ZN4core3ops8function6FnOnce40call_once$u7b$$u7b$vtable.shim$u7d$$u7d$17h16e8858e648eb9bbE+0xab) [0x7fb7661342cb]
  /home/runner/work/mmtk-ruby/mmtk-ruby/git/mmtk-ruby/mmtk/target/release/libmmtk_ruby.so(_ZN3std3sys3pal4unix6thread6Thread3new12thread_start17h2770ac7f8882db09E+0x16) [0x7fb766220796]
  /lib/x86_64-linux-gnu/libc.so.6(0x7fb765c94ac3) [0x7fb765c94ac3]
  /lib/x86_64-linux-gnu/libc.so.6(0x7fb765d26850) [0x7fb765d26850]

wks commented 2 months ago

This bug is caused by the mmtk-ruby binding sometimes clearing the obj_to_id_table before processing the finalizer_table. Currently, the Rust part creates two work packets that process the obj_to_id_table and the finalizer_table in parallel, which is incorrect. While processing the finalizer_table, the binding needs to create "final jobs" for dead objects, and creating a final job requires looking up the dead object's ID in the obj_to_id_table. If the dead object's entry has already been removed from the obj_to_id_table, we will see an object that has the FL_SEEN_OBJ_ID flag set but has no entry in the obj_to_id_table.

The fix is to serialize the processing of those tables, i.e., process the finalizer_table first and only then process the obj_to_id_table.
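
To illustrate the ordering constraint, here is a minimal toy model in Rust. It is not the binding's actual code: `WeakTables`, `ObjRef`, and all three functions are hypothetical names, the tables are modelled as plain standard-library collections, and it assumes every object in the finalizer table has already had its ID taken (i.e. it would have FL_SEEN_OBJ_ID set).

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for an object address.
type ObjRef = usize;

/// Toy model of the two weak tables (not the real Ruby/MMTk data structures).
struct WeakTables {
    /// Models obj_to_id_table: object -> object ID.
    obj_to_id: HashMap<ObjRef, u64>,
    /// Models finalizer_table: objects with registered finalizers.
    finalizer: HashSet<ObjRef>,
}

/// Remove dead objects from the finalizer table and create one "final job"
/// (represented here by the object's ID) per dead object.
/// The ID lookup only succeeds if obj_to_id has not been cleaned up yet.
fn process_finalizer_table(
    t: &mut WeakTables,
    live: &HashSet<ObjRef>,
) -> Result<Vec<u64>, String> {
    let dead: Vec<ObjRef> = t
        .finalizer
        .iter()
        .copied()
        .filter(|o| !live.contains(o))
        .collect();
    let mut jobs = Vec::new();
    for obj in dead {
        t.finalizer.remove(&obj);
        let id = t
            .obj_to_id
            .get(&obj)
            .copied()
            .ok_or_else(|| format!("object {obj}: ID flag set but not found in table"))?;
        jobs.push(id);
    }
    jobs.sort(); // deterministic order, for the example only
    Ok(jobs)
}

/// Remove entries of dead objects from the object-to-ID table.
fn process_obj_to_id_table(t: &mut WeakTables, live: &HashSet<ObjRef>) {
    t.obj_to_id.retain(|obj, _| live.contains(obj));
}

/// The serialized order described above: finalizer table first,
/// object-to-ID table second.
fn process_weak_tables_serially(
    t: &mut WeakTables,
    live: &HashSet<ObjRef>,
) -> Result<Vec<u64>, String> {
    let jobs = process_finalizer_table(t, live)?;
    process_obj_to_id_table(t, live);
    Ok(jobs)
}
```

In this model, calling `process_obj_to_id_table` before `process_finalizer_table` makes the lookup for a dead object fail, which corresponds to the "flag set but not found in table" report. Running the two steps as parallel work packets allows that order to occur some of the time, which would explain why the failure is intermittent.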