oracle / truffleruby

A high performance implementation of the Ruby programming language, built on GraalVM.
https://www.graalvm.org/ruby/

Rare internal exception loading `google-protobuf` #3500

Closed ntkme closed 3 months ago

ntkme commented 3 months ago

This is happening extremely rarely when loading google-protobuf-4.26.0 on truffleruby+graalvm-24.0.0.

What's interesting is that it is thrown just by loading the cext with `require 'google/protobuf'`: https://github.com/protocolbuffers/protobuf/blob/v26.0/ruby/lib/google/protobuf_native.rb#L15

dead handle 0xbad000000018070 (com.oracle.truffle.api.CompilerDirectives.ShouldNotReachHere)
    from com.oracle.truffle.api.CompilerDirectives.shouldNotReachHere(CompilerDirectives.java:574)
    from com.oracle.truffle.api.CompilerDirectives.shouldNotReachHere(CompilerDirectives.java:520)
    from org.truffleruby.cext.UnwrapNode$UnwrapNativeNode.raiseError(UnwrapNode.java:107)
    from org.truffleruby.cext.UnwrapNode$UnwrapNativeNode.unwrapTaggedObject(UnwrapNode.java:92)
    from org.truffleruby.cext.UnwrapNodeGen$UnwrapNativeNodeGen$Inlined.execute(UnwrapNodeGen.java:377)
    from org.truffleruby.cext.UnwrapNode.longToWrapper(UnwrapNode.java:270)
    from org.truffleruby.cext.UnwrapNodeGen$Inlined.execute(UnwrapNodeGen.java:143)
    from org.truffleruby.cext.ValueWrapperManager$UnwrapperFunction.execute(ValueWrapperManager.java:401)
    from org.truffleruby.cext.UnwrapperFunctionGen$InteropLibraryExports$Cached.execute(UnwrapperFunctionGen.java:117)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNode$LLVMLookupDispatchForeignNode.doGeneric(LLVMDispatchNode.java:459)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNode$LLVMLookupDispatchForeignNode.doUnknownType(LLVMDispatchNode.java:487)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNodeGen$LLVMLookupDispatchForeignNodeGen.execute(LLVMDispatchNodeGen.java:1471)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNode.doForeignExecutable(LLVMDispatchNode.java:380)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNodeGen.executeDispatch(LLVMDispatchNodeGen.java:272)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMCallNode.doCall(LLVMCallNode.java:82)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMCallNodeGen.executeGeneric(LLVMCallNodeGen.java:37)
    from com.oracle.truffle.llvm.runtime.nodes.api.LLVMFrameNullerExpression.doGeneric(LLVMFrameNullerExpression.java:71)
    from com.oracle.truffle.llvm.runtime.nodes.api.LLVMFrameNullerExpressionNodeGen.executeGeneric(LLVMFrameNullerExpressionNodeGen.java:29)
    from com.oracle.truffle.llvm.runtime.nodes.vars.LLVMWriteNodeFactory$LLVMWritePointerNodeGen.execute_generic1(LLVMWriteNodeFactory.java:1370)
    from com.oracle.truffle.llvm.runtime.nodes.vars.LLVMWriteNodeFactory$LLVMWritePointerNodeGen.execute(LLVMWriteNodeFactory.java:1344)
    from com.oracle.truffle.llvm.runtime.nodes.base.LLVMBasicBlockNode$InitializedBlockNode.execute(LLVMBasicBlockNode.java:154)
    from com.oracle.truffle.llvm.runtime.nodes.control.LLVMDispatchBasicBlockNode.dispatchFromBasicBlock(LLVMDispatchBasicBlockNode.java:116)
    from com.oracle.truffle.llvm.runtime.nodes.control.LLVMDispatchBasicBlockNode.doDispatch(LLVMDispatchBasicBlockNode.java:87)
    from com.oracle.truffle.llvm.runtime.nodes.control.LLVMDispatchBasicBlockNodeGen.executeGeneric(LLVMDispatchBasicBlockNodeGen.java:33)
    from com.oracle.truffle.llvm.runtime.nodes.control.LLVMFunctionRootNode.doRun(LLVMFunctionRootNode.java:81)
    from com.oracle.truffle.llvm.runtime.nodes.control.LLVMFunctionRootNodeGen.executeGeneric(LLVMFunctionRootNodeGen.java:34)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMFunctionStartNode.execute(LLVMFunctionStartNode.java:102)
/home/runner/.rubies/truffleruby+graalvm-24.0.0/lib/truffle/truffle/cext.rb:2248:in `block in resolve_registered_addresses'
    from /home/runner/.rubies/truffleruby+graalvm-24.0.0/lib/truffle/truffle/cext.rb:2247:in `each'
    from /home/runner/.rubies/truffleruby+graalvm-24.0.0/lib/truffle/truffle/cext.rb:2247:in `resolve_registered_addresses'
    from /home/runner/.rubies/truffleruby+graalvm-24.0.0/lib/truffle/truffle/cext.rb:220:in `init_extension'
    from <internal:core> core/kernel.rb:229:in `gem_original_require'
    from <internal:/home/runner/.rubies/truffleruby+graalvm-24.0.0/lib/mri/rubygems/core_ext/kernel_require.rb>:37:in `require'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/google-protobuf-4.26.0/lib/google/protobuf_native.rb:15:in `<top (required)>'
    from <internal:core> core/kernel.rb:229:in `gem_original_require'
    from <internal:/home/runner/.rubies/truffleruby+graalvm-24.0.0/lib/mri/rubygems/core_ext/kernel_require.rb>:37:in `require'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/google-protobuf-4.26.0/lib/google/protobuf.rb:57:in `<module:Protobuf>'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/google-protobuf-4.26.0/lib/google/protobuf.rb:15:in `<module:Google>'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/google-protobuf-4.26.0/lib/google/protobuf.rb:14:in `<top (required)>'
    from <internal:core> core/kernel.rb:229:in `gem_original_require'
    from <internal:/home/runner/.rubies/truffleruby+graalvm-24.0.0/lib/mri/rubygems/core_ext/kernel_require.rb>:37:in `require'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/ext/sass/embedded_sass_pb.rb:5:in `<top (required)>'
    from <internal:core> core/kernel.rb:292:in `require_relative'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/lib/sass/embedded_protocol.rb:6:in `<module:EmbeddedProtocol>'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/lib/sass/embedded_protocol.rb:5:in `<module:Sass>'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/lib/sass/embedded_protocol.rb:3:in `<top (required)>'
    from <internal:core> core/kernel.rb:292:in `require_relative'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/lib/sass/compiler.rb:11:in `<top (required)>'
    from <internal:core> core/kernel.rb:292:in `require_relative'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/lib/sass/embedded.rb:3:in `<top (required)>'
    from <internal:core> core/kernel.rb:292:in `require_relative'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/lib/sass-embedded.rb:4:in `<top (required)>'
    from <internal:core> core/kernel.rb:229:in `gem_original_require'
    from <internal:/home/runner/.rubies/truffleruby+graalvm-24.0.0/lib/mri/rubygems/core_ext/kernel_require.rb>:37:in `require'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/spec/spec_helper.rb:3:in `<top (required)>'
    from <internal:core> core/kernel.rb:229:in `gem_original_require'
    from <internal:/home/runner/.rubies/truffleruby+graalvm-24.0.0/lib/mri/rubygems/core_ext/kernel_require.rb>:37:in `require'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/spec/sass/compile_error_spec.rb:3:in `<top (required)>'
    from <internal:core> core/kernel.rb:378:in `load'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/rspec-core-3.13.0/lib/rspec/core/configuration.rb:2138:in `load_file_handling_errors'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/rspec-core-3.13.0/lib/rspec/core/configuration.rb:1638:in `block in load_spec_files'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/rspec-core-3.13.0/lib/rspec/core/configuration.rb:1636:in `each'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/rspec-core-3.13.0/lib/rspec/core/configuration.rb:1636:in `load_spec_files'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/rspec-core-3.13.0/lib/rspec/core/runner.rb:102:in `setup'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/rspec-core-3.13.0/lib/rspec/core/runner.rb:86:in `run'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/rspec-core-3.13.0/lib/rspec/core/runner.rb:71:in `run'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/rspec-core-3.13.0/lib/rspec/core/runner.rb:45:in `invoke'
    from /home/runner/work/sass-embedded-host-ruby/sass-embedded-host-ruby/vendor/bundle/truffleruby/3.2.2.24.0.0.2/gems/rspec-core-3.13.0/exe/rspec:4:in `<main>'
ntkme commented 3 months ago

So far this has happened twice on the GitHub Actions x86_64 macOS runner. I have tried to reproduce it locally on arm64 macOS, and could not after running for a full day. I guess this issue might be tied to the specific architecture and how the cext was compiled on that architecture.
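For anyone else trying to reproduce a rare load-time failure like this, a minimal sketch of a retry loop is below. The helper name `first_failure` and the attempt count are assumptions made for illustration, not part of any tool mentioned in this thread. It starts a fresh process per attempt, since the exception is raised while the extension is being loaded.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper: runs `command` (an argv array) up to `max_attempts`
# times, each time in a fresh process. Returns [attempt_number, output] for
# the first run that exits unsuccessfully, or nil if every run succeeded.
def first_failure(command, max_attempts)
  1.upto(max_attempts) do |attempt|
    # capture2e merges stdout and stderr, so the backtrace is kept too.
    output, status = Open3.capture2e(*command)
    return [attempt, output] unless status.success?
  end
  nil
end

# Example invocation (assumes google-protobuf is installed for this Ruby):
#   command = [RbConfig.ruby, "-e", "require 'google/protobuf'"]
#   attempt, output = first_failure(command, 1_000)
#   puts(attempt ? "failed on attempt #{attempt}:\n#{output}" : "no failure")
```

Running it under the same Ruby and architecture as the CI job matters here, given that the failure has only been seen on x86_64 macOS so far.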

ntkme commented 3 months ago

https://github.com/sass-contrib/sass-embedded-host-ruby/actions/runs/8387041233/job/22968472183

eregon commented 3 months ago

I assume tag v26.0 is the same as the gem 4.26.0?

This is the `Init_` function of protobuf: https://github.com/protocolbuffers/protobuf/blob/v26.0/ruby/ext/google/protobuf_c/protobuf.c#L336-L356

I don't see any immediate use of a Float or a non-fixnum long in there, so it doesn't seem to have the same cause as #3478.

However, the fact that it happens in `resolve_registered_addresses` makes me think it might be related, and in fact the first change in lib/truffle/truffle/cext.rb of https://github.com/oracle/truffleruby/commit/f40067b3c7857a2a0377d42b2e3fa540006f2b1a#diff-f48865e98d79bb21eb524e1ff1d15b468fa1622fa9b65d4b83471e1d4b8a9486 might fix this. Specifically, the comment on line 222 explains that we now resolve while all preserved objects are still held, which should be safer. Before that change, I think some handles could possibly have been GC'd in between.
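To make the suspected ordering problem concrete, here is a toy model in plain Ruby. It is only a sketch of the hypothesis above, not TruffleRuby's actual handle management: `ToyHandleTable`, its methods, and the handle values are all invented for illustration, and `release_all` stands in for the preserved objects being dropped and collected.

```ruby
# Toy model only; this is NOT how TruffleRuby implements native handles.
class ToyHandleTable
  DeadHandle = Class.new(StandardError)

  def initialize
    @live = {}     # handle => object, valid only while the object is preserved
    @next = 0x1000 # arbitrary starting value for made-up handles
  end

  # Hands out a handle and keeps the object reachable until release_all.
  def wrap(object)
    handle = (@next += 8)
    @live[handle] = object
    handle
  end

  # Stands in for the preserved objects being released and garbage collected.
  def release_all
    @live.clear
  end

  # Resolving a handle whose object is gone fails, like "dead handle" above.
  def unwrap(handle)
    @live.fetch(handle) { raise DeadHandle, format("dead handle 0x%x", handle) }
  end
end

table = ToyHandleTable.new
registered = [table.wrap("a"), table.wrap(:b)]

# Safe order: resolve every registered handle while all objects are preserved.
resolved = registered.map { |handle| table.unwrap(handle) }
table.release_all

# Unsafe order: resolving after the objects were released raises.
late_error =
  begin
    registered.each { |handle| table.unwrap(handle) }
    nil
  rescue ToyHandleTable::DeadHandle => e
    e.message
  end
```

In the toy model the failure is deterministic. In the real runtime it would depend on whether a GC happens in the window, which would fit how rarely this shows up.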

So this might be solved on truffleruby master.

ntkme commented 3 months ago

> I assume tag v26.0 is the same as the gem 4.26.0?

Yes, more or less the same thing: 26.0 is the protoc compiler version and 4.26.0 is the Ruby gem version.

ntkme commented 3 months ago

Seems to be fixed by truffleruby-24.0.1. Closing for now; I will reopen if it reproduces again.