oracle / truffleruby

A high performance implementation of the Ruby programming language, built on GraalVM.
https://www.graalvm.org/ruby/

Problems compiling the rice gem #3357

Open mtortonesi opened 10 months ago

mtortonesi commented 10 months ago

I am trying to get the rice gem working with truffleruby. I know that rice leverages MRI internals, but I wanted to check whether it could be made to work with truffleruby with a moderate development effort.

Unfortunately, before I could even identify which MRI internal functions required by rice are missing in truffleruby, I got some very weird build errors. It seems that the linker does not find functions that should be in the truffleruby dynamic library. Here is an example of the (many) errors I get:

ld64.lld: error: undefined symbol: rb_type
>>> referenced by /Users/mauro/code/test/truffleruby-rice/rice/test/embed_ruby.cpp
>>>               /tmp/lto.tmp:(symbol test__default_construct()+0x80)
>>> referenced by /Users/mauro/code/test/truffleruby-rice/rice/test/embed_ruby.cpp
>>>               /tmp/lto.tmp:(symbol Rice::Module::Module(unsigned long)+0x40)
>>> referenced by /Users/mauro/code/test/truffleruby-rice/rice/test/embed_ruby.cpp
>>>               /tmp/lto.tmp:(symbol Rice::Module::Module(unsigned long)+0x30)
>>> referenced 1694 more times

ld64.lld: error: too many errors emitted, stopping now (use --error-limit=0 to see all errors)
clang-16: error: linker command failed with exit code 1 (use -v to see invocation)

I checked with nm and most of the symbols that the linker complains about are apparently in the truffleruby dynamic library:

~❯ nm /Users/mauro/code/git/truffleruby-ws/truffleruby/mxbuild/truffleruby-jvm-ce/languages/ruby/lib/cext/libtruffleruby.dylib | grep rb_type
000000000000dba4 T _rb_type
00000000000053bc T _rb_typeddata_inherited_p
00000000000053dc T _rb_typeddata_is_kind_of

Here is what I did, in detail:

I compiled truffleruby using the instructions in the Truffleruby Contributor Workflow document:

cd ~/code/git
mkdir truffleruby-ws
cd truffleruby-ws
git clone https://github.com/oracle/truffleruby.git
cd truffleruby
alias jt="$PWD/bin/jt"
jt build --env jvm-ce

I then installed the new build of truffleruby in rbenv for convenience:

cd ~/.rbenv/versions
ln -s "$HOME/code/git/truffleruby-ws/truffleruby/mxbuild/truffleruby-jvm-ce/languages/ruby" truffleruby-jvm-ce

I downloaded rice and I set up rbenv to use truffleruby-jvm-ce:

cd ~/code/git
git clone https://github.com/jasonroelofs/rice.git
cd rice
rbenv local truffleruby-jvm-ce

I then tried to build rice:

bundle exec rake test

I am using the system compiler (clang) on a 2020 M1 MacBook Pro:

> clang --version
Apple clang version 15.0.0 (clang-1500.0.40.1)
Target: arm64-apple-darwin23.1.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

eregon commented 10 months ago

Have you tried installing the gem? That works fine for me on Linux:

$ ruby -v
truffleruby 24.0.0-dev-5a199d82, like ruby 3.2.2, GraalVM CE Native [x86_64-linux]
$ gem i rice
Fetching rice-4.1.0.gem
Successfully installed rice-4.1.0
1 gem installed

And it also works fine when using truffleruby-jvm built locally.

I tried to bundle exec rake test in a clone of rice, but that fails, even on CRuby with:

$ be rake test                 
cd /home/eregon/code/rice/test
make: *** No rule to make target 'clean'.  Stop.
rake aborted!
Failed
/home/eregon/code/rice/Rakefile:12:in `block in run_command'
/home/eregon/code/rice/Rakefile:8:in `run_command'
/home/eregon/code/rice/Rakefile:46:in `block (3 levels) in <top (required)>'
/home/eregon/code/rice/Rakefile:45:in `block (2 levels) in <top (required)>'
/home/eregon/code/rice/Rakefile:43:in `each'
/home/eregon/code/rice/Rakefile:43:in `block in <top (required)>'
/home/eregon/code/rice/vendor/bundle/ruby/3.2.0/gems/rake-13.1.0/exe/rake:27:in `<top (required)>'
/home/eregon/.rubies/ruby-3.2.2/bin/bundle:25:in `load'
/home/eregon/.rubies/ruby-3.2.2/bin/bundle:25:in `<main>'
Tasks: TOP => test => test_cpp => clean
(See full trace by running task with --trace)

I would recommend trying with a build of truffleruby-dev, e.g. using ruby-build as described in the README. At least that will rule out any local-build-of-truffleruby issue.

eregon commented 10 months ago

According to https://github.com/jasonroelofs/rice/blob/master/.github/workflows/testing.yml, one needs:

bundle exec rake headers
bundle exec rake build
bundle exec rake test

(that's rather unusual, but anyway)

Using truffleruby 24.0.0-dev-5a199d82, like ruby 3.2.2, GraalVM CE Native [x86_64-linux],

$ be rake headers
Building rice.hpp
Building stl.hpp
Success

$ be rake build  
Building rice.hpp
Building stl.hpp
Success
cd /home/eregon/code/rice/test
checking for rice/rice.hpp in /home/eregon/code/rice/include... yes
checking for -lstdc++... yes
creating Makefile
compiling embed_ruby.cpp
compiling test_Address_Registration_Guard.cpp
compiling test_Array.cpp
compiling test_Attribute.cpp
compiling test_Builtin_Object.cpp
compiling test_Class.cpp
compiling test_Constructor.cpp
compiling test_Data_Object.cpp
In file included from test_Builtin_Object.cpp:1:
test_Builtin_Object.cpp: In function ‘void test__arrow()’:
test_Builtin_Object.cpp:83:37: error: ‘struct RBasic’ has no member named ‘klass’
   83 |   ASSERT_EQUAL(rb_cObject, b->basic.klass);
      |                                     ^~~~~
unittest.hpp:270:24: note: in definition of macro ‘ASSERT_EQUAL’
  270 |     assert_equal((x), (y), #x, #y, __FILE__, __LINE__); \
      |                        ^
make: *** [Makefile:478: test_Builtin_Object.o] Error 1
make: *** Waiting for unfinished jobs....
rake aborted!
Failed

So the first issue there is the usage of RBasic->klass, which should be replaced by the RBASIC_CLASS macro.

eregon commented 10 months ago

If I comment out that line in test_Builtin_Object.cpp, then bundle exec rake build continues and indeed the linker fails with a lot of output like:

/home/eregon/code/rice/include/rice/rice.hpp:4220:47: warning: function may return address of local variable [-Wreturn-local-addr]
 4220 |       return From_Ruby<Class_T>().convert(self);
      |                                               ^
/home/eregon/code/rice/include/rice/rice.hpp:4220:14: note: declared here
 4220 |       return From_Ruby<Class_T>().convert(self);
      |              ^~~~~~~~~~~~~~~~~~~~
linking executable unittest
/usr/bin/ld: embed_ruby.o: in function `embed_ruby()':
/home/eregon/code/rice/test/embed_ruby.cpp:14: undefined reference to `ruby_sysinit'
/usr/bin/ld: /home/eregon/code/rice/test/embed_ruby.cpp:15: undefined reference to `ruby_init'
/usr/bin/ld: /home/eregon/code/rice/test/embed_ruby.cpp:16: undefined reference to `ruby_init_loadpath'
/usr/bin/ld: embed_ruby.o: in function `Rice::Exception::what() const':
/home/eregon/.rubies/truffleruby-dev/lib/cext/include/ruby/internal/symbol.h:279: undefined reference to `rb_intern2'
/usr/bin/ld: embed_ruby.o: in function `Rice::Exception::what() const':
/home/eregon/code/rice/include/rice/rice.hpp:3778: undefined reference to `rb_funcallv'
/usr/bin/ld: embed_ruby.o: in function `Rice::Exception::what() const':
/home/eregon/.rubies/truffleruby-dev/lib/cext/include/ruby/internal/core/rstring.h:494: undefined reference to `rb_tr_str_len'
/usr/bin/ld: /home/eregon/.rubies/truffleruby-dev/lib/cext/include/ruby/internal/core/rstring.h:512: undefined reference to `rb_tr_rstring_ptr'
/usr/bin/ld: test_Address_Registration_Guard.o: in function `Rice::detail::RubyFunction<void (*)(void (*)(unsigned long), unsigned long), void (*)(unsigned long), unsigned long>::RubyFunction(void (*)(void (*)(unsigned long), unsigned long), void (* const&)(unsigned long), unsigned long const&)':
/home/eregon/code/rice/include/rice/rice.hpp:447: undefined reference to `rb_set_end_proc'
/usr/bin/ld: test_Address_Registration_Guard.o: in function `Rice::detail::RubyFunction<void (*)(void (*)(unsigned long), unsigned long), void (*)(unsigned long), unsigned long>::operator()()':
/home/eregon/code/rice/include/rice/rice.hpp:485: undefined reference to `rb_protect'
/usr/bin/ld: test_Address_Registration_Guard.o: in function `Rice::detail::RubyFunction<void (*)(unsigned long*), unsigned long*>::RubyFunction(void (*)(unsigned long*), unsigned long* const&)':
/home/eregon/code/rice/include/rice/rice.hpp:447: undefined reference to `rb_gc_register_address'
/usr/bin/ld: test_Address_Registration_Guard.o: in function `Rice::detail::RubyFunction<void (*)(unsigned long*), unsigned long*>::operator()()':
/home/eregon/code/rice/include/rice/rice.hpp:485: undefined reference to `rb_protect'
/usr/bin/ld: test_Address_Registration_Guard.o: in function `Rice::detail::RubyFunction<void (*)(unsigned long*), unsigned long*>::RubyFunction(void (*)(unsigned long*), unsigned long* const&)':
/home/eregon/code/rice/include/rice/rice.hpp:447: undefined reference to `rb_gc_unregister_address'
/usr/bin/ld: test_Address_Registration_Guard.o: in function `Rice::detail::RubyFunction<void (*)(unsigned long*), unsigned long*>::operator()()':
/home/eregon/code/rice/include/rice/rice.hpp:485: undefined reference to `rb_protect'
/usr/bin/ld: /home/eregon/code/rice/include/rice/rice.hpp:497: undefined reference to `rb_errinfo'
/usr/bin/ld: /home/eregon/code/rice/include/rice/rice.hpp:500: undefined reference to `rb_set_errinfo'
/usr/bin/ld: /home/eregon/code/rice/include/rice/rice.hpp:497: undefined reference to `rb_errinfo'
/usr/bin/ld: /home/eregon/code/rice/include/rice/rice.hpp:500: undefined reference to `rb_set_errinfo'
/usr/bin/ld: test_Address_Registration_Guard.o: in function `Rice::detail::RubyFunction<void (*)(void (*)(unsigned long), unsigned long), void (*)(unsigned long), unsigned long>::RubyFunction(void (*)(void (*)(unsigned long), unsigned long), void (* const&)(unsigned long), unsigned long const&)':
...

ruby_sysinit/ruby_init/ruby_init_loadpath are functions for embedding CRuby, i.e., not running the bin/ruby executable but embedding the interpreter in another program. Those are not implemented in TruffleRuby (maybe they could be, but it seems a big effort for something almost never used, and there is already a way to embed TruffleRuby in C/C++ using JNI). From the Rice README:

Rice is a C++ header-only library that serves dual purposes. First, it makes it much easier to create Ruby bindings for existing C++ libraries. Second, it provides an object oriented interface to Ruby's C API that makes it easy to embed Ruby and write Ruby extensions in C++.

I don't think embedding Ruby using Rice is something used much, so I would suggest ignoring that part and focusing on what Rice is mainly used for: C++ extensions for Ruby.

BTW, what do you want to use Rice for on truffleruby? Some specific gem using Rice I guess, which one?

Regarding the missing symbols which TruffleRuby does implement like rb_intern2/rb_funcallv/rb_tr_str_len, that is likely because libtrufflerubytrampoline.{so,dylib} is not part of the linker command. Probably due to this: https://github.com/jasonroelofs/rice/blob/b7bae38b70d237acdc1724191f5ccf3113212a00/test/extconf.rb#L5-L7 So it seems Rice tests rely on embedding Ruby and creating an executable that inside starts an embedded Ruby. I think that's an overly complicated setup and is likely not the best way to get Rice to work on TruffleRuby.
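
To separate the two kinds of failure described above, the linker output can be sorted into embedding-API symbols and everything else. This is only a rough sketch: the helper names are hypothetical, the embedding list contains just the three functions named in this comment, and the pattern assumes GNU ld's "undefined reference to" wording (lld on macOS words its errors differently).

```ruby
# Embedding entry points named in this thread as not implemented in TruffleRuby.
EMBEDDING_API = %w[ruby_sysinit ruby_init ruby_init_loadpath].freeze

# Collect the unique symbol names from GNU ld "undefined reference" lines.
def undefined_references(linker_output)
  linker_output.scan(/undefined reference to `([^']+)'/).flatten.uniq
end

# Split symbols into embedding-API ones and all the others.
def classify(symbols)
  embedding, other = symbols.partition { |name| EMBEDDING_API.include?(name) }
  { embedding: embedding, other: other }
end

# Sample lines taken from the linker output quoted above.
LOG = <<~'LOG'
  /home/eregon/code/rice/test/embed_ruby.cpp:14: undefined reference to `ruby_sysinit'
  /usr/bin/ld: /home/eregon/code/rice/test/embed_ruby.cpp:15: undefined reference to `ruby_init'
  /home/eregon/code/rice/include/rice/rice.hpp:3778: undefined reference to `rb_funcallv'
  /home/eregon/code/rice/include/rice/rice.hpp:485: undefined reference to `rb_protect'
  /home/eregon/code/rice/include/rice/rice.hpp:485: undefined reference to `rb_protect'
LOG

p classify(undefined_references(LOG))
```

Symbols in the `:other` group are the ones worth checking against the link command, since they may be implemented but simply not linked in.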

So instead I would suggest focusing on a specific gem that uses Rice, and on the errors/issues that come up when installing or using it on TruffleRuby.

mtortonesi commented 10 months ago

First of all, thank you so very much for your kind response (and your time).

Second, thank you for making me realize I was thoughtlessly and unnecessarily banging my head against the wall. Of course, as you correctly guessed, I am not interested in making Truffleruby embeddable like MRI for the purpose of running the rice test set. My objective is instead to get the torch.rb gem working on Truffleruby. I think Truffleruby has a great potential for a lot of ML use cases and it would be nice being able to run what is arguably the leading ML library on top of it.

After your feedback, I created a new setup with a simple script using torch.rb. Doing so, I realized that the first step towards getting torch.rb working on Truffleruby was to implement the rb_frame_method_id_and_class internal. So I did just that and submitted PR #3363. The patch seems to work for me. However, I am not sure that my code is 100% correct, and I am not sure what kind of spec, if any, one could add to test it effectively. Could I please ask you to take a look?

mtortonesi commented 10 months ago

I might have spoken too soon. It looks like torch.rb has other issues in addition to the lack of rb_frame_method_id_and_class internal API.

For instance, take a look at this:

~/code/test/diffusion.rb > bundle exec ruby -e "require 'torch'; puts Torch::CUDA.available?"

truffleruby: an internal exception escaped out of the interpreter,
please report it to https://github.com/oracle/truffleruby/issues.

dead handle 0x294fffff8 (com.oracle.truffle.api.CompilerDirectives.ShouldNotReachHere)
    from com.oracle.truffle.api.CompilerDirectives.shouldNotReachHere(CompilerDirectives.java:574)
    from com.oracle.truffle.api.CompilerDirectives.shouldNotReachHere(CompilerDirectives.java:520)
    from org.truffleruby.cext.UnwrapNode$UnwrapNativeNode.raiseError(UnwrapNode.java:107)
    from org.truffleruby.cext.UnwrapNode$UnwrapNativeNode.unwrapTaggedObject(UnwrapNode.java:92)
    from org.truffleruby.cext.UnwrapNodeGen$UnwrapNativeNodeGen$Inlined.execute(UnwrapNodeGen.java:377)
    from org.truffleruby.cext.UnwrapNode.longToWrapper(UnwrapNode.java:270)
    from org.truffleruby.cext.UnwrapNodeGen$Inlined.execute(UnwrapNodeGen.java:143)
    from org.truffleruby.cext.ValueWrapperManager$UnwrapperFunction.execute(ValueWrapperManager.java:407)
    from org.truffleruby.cext.UnwrapperFunctionGen$InteropLibraryExports$Cached.execute(UnwrapperFunctionGen.java:123)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNode$LLVMLookupDispatchForeignNode.doGeneric(LLVMDispatchNode.java:459)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNode$LLVMLookupDispatchForeignNode.doUnknownType(LLVMDispatchNode.java:487)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNodeGen$LLVMLookupDispatchForeignNodeGen.executeAndSpecialize(LLVMDispatchNodeGen.java:1664)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNodeGen$LLVMLookupDispatchForeignNodeGen.execute(LLVMDispatchNodeGen.java:1487)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNode.doForeignExecutable(LLVMDispatchNode.java:380)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNodeGen.executeAndSpecialize(LLVMDispatchNodeGen.java:723)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNodeGen.executeDispatch(LLVMDispatchNodeGen.java:305)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMCallNode.doCall(LLVMCallNode.java:82)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMCallNodeGen.executeGeneric(LLVMCallNodeGen.java:37)
    from com.oracle.truffle.llvm.runtime.nodes.api.LLVMFrameNullerExpression.doGeneric(LLVMFrameNullerExpression.java:71)
    from com.oracle.truffle.llvm.runtime.nodes.api.LLVMFrameNullerExpressionNodeGen.executeGeneric(LLVMFrameNullerExpressionNodeGen.java:29)
    from com.oracle.truffle.llvm.runtime.nodes.vars.LLVMWriteNodeFactory$LLVMWritePointerNodeGen.execute_generic1(LLVMWriteNodeFactory.java:1370)
    from com.oracle.truffle.llvm.runtime.nodes.vars.LLVMWriteNodeFactory$LLVMWritePointerNodeGen.execute(LLVMWriteNodeFactory.java:1344)
    from com.oracle.truffle.llvm.runtime.nodes.base.LLVMBasicBlockNode$InitializedBlockNode.execute(LLVMBasicBlockNode.java:154)
    from com.oracle.truffle.llvm.runtime.nodes.control.LLVMDispatchBasicBlockNode.dispatchFromBasicBlock(LLVMDispatchBasicBlockNode.java:116)
    from com.oracle.truffle.llvm.runtime.nodes.control.LLVMDispatchBasicBlockNode.doDispatch(LLVMDispatchBasicBlockNode.java:87)
    from com.oracle.truffle.llvm.runtime.nodes.control.LLVMDispatchBasicBlockNodeGen.executeGeneric(LLVMDispatchBasicBlockNodeGen.java:33)
    from com.oracle.truffle.llvm.runtime.nodes.control.LLVMFunctionRootNode.doRun(LLVMFunctionRootNode.java:81)
    from com.oracle.truffle.llvm.runtime.nodes.control.LLVMFunctionRootNodeGen.executeGeneric(LLVMFunctionRootNodeGen.java:34)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMFunctionStartNode.execute(LLVMFunctionStartNode.java:102)
/Users/mauro/code/test/diffusion.rb/vendor/bundle/truffleruby/3.2.2.4/gems/rice-4.1.0/include/rice/rice.hpp:1137:in `call'
    from /Users/mauro/code/git/truffleruby-ws/graal/sdk/mxbuild/darwin-aarch64/GRAALVM_F2FA4FF4E3_JAVA21/graalvm-f2fa4ff4e3-java21-24.0.0-dev/Contents/Home/languages/ruby/lib/truffle/truffle/cext_ruby.rb:42:in `available?'
    from -e:1:in `<main>'

I am willing to try to fix these bugs, but I wouldn't know where to start. Do you have any suggestions?

mtortonesi commented 10 months ago

I also tried to run the rice sample_callback example and ran into another issue (the lines before the exception come from $stderr.puts/System.err.println calls that I added to the code to try to understand what is going on):

~/c/g/mtortonesi/rice/s/callbacks master !2 > ruby test.rb
[...]
callWithCExtLockAndFrameAndUnwrapNode
receiver: 0x1048f48f0
args: [Ljava.lang.Object;@4fd44098
receivers: InteropLibraryGen.CachedDispatchFirst@1c4b9878
translateInteropExceptionNode: TranslateInteropExceptionNodeGen.Inlined@626dfe75
Truffle::Interop node: CExtNodesFactory.CallWithCExtLockAndFrameAndUnwrapNodeFactory.CallWithCExtLockAndFrameAndUnwrapNodeGen@531e9c96 at /Users/mauro/code/git/truffleruby-ws/graal/sdk/mxbuild/darwin-aarch64/GRAALVM_F2FA4FF4E3_JAVA21/graalvm-f2fa4ff4e3-java21-24.0.0-dev/Contents/Home/languages/ruby/lib/truffle/truffle/cext.rb:1305
Truffle::Interop receiver: 0x1048f48f0
Truffle::Interop arg: CallbackHolder
Truffle::Interop interop: InteropLibraryGen.CachedDispatchFirst@1c4b9878
Truffle::Interop translateInteropExceptionNode: TranslateInteropExceptionNodeGen.Inlined@626dfe75
LLVMSourceFunctionType::getSourceArgumentInformation(0)
LLVMSourceFunctionType::getSourceArgumentInformation sourceArgumentInformationList is null
LLVMSourceFunctionType::getSourceArgumentInformation returning null
LLVMForeignCallNode::create bitcodeParameterInfo: null
callWithCExtLockAndFrameAndUnwrapNode
receiver: 0x1048f4910
args: [Ljava.lang.Object;@4d57c66e
receivers: InteropLibraryGen.CachedDispatchFirst@44d06edc
translateInteropExceptionNode: TranslateInteropExceptionNodeGen.Inlined@626dfe75
Truffle::Interop node: CExtNodesFactory.CallWithCExtLockAndFrameAndUnwrapNodeFactory.CallWithCExtLockAndFrameAndUnwrapNodeGen@1e77807c at /Users/mauro/code/git/truffleruby-ws/graal/sdk/mxbuild/darwin-aarch64/GRAALVM_F2FA4FF4E3_JAVA21/graalvm-f2fa4ff4e3-java21-24.0.0-dev/Contents/Home/languages/ruby/lib/truffle/truffle/cext_ruby.rb:42
Truffle::Interop receiver: 0x1048f4910
Truffle::Interop arg: 0
Truffle::Interop arg: RubyBasicObject@35d474e0<Truffle::CExt::RArrayPtr>
Truffle::Interop arg: RubyBasicObject@bdb63c6<CallbackHolder>
Truffle::Interop interop: InteropLibraryGen.CachedDispatchFirst@44d06edc
Truffle::Interop translateInteropExceptionNode: TranslateInteropExceptionNodeGen.Inlined@626dfe75
LLVMSourceFunctionType::attachSourceArgumentInformation: 0, -1, 0, 32
LLVMSourceFunctionType constructor: 0, -1, 0, 32
LLVMSourceFunctionType::attachSourceArgumentInformation: 1, -1, 64, 64
LLVMSourceFunctionType constructor: 1, -1, 64, 64
LLVMSourceFunctionType::getSourceArgumentInformation(0)
LLVMSourceFunctionType::getSourceArgumentInformation sourceArgumentInformationList: 
SourceArgumentInformation(bcArgIdx=0, srcArgIdx=-1, offset=0, size=32)
SourceArgumentInformation(bcArgIdx=1, srcArgIdx=-1, offset=64, size=64)
LLVMSourceFunctionType::getSourceArgumentInformation info: SourceArgumentInformation(bcArgIdx=0, srcArgIdx=-1, offset=0, size=32)
LLVMForeignCallNode::create bitcodeParameterInfo: SourceArgumentInformation(bcArgIdx=0, srcArgIdx=-1, offset=0, size=32)
LLVMForeignCallNode::create sourceArgIndex: -1

truffleruby: an internal exception escaped out of the interpreter,
please report it to https://github.com/oracle/truffleruby/issues.

Index -1 out of bounds for length 3 (java.lang.ArrayIndexOutOfBoundsException)
    from com.oracle.truffle.llvm.runtime.interop.access.LLVMInteropType$Function.getParameter(LLVMInteropType.java:853)
    from com.oracle.truffle.llvm.runtime.interop.LLVMForeignCallNode$PackForeignArgumentsNode.create(LLVMForeignCallNode.java:128)
    from com.oracle.truffle.llvm.runtime.interop.LLVMForeignCallNode.<init>(LLVMForeignCallNode.java:204)
    from com.oracle.truffle.llvm.runtime.interop.LLVMForeignFunctionCallNode.<init>(LLVMForeignFunctionCallNode.java:44)
    from com.oracle.truffle.llvm.runtime.interop.LLVMForeignFunctionCallNode.create(LLVMForeignFunctionCallNode.java:48)
    from com.oracle.truffle.llvm.runtime.LLVMFunctionCode.initForeignCallTarget(LLVMFunctionCode.java:471)
    from com.oracle.truffle.llvm.runtime.LLVMFunctionCode.getForeignCallTarget(LLVMFunctionCode.java:480)
    from com.oracle.truffle.llvm.runtime.LLVMFunctionDescriptor$Execute.createCall(LLVMFunctionDescriptor.java:199)
    from com.oracle.truffle.llvm.runtime.LLVMFunctionDescriptorGen$InteropLibraryExports$Cached.executeAndSpecialize(LLVMFunctionDescriptorGen.java:190)
    from com.oracle.truffle.llvm.runtime.LLVMFunctionDescriptorGen$InteropLibraryExports$Cached.execute(LLVMFunctionDescriptorGen.java:165)
    from com.oracle.truffle.llvm.runtime.pointer.NativePointerLibraries$Execute.doNativeCached(NativePointerLibraries.java:81)
    from com.oracle.truffle.llvm.runtime.pointer.NativePointerLibrariesGen$InteropLibraryExports$Cached.executeAndSpecialize(NativePointerLibrariesGen.java:1925)
    from com.oracle.truffle.llvm.runtime.pointer.NativePointerLibrariesGen$InteropLibraryExports$Cached.execute(NativePointerLibrariesGen.java:1871)
    from com.oracle.truffle.api.interop.InteropLibraryGen$CachedDispatch.execute(InteropLibraryGen.java:7887)
    from org.truffleruby.interop.InteropNodes.execute(InteropNodes.java:91)
    from org.truffleruby.cext.CExtNodes$CallWithCExtLockAndFrameAndUnwrapNode.callWithCExtLockAndFrame(CExtNodes.java:265)
    from org.truffleruby.cext.CExtNodesFactory$CallWithCExtLockAndFrameAndUnwrapNodeFactory$CallWithCExtLockAndFrameAndUnwrapNodeGen.executeAndSpecialize(CExtNodesFactory.java:596)
    from org.truffleruby.cext.CExtNodesFactory$CallWithCExtLockAndFrameAndUnwrapNodeFactory$CallWithCExtLockAndFrameAndUnwrapNodeGen.execute(CExtNodesFactory.java:579)
    from org.truffleruby.language.locals.WriteLocalVariableNode.execute(WriteLocalVariableNode.java:28)
    from org.truffleruby.language.RubyNode.doExecuteVoid(RubyNode.java:64)
    from org.truffleruby.language.control.SequenceNode.execute(SequenceNode.java:34)
    from org.truffleruby.core.module.ModuleNodes$DefineMethodNode$CallMethodWithLambdaBody.execute(ModuleNodes.java:1377)
    from org.truffleruby.language.RubyLambdaRootNode.execute(RubyLambdaRootNode.java:84)
/Users/mauro/code/git/truffleruby-ws/graal/sdk/mxbuild/darwin-aarch64/GRAALVM_F2FA4FF4E3_JAVA21/graalvm-f2fa4ff4e3-java21-24.0.0-dev/Contents/Home/languages/ruby/lib/truffle/truffle/cext_ruby.rb:23:in `initialize'
    from test.rb:7:in `<main>'

Apparently, the member variable srcArgIdx of a SourceArgumentInformation instance is somehow assigned the value -1, which is then used as an index into the parameter list and causes everything to blow up. However, I don't know Truffleruby and rice well enough to see how to fix this.

Also, I don't see any correlation to the problem I reported above.

I think I hit a wall here. I really wouldn't know how to proceed. I would really appreciate some hints.

mtortonesi commented 10 months ago

Note that this is related to #3205.

mtortonesi commented 10 months ago

As I traced the problem down to a condition in LLVMForeignCallNode where bitcodeParameterInfo is not null and srcArgIndex is -1, I tried to apply (bovinely, as I really don't know what I am doing here) the following patch:

diff --git a/sulong/projects/com.oracle.truffle.llvm.runtime/src/com/oracle/truffle/llvm/runtime/interop/LLVMForeignCallNode.java b/sulong/projects/com.oracle.truffle.llvm.runtime/src/com/oracle/truffle/llvm/runtime/interop/LLVMForeignCallNode.java
index 2929cf89a44..21b5dfdc2c4 100644
--- a/sulong/projects/com.oracle.truffle.llvm.runtime/src/com/oracle/truffle/llvm/runtime/interop/LLVMForeignCallNode.java
+++ b/sulong/projects/com.oracle.truffle.llvm.runtime/src/com/oracle/truffle/llvm/runtime/interop/LLVMForeignCallNode.java
@@ -104,7 +104,7 @@ public abstract class LLVMForeignCallNode extends RootNode {

                     assert toLLVM[bitcodeArgIdx] == null;

-                    if (bitcodeParameterInfo == null) {
+                    if (bitcodeParameterInfo == null || bitcodeParameterInfo.getSourceArgIndex() == -1) {
                         int currentIdx = prevIdx + 1;

                         LLVMInteropType interopParameterType = interopFunctionType.getParameter(currentIdx);

but it doesn't work. I still get an internal exception:

truffleruby: an internal exception escaped out of the interpreter,
please report it to https://github.com/oracle/truffleruby/issues.

dead handle 0x144fffd50 (com.oracle.truffle.api.CompilerDirectives.ShouldNotReachHere)
    from com.oracle.truffle.api.CompilerDirectives.shouldNotReachHere(CompilerDirectives.java:574)
    from com.oracle.truffle.api.CompilerDirectives.shouldNotReachHere(CompilerDirectives.java:520)
    from org.truffleruby.cext.UnwrapNode$UnwrapNativeNode.raiseError(UnwrapNode.java:107)
    from org.truffleruby.cext.UnwrapNode$UnwrapNativeNode.unwrapTaggedObject(UnwrapNode.java:92)
    from org.truffleruby.cext.UnwrapNodeGen$UnwrapNativeNodeGen$Inlined.execute(UnwrapNodeGen.java:377)
    from org.truffleruby.cext.UnwrapNode.longToWrapper(UnwrapNode.java:270)
    from org.truffleruby.cext.UnwrapNodeGen$Inlined.execute(UnwrapNodeGen.java:143)
    from org.truffleruby.cext.ValueWrapperManager$UnwrapperFunction.execute(ValueWrapperManager.java:407)
    from org.truffleruby.cext.UnwrapperFunctionGen$InteropLibraryExports$Cached.execute(UnwrapperFunctionGen.java:123)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNode$LLVMLookupDispatchForeignNode.doGeneric(LLVMDispatchNode.java:459)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNode$LLVMLookupDispatchForeignNode.doUnknownType(LLVMDispatchNode.java:487)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNodeGen$LLVMLookupDispatchForeignNodeGen.executeAndSpecialize(LLVMDispatchNodeGen.java:1664)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNodeGen$LLVMLookupDispatchForeignNodeGen.execute(LLVMDispatchNodeGen.java:1487)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNode.doForeignExecutable(LLVMDispatchNode.java:380)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNodeGen.executeAndSpecialize(LLVMDispatchNodeGen.java:723)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMDispatchNodeGen.executeDispatch(LLVMDispatchNodeGen.java:305)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMCallNode.doCall(LLVMCallNode.java:82)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMCallNodeGen.executeGeneric(LLVMCallNodeGen.java:37)
    from com.oracle.truffle.llvm.runtime.nodes.api.LLVMFrameNullerExpression.doGeneric(LLVMFrameNullerExpression.java:71)
    from com.oracle.truffle.llvm.runtime.nodes.api.LLVMFrameNullerExpressionNodeGen.executeGeneric(LLVMFrameNullerExpressionNodeGen.java:29)
    from com.oracle.truffle.llvm.runtime.nodes.vars.LLVMWriteNodeFactory$LLVMWritePointerNodeGen.execute_generic1(LLVMWriteNodeFactory.java:1370)
    from com.oracle.truffle.llvm.runtime.nodes.vars.LLVMWriteNodeFactory$LLVMWritePointerNodeGen.execute(LLVMWriteNodeFactory.java:1344)
    from com.oracle.truffle.llvm.runtime.nodes.base.LLVMBasicBlockNode$InitializedBlockNode.execute(LLVMBasicBlockNode.java:154)
    from com.oracle.truffle.llvm.runtime.nodes.control.LLVMDispatchBasicBlockNode.dispatchFromBasicBlock(LLVMDispatchBasicBlockNode.java:116)
    from com.oracle.truffle.llvm.runtime.nodes.control.LLVMDispatchBasicBlockNode.doDispatch(LLVMDispatchBasicBlockNode.java:87)
    from com.oracle.truffle.llvm.runtime.nodes.control.LLVMDispatchBasicBlockNodeGen.executeGeneric(LLVMDispatchBasicBlockNodeGen.java:33)
    from com.oracle.truffle.llvm.runtime.nodes.control.LLVMFunctionRootNode.doRun(LLVMFunctionRootNode.java:81)
    from com.oracle.truffle.llvm.runtime.nodes.control.LLVMFunctionRootNodeGen.executeGeneric(LLVMFunctionRootNodeGen.java:34)
    from com.oracle.truffle.llvm.runtime.nodes.func.LLVMFunctionStartNode.execute(LLVMFunctionStartNode.java:102)
/Users/mauro/code/git/mtortonesi/rice/include/rice/rice.hpp:1137:in `call'
    from /Users/mauro/code/git/truffleruby-ws/graal/sdk/mxbuild/darwin-aarch64/GRAALVM_F2FA4FF4E3_JAVA21/graalvm-f2fa4ff4e3-java21-24.0.0-dev/Contents/Home/languages/ruby/lib/truffle/truffle/cext_ruby.rb:42:in `each'
    from test.rb:3:in `<main>'

(Note that this trace belongs to the enum sample of rice, which I tried because I thought it might be easier to run than the callback sample.)