Open ivoanjo opened 7 years ago
great summary, as noted on the other thread I am also interested in reification being improved/reliable.
probably in a way you mentioned (don't care for having it on by default - backward compat first), there's one catch and that is the reusal of reification for become_java!
so it needs to be done carefully while at the same time sharing as much as possible (there's a third separate mechanism for generating Java proxies - ideally it could also become somehow part of the show and very optimistically work on edge cases such as when a Java sub-class in Ruby.become_java!
).
based on these requirements I see anything else as an optional bonus (-Xreify.classStorage
redundant?)
Ok so I think this is one of two active class reification issues outstanding, and I'll muse on some solutions here.
I suspect the problem in your case is actually a bug in how we handle super within these trampoline methods. As you discovered, the super call is starting its search at the bottom of the class hierarchy again (in your custom class) rather than at the Object level or higher. As a result it hits the super again and blows the stack.
We can fix that issue by ensuring the trampolines are passing in the correct class, I think.
The more complicated issue here is that there are many such methods that when reified may overwrite core class behavior and act oddly. I'm thinking perhaps we should get class reification working again, but only for methods you explicitly say to export.
This may or may not become a default feature, but I think it should (re)built based on the variable reification logic that we use all the time now. Ultimately all these classes should be generated, and if you opt into it there should be a bottom class for each Ruby type so you can see it in profiles, etc.
In the short term...I'm considering changing the reify.classes flag to at least start generating a real subclass (below the specific-width reify.variables classes) so people can get the benefits of real classes until we fix the method mapping.
Oh and to answer the question about trampolining: yes, this is a large part of the reason reification exists: to provide a way for Java interfaces (and plain instance methods) to be implemented such that they'll dynamically dispatch into Ruby and pick up any lazily-defined methods (or method_missing).
Ok, moving right along. I just committed logic that generates all the RubyObject0 etc classes along with their ObjectAllocator0. This cleans up a lot of the reification code around instance variables.
The next step would be generating a specific subclass of those width-specific specialized classes that has the name of the Ruby class in question.
For simple cases where the object descends from Object, this will be no problem. We make sure the width-specific class is available, and then generate a new subclass (and allocator, likely) specific to the Ruby class name.
For the more complicated cases where there are other reified classes between ours and Object, things get a little tricky.
For example, the following case:
class Foo
def initialize
@foo = 1
end
end
class Bar
def initialize
super
@bar = 0
end
end
Bar.new
The Foo class is easy: its Java hierarchy will be `Foo < RubyObject1 < ReifiedRubyObject < RubyObject.
What should Bar's Java hierarchy be?
Bar < Foo < ...
but then we can't reify the @bar
variable and it would have to be in an array table, since we've already done our specialization in this hierarchy.Bar < RubyObject2 < ReifiedRubyObject < RubyObject
. Obviously this would break some usages from Java, since Ruby would think that a Bar is a Foo, but Java would not. This would also make it harder to e.g. profile for all instances of Foo and its subclasses.getVariable1
style methods for @bar
. The down side here is that for all second-level or higher user class (i.e. at least one intervening class before reaching Object) we'd basically duplicate those specific-index methods every time.The current reification logic for instance variables did not have this problem, because it did not enforce any Java-level hierarchy to match the Ruby hierarchy. Both Foo and Bar would simply use the base RubyObject1 or RubyObject2.
I'll ponder this a bit tonight and see if I can think of anything.
There's a fourth option: don't reify instance variables if we're reifying classes. That's basically what we had before, except that reifying variables was always on and interfered with reifying classes.
On the pro side this would simplify "structural" reification of the class name and methods. On the con side this would mean losing the memory and performance characteristics of field-based instance variables in exchange for structural reification.
At this point I'm leaning toward just duplicating the get/set logic in each generated class when there's a hierarchy to be maintained, so each class knows how to get its own variables and all variables get reified properly.
In order to have my example above for Bar < Foo
produce only those two reified classes, Foo must reify first with var0, and Bar must copy its instance variable accessors plus add another for var1. From then on the two classes would evolve additional ivars independently, using an array table to hold them. But the inspectable vars would be reified immediately.
Note also this work may make it possible to re-specialize a given class at a later time if it turns out there's many dynamically-created instance variables. Ideally all classes that aren't changing a lot at runtime would eventually produce tightly-packed Java objects.
Thanks for tackling this and for the great write-up @headius ! :tada: :rocket:
I would like to merge #4648. Can you confirm it works properly for you?
I have not come up with a good way for us to reify both fields and class identity, given the challenges of supporting parent class fields in children, so the patch in #4648 (which disables one type of reification when you enable the other) may be the best option going forward.
The unfortunate down side is that field reification would be useful to have with class identity reification, since it makes it simpler to drill down into object trees in a heap dump. However it's only marginally better than the var table, since none of the original instance variable names are used (instead it's value0, value1, etc). The best option would be to use the actual names from Ruby code, reify the fields, and still maintain the parent/child class structure in the reified classes. I'm not sure how to do that right now.
I'm still alive :) I'm moving so the last few days have been a bit hectic but I'll test and report back ASAP.
Just replied on https://github.com/jruby/jruby/pull/4648#issuecomment-328285522
I've merged in the "new" reify variables logic, which will now generate proper-shaped RubyObject subclasses for any number of statically-inspectable instance variables.
$ jruby -e 'eval "class Foo; def initialize; #{ 100.times.map {|i| "@foo#{i} = 1"}.join("\n")}; end; end"; p JRuby.ref(Foo.new).getClass'
class org.jruby.gen.RubyObject100
This does nothing to fix the combination of reify.classes and reify.variables, nor does it fix reification in the presence of become_java!. The complexities are many:
So there's a lot of questions to answer about reification of variables, class names, hierarchies. But it's moving in the right direction.
Bumping to post 9.2.
Thanks for picking this up! I'm still very interested in the use-Java-performance-tooling-to-peek-at-JRuby angle, so I'm anxiously following new developments :)
Well that is an interesting point. If this feature is focused on presenting named types to the jvm, a lot of the tricky bits above could be ignored. I had hoped to unify all of the different forms of reification, including become_java, but that may be a bit too ambitious given the oddities of the class structure. So if all we really wanted or needed was proper names for the classes, it would not be at all difficult to just generate one more subclass of the specialized type. Basically just support the simplest class reification for heap profiling and other jvm tools.
Yeap, I think being able to use JVM tools easily (or even out of the box), especially in production would be a major value-add.
I have a local patch I'm testing that adds reify.variables.name
to generate the specialized classes with their "real" names:
[] ~/projects/jruby $ jruby -e "module Foo; class Bar; end; end; puts JRuby.ref(Foo::Bar.new).getClass"
org.jruby.gen.RubyObject0
[] ~/projects/jruby $ jruby -Xreify.variables.name -e "module Foo; class Bar; end; end; puts JRuby.ref(Foo::Bar.new).getClass"
org.jruby.gen.Foo.Bar
Woot, it's working!
$ jruby -J-Xrunhprof:depth=0 -Xreify.variables.name -S gem list > /dev/null
Dumping Java heap ... allocation sites ... done.
$ grep ' org.jruby.gen' java.hprof.txt
56 0.24% 83.26% 325440 5803 328520 5858 300000 org.jruby.gen.Gem.Dependency
59 0.22% 83.93% 299248 9336 333840 10417 300000 org.jruby.gen.Gem.Requirement
60 0.22% 84.15% 294472 1268 294472 1268 300000 org.jruby.gen.Gem.Specification
99 0.08% 89.15% 111320 1260 111408 1261 300000 org.jruby.gen.Gem.StubSpecification
142 0.05% 92.03% 71032 1260 71088 1261 300000 org.jruby.gen.Gem.StubSpecification.StubLine
197 0.03% 94.14% 33960 598 33960 598 300000 org.jruby.gen.Gem.Version
226 0.02% 94.77% 25688 630 25688 630 300000 org.jruby.gen.Gem.NameTuple
I've merged in that new property. It doesn't address the problems with become_java!
in the presence of reify.variables
so this is still an open issue, probably to address in 9.3.
The intermediate mode using reify.variables.name
solves the main problem here, in that it provides a way to use JVM tooling to see real classes without breaking on core class overrides.
Running the following script produces the subsequent output in a VisualVM heap dump:
script:
class Foo; end
ary = 10000.times.map { Foo.new }
gets
The remaining problems overriding existing methods from fully reified classes (with methods) remains and will not be fixed for 9.3.
@byteit101 This is an interesting problem you might have some thoughts on, since you worked closely with the reification code for #5270.
Really awesome, thanks for the update! :)
Lots of work has happened in reification since this issue, so I think it's worth punting remaining issues to 9.5.
@byteit101 This might be something you can help with, since you have had your fingers in the other reification logic for real Java subclasses.
Here's another interesting trigger, not sure if the issue is exactly the same:
class A;
attr_reader :id
end
puts A.new.object_id.inspect
❯ JRUBY_OPTS="-Xreify.classes=true" ./vendor/jruby/bin/jruby -v
jruby 9.4.6.0 (3.1.4) 2024-02-20 576fab2c51 OpenJDK 64-Bit Server VM 17.0.4.1+1 on 17.0.4.1+1 +jit [arm64-darwin]
❯ JRUBY_OPTS="-Xreify.classes=false" ./vendor/jruby/bin/jruby /tmp/test2.rb
4000
❯ JRUBY_OPTS="-Xreify.classes=true" ./vendor/jruby/bin/jruby /tmp/test2.rb
nil
❯ JRUBY_OPTS="-Xreify.variables.name" ./vendor/jruby/bin/jruby /tmp/test2.rb
4000
We've stumbled upon this in the wild after a user enabled reify.classes by mistake in production.
Environment
jruby 9.1.10.0 (2.3.3) 2017-05-25 b09c48a Java HotSpot(TM) 64-Bit Server VM 25.131-b11 on 1.8.0_131-b11 [linux-x86_64]
Linux maruchan 4.11.0-041100-generic #201705041534 SMP Thu May 4 19:36:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Ubuntu 17.04
Expected Behavior
When investigating failures while using the
-Xreify.classes
option and after fixing the issue in #4642, it turns out that then I got bit by #3793.I spent a few minutes looking at why
causes an infinite recursion:
In a nutshell, this seems to happen because when calling
super
, we eventually reach theRubyKernel
, which then calls back into the object itself, assuming that the onlyto_s
defined on a ruby object instance is the method inRubyBasicObject
.This assumption is true... except when we are using
-Xreify.classes
. This is because theReificator
actually adds ruby methods to the class it generates, with a "trampoline" that allows Java code to directly call methods on ruby classes.Of course, when the class itself overrides methods from
BasicObject
and callssuper
, thenRubyKernel
will instead call the overridden version of the method, rather than the one it intended to call. (E.g. conceptually, we'd want RubyKernel calling those methods withINVOKESPECIAL
).The problem above can thus be trivially fixed by just deleting https://github.com/jruby/jruby/blob/9.1.10.0/core/src/main/java/org/jruby/RubyClass.java#L1547-L1726 .
I'm opening this issue to mostly sum up what I've discovered and to pose a question: Is this behavior of having the generated "trampoline methods" still a thing JRuby wants to have?
I ask because I see
-Xreify.classes
being useful for two purposes:To have ruby instances that correspond to their class, rather than
RubyObject
and friends. This is invaluable when debugging memory issues as otherwise it's very hard to see what's going on in VisualVM and friendsTo have a Java class that contains the methods of the ruby class, allowing them to be called by Java code
I believe that supporting just 1) without 2) would be an extremely valuable tool for debugging memory problems with JRuby applications, as otherwise it's very complex to navigate heap dumps.
Fixing
-Xreify.classes
for this use case can just be deleting the offending code, or alternatively adding a-Xreify.classStorage
that enables reification but does not generate the "trampoline methods" would be fantastic. Would any of these be solutions you'd consider accepting?Otherwise, the alternative as I see it would be to probably duplicate or convert the java methods on
RubyBasicObject
into static methods, so they could be called directly, rather with dynamic dispatch, but that would probably be a more complex endeavor.