oracle / truffleruby

A high performance implementation of the Ruby programming language, built on GraalVM.
https://www.graalvm.org/ruby/
Other
2.98k stars 180 forks source link

Allow splitting of Truffle::CExt.set_mark_list_on_object. #3541

Closed nirvdrum closed 2 months ago

nirvdrum commented 2 months ago

While looking at performance of a production Rails application, I saw a lot of splitting attributed to the wrappers that TruffleRuby creates for rb_define_method. Ultimately, I think this boils down to WriteObjectFieldNode, which is used by set_mark_list_on_object, reporting polymorphism. I had traced the problematic node back to a usage of NIO::Selector within Puma, where an object is apparently shared across threads causing the WriteObjectFieldNode instance to become polymorphic.

I do not have a simple reproduction. I tried to use the nio4r example code to induce the behavior, but to adequately handle the multi-threaded part I found I was basically rewriting Puma's reactors. Instead, I deployed this change out to my production environment and observed that the repeated splitting of simple methods like Integer#== ceased.

eregon commented 2 months ago

The call chain is

It feels suboptimal to store the mark_list as a DynamicObject property, because I guess we can see many objects with different classes there, but maybe they have few different Shapes? (I need to check) We used to have DataHolder which could have been a good candidate to store that list, but that's gone since d8d3daf8e2b5840c36c9122cee82656a5fcaaa28 and replaced by storing that state directly in a calloc'd struct RTypedData* to improve performance and simplify.

Maybe we could store it on the ValueWrapper? I think that should be fine since this a RubyDynamicObject so there is a guaranteed 1-1 mapping between both. OTOH it would make all wrappers bigger when only rdata/rtypedata and NativeArrayStorage arrays need this.

Or maybe we could use some special RubyDynamicObject subclass for RData/RTypedData and then we could store it there as a plain Java field. But we need to handle RArray too somehow. From looking at rb_data_object_wrap/rb_data_typed_object_wrap in cext.rb they use __layout_allocate__ which is not affected by rb_define_alloc_func, so IOW we should have total control over the representation there and be able to use a specific RubyDynamicObject subclass.

One other thing we could do is use a variant of WriteObjectFieldNode which doesn't report polymorphism, but it's kind of hiding the problem we have polymorphism and ideally we wouldn't. Or stop reporting polymorphism in WriteObjectFieldNode in general, but IIRC that hurts performance on some benchmarks at least.

nirvdrum commented 2 months ago

I took a look at the generated CExtNodesFactory file before and after this PR and the file contents do not change. The difference I was seeing must be related to something else. The only deployment difference was the addition of @ReportPolymorphism, so I'll have to try to figure out what the difference was. Fortunately, I can roll back to previous container images -- it's just a slow process.