ivoanjo / gvl-tracing

Get a timeline view of Global VM Lock usage in your Ruby app
MIT License
114 stars 7 forks source link

Support for 3.3 and N:M threads #18

Closed etiennebarrie closed 6 months ago

etiennebarrie commented 7 months ago

Fix #17 Paired with @byroot

This PR adds support for Ruby 3.3 by using the new rb_internal_thread_specific_get which allows accessing data specific to a Ruby thread without requiring the GVL to be locked. It also handles N:M threads automatically.

More info here: https://github.com/ruby/ruby/commit/352a885a0f1a4d4576c686301ee71ea887a345e5

In Ruby 3.3, the thread events don't necessarily run from the thread they correspond to, so the thread corresponding to the event is sent in event_data, and now GT_LOCAL_STATE needs to take the thread too.

To handle GC, the thread specific data is wrapped into a TypedData struct and stored in the Ruby thread as a instance variable. This requires getting the GVL, so we have an allocate flag on GT_LOCAL_STATE too, which is false when we're not running from the thread and can't allocate. The hook only runs from its corresponding thread for the RUBY_INTERNAL_THREAD_EVENT_STARTED and RUBY_INTERNAL_THREAD_EVENT_RESUMED events, so we can allocate there.

If the local state hasn't been initialized yet, we just return early. This means we would skipping events from already existing threads that haven't been resumed yet. To avoid this, we initialize all threads when tracing starts. We do this from Ruby, which ensures we have the GVL. We added a spec for that, which passes on main on 3.2, but not on 3.3.

To support N:M threads (and differentiate events from different threads but the same native thread), instead of returning the native thread id, we know return the address of the Ruby thread in tid (on Ruby 3.3).

byroot commented 7 months ago

FYI @jpcamara

etiennebarrie commented 7 months ago

We're not cleaning up the thread specific data. From the Ruby commit:

Note that you should not forget to clean up the set data.

This only applies if the thread hasn't been GC'd yet. When we stop tracing, we should clean up behind ourselves.

Another thing is that the thread name handling is still using the native thread id: https://github.com/etiennebarrie/gvl-tracing/blob/7b484b979f4322bf5057e9edda2bd2163497f510/lib/gvl-tracing.rb#L75

jpcamara commented 7 months ago

FYI @jpcamara

Thanks @byroot, should be a helpful learning experience going through the struct code 👍🏼 . I've found a couple issues trying it out - i'll post some review comments

etiennebarrie commented 6 months ago

I've fixed more indentation crap I had done (4 tabs), and I've also indented the ifdefs inside methods like you had previously done. And also the warnings under 3.2. The diff is annoying but looking without whitespace we can see the last few changes I've done: https://github.com/ivoanjo/gvl-tracing/compare/4a261ed9920e3d486a43be93ba56b3cbde1d3520..7c83ec6f406ee98da279bb0a650cb33db68faa66?w=1

At this point, the traces from the examples look similar enough between 3.2 and 3.3 that I feel like it's mergeable. Feel free to merge or take over the branch if you want to add more before merging.

ivoanjo commented 6 months ago

Thanks! This looks great. I still want to do a bit of manual testing/experimentation before putting out a release, but I'll go ahead press the magic button now :)

ivoanjo commented 5 months ago

Heeey! Has it already been more than a month? :sweat_smile:

I've just released this change in gvl-tracing 1.5.0. I also took a stab at fixing thread names (main gory details in cafbeb7ff6383f6892947e7213542c09a8f155cc), let me know what you think ;)

casperisfine commented 5 months ago

let me know what you think

I'm not too sure about the Ractor compatibility of this, but I'm no expert on them. Other than that, LGTM.