jhawthorn / vernier

📏 next generation CRuby profiler
https://vernier.prof/
MIT License

Single thread profiling? #45

Closed: mperham closed this issue 4 months ago

mperham commented 7 months ago

Now that Ruby 3.3 supports thread-specific profiling, I would love to see Vernier add support for this so I can show people how to profile a single job in production.

A Sidekiq middleware like this might work:

class JobProfiler
  include Sidekiq::ServerMiddleware

  def call(_worker, job, _queue, &block)
    return yield unless job["profile"]

    jid = job["jid"]
    klass = job["class"]
    Vernier.trace(out: "/tmp/#{klass}-#{jid}.profile.json", mode: :thread, &block)
  end
end

Sidekiq.configure_server do |config|
  config.server_middleware.add JobProfiler
end

$ irb
> SomeJob.set(profile: true).perform_async

You'd just need to add support for mode: :thread or similar.

jhawthorn commented 7 months ago

Hi Mike. Vernier profiles all threads and records them separately, so I'm not sure what value there is in profiling a specific one. You can filter by thread in the viewer.

We aren't currently using the rb_profile_thread_frames API as I don't believe it can be used safely without sending a signal, and since we're sending a signal we might as well use the old API.

mperham commented 7 months ago

I was thinking that profiling a single thread allows the user to focus on that one thread. Sidekiq has ~10 different threads running by default so a full profile is quite complex and "noisy" to the viewer.

It may also be less overhead in production if Ruby knows it does not need to collect full samples while the other 9 threads are running. Are there really no benefits?

jhawthorn commented 7 months ago

I'm very keen to make Vernier work as well as possible for Sidekiq users; that was one of the inspirations behind its design. I'm definitely open to changes that improve that.

I don't think overhead is much of a concern. Because Vernier is based around tracing the GVL, we don't even collect a sample for threads that aren't running; we just increment the weight of that thread's previous sample. Even when we do collect a sample, in Vernier it's tiny: just a few integers, versus the stack and line numbers Stackprof would record in raw mode.
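To make the weighting idea concrete, here is a minimal Ruby sketch. It is an illustration only, not Vernier's actual implementation: the `SampleLog` class, the `tick` method, and the integer `stack_id` are all hypothetical names invented for this example. It assumes each recorded sample is a small pair of integers and that a tick for a non-running thread only bumps the weight of that thread's last sample.

```ruby
# Hypothetical sketch of weighted sampling; NOT Vernier's real code.
class SampleLog
  # A sample is just two integers: an id for an interned stack, and a weight.
  Sample = Struct.new(:stack_id, :weight)

  attr_reader :samples

  def initialize
    @samples = []
  end

  # Called once per sampling tick for a single thread.
  # running:  whether the thread was running (held the GVL) at this tick
  # stack_id: id of the current stack; ignored when the thread is not running
  def tick(running:, stack_id: nil)
    last = @samples.last
    if !running && last
      # Thread is not running, so its stack cannot have changed:
      # extend the previous sample instead of recording a new one.
      last.weight += 1
    else
      # Running thread (or no earlier sample to extend): record a new sample.
      @samples << Sample.new(stack_id, 1)
    end
  end
end

log = SampleLog.new
log.tick(running: true, stack_id: 7)
log.tick(running: false)
log.tick(running: false)
log.tick(running: true, stack_id: 9)
p log.samples.map(&:to_a)
```

In this sketch, four ticks produce only two stored samples, which is why idle threads add very little to the recording cost.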

What difficulty are you seeing when viewing multi-threaded Vernier profiles of Sidekiq? I'd definitely want to solve that. We have an example in the README (which only uses two threads, but it could have recorded any number) and I think it's really great. The Firefox Profiler, which we've chosen as our viewer, is designed for viewing multithreaded programs and drilling down to the issue being investigated.


I think it's better, assuming it's viable, to include all the data we can on the profiling side and filter on the viewer. Say, for example, if you were profiling one job and found that it was stalled waiting on the GVL, wouldn't you want to check the other threads to see who was responsible?

One thing we might be able to use and improve: we have control over which thread(s) are initially selected and initially visible in the viewer. Do you think that would achieve what you're looking for?

(For reference, these are the keys in the profile JSON I would want to set: https://github.com/firefox-devtools/profiler/blob/53cd026bffd8230fb3dc300220102d8ccd7bb99c/src/types/profile.js#L911-L918)
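As a rough illustration of that idea, the sketch below post-processes an already-parsed profile and marks only the matching threads as initially visible and selected. It rests on assumptions: the key names `initialVisibleThreads` and `initialSelectedThreads` under `meta` are my reading of the Firefox Profiler profile format and should be checked against the definition linked above, `focus_threads` is a hypothetical helper that is not part of Vernier, and the thread names are made-up sample data.

```ruby
require "json"

# Hypothetical helper, not part of Vernier. Assumes the profile is a Hash with
# a "threads" array (each entry having a "name") and a "meta" Hash, and that
# the viewer reads thread indexes from meta.initialVisibleThreads and
# meta.initialSelectedThreads (key names are an assumption; verify them).
def focus_threads(profile, name_pattern)
  threads = profile.fetch("threads")
  indexes = threads.each_index.select do |i|
    threads[i]["name"].to_s.match?(name_pattern)
  end
  # Leave the profile untouched if nothing matches, so the viewer
  # falls back to its default thread selection.
  return profile if indexes.empty?

  meta = (profile["meta"] ||= {})
  meta["initialVisibleThreads"] = indexes
  meta["initialSelectedThreads"] = [indexes.first]
  profile
end

# Made-up sample data standing in for a real profile file.
profile = JSON.parse(<<~PROFILE)
  {
    "meta": {},
    "threads": [
      {"name": "main"},
      {"name": "sidekiq.default/processor"},
      {"name": "heartbeat"}
    ]
  }
PROFILE

focus_threads(profile, /processor/)
puts JSON.generate(profile["meta"])
```

All thread data stays in the file, so a user who finds their job stalled on the GVL can still reveal the other threads in the viewer.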

mperham commented 7 months ago

I think it's a great idea to collaborate and find rough edges where we can make Vernier profiling jobs easier for all. I think for the majority of Rubyists, that UI can be daunting and/or inexplicable.

I have my weekly Office Hour tomorrow 9am Pacific, I'd extend a special invite to you if you're available to chat.

https://sidekiq.org/support.html

jhawthorn commented 4 months ago

We discussed this, and the main incentives for single-thread profiling would be to reduce file size and to avoid confusing the user. I think both goals are more easily met by other changes: #56 to avoid user confusion and #47 to reduce file size even further.