ruby / debug

Debugging functionality for Ruby
BSD 2-Clause "Simplified" License
1.13k stars 126 forks source link

Performance issue when stepping through a deep callstack #760

Open marianosimone opened 2 years ago

marianosimone commented 2 years ago

Your environment

Describe the bug When stepping through with a deep callstack, the debugger becomes slower with each step.

To Reproduce In a codebase which has a method that's ~150 calls deep in the stack, add a breakpoint, debug, and step through.

Expected behavior It wouldn't be slow

Additional context We are facing this issue at Stripe, and we tracked down the problem to: https://github.com/ruby/debug/blob/296daa6742e114395a4ce336b8721d48a6e5e52f/ext/debug/debug.c#L92-L98

This seems to be generating the full backtrace through rb_make_backtrace, just to then take the size. rb_make_backtrace (source) generates an array of strings each time it's called. Those strings end up not being used, and after a couple of iterations, you end up with a big memory footprint, which causes the GC to collect them, significantly slowing down the debugger.

Luckily, it looks like you are already aware that this is inefficient (given the // TODO: more efficient API) :)

Looking into alternatives, I came up with a couple of options (but open to any other ideas):

  1. Use rb_threadptr_backtrace_object to skip the formatting of the backtrace, saving some extra memory (but still generating more than necessary)
  2. debase solves this with a different approach: https://github.com/ruby-debug/debase/blob/5780803819f16190ce50ddfdad5775acf8f95fd1/ext/hacks.h#L26

Both of them, though, require the inclusion of *_core.h, which is not currently possible (debase solves it by using its own version of ruby_core_source)

st0012 commented 2 years ago

FYI, I think @WillHalto is already addressing this in #743 and #746

WillHalto commented 2 years ago

Yeah this issue makes stepping and finishing quite slow for large callstacks.

I think those ^ PRs should help quite a bit, with both changes together I measured a ~99% improvement in next performance https://github.com/ruby/debug/pull/743#issuecomment-1250131651 for an example case. Similar improvement for our internal use case in a large rails monolith.

However one thing to note is that the optimization in https://github.com/ruby/debug/pull/746 won't be applicable for ruby 2.7, as rb_profile_frames will skip C frames for ruby < 3.0 so I don't think we can use it for all versions.

marianosimone commented 2 years ago

Unfortunately, #473 doesn't seem to have a significant impact in the examples I've tried. It does reduce time/memory consumption, but if the callstack is mostly small methods (so the ration of :return to :line high), it doesn't make a difference.

Given that #746 is for ruby > 3.0, we can't take advantage of it, either.

edit: That being said, even if it doesn't make a big impact in the cases I've tried, I think these optimizations are great and DO cut down on a lot of unnecessary work :)

WillHalto commented 2 years ago

Gotcha, yeah #743 makes the most difference when the ratio of :line to :return is flipped.