Closed ToomeyDamien closed 1 year ago
Something funky is going on, for sure, can you try sourcing stackprof from Github, does that resolve it?
I've tried using the current stackprof master 6bbabf1 and had similar results. No app code is included in the flamegraph.
Gem version: 3.0.0 Rails version: 7.0.3.1 Ruby version: 3.1.2p20
Same here when running under Passenger.
Same here
Ruby 3.0.3 Rails 6.1.6.1 rack-mini-profiler 3.0.0 stackprof 0.2.20 (released July 26 @SamSaffron so I don't think this is an out of date dependency issue)
After taking a quick look, it appears the problem is indeed within StackProf. The sampling mode :wall
doesn't appear to understand the changes around threads within Ruby >= 3.0.
The :wall
sample method happens to be the default for both rack-mini-profiler and StackProf gems.
A quick workaround is to provide a different sample mode by using the flamegraph_mode
query param. Here's an example, add this to your url:
?pp=flamegraph&flamegraph_mode=object
The available sample modes for StackProf are: cpu, object, custom and wall (wall is the one not currently working correctly).
I believe the object
sample mode is the most thorough but also the slowest sampling technique. In my experiences, use the object sample mode and help yourself to a better flamegraph anyways!
The flamegraph_mode query param doesn't seem to be working
In order to make it work I used this https://github.com/jlfwong/speedscope/wiki/Importing-from-stackprof-(ruby)
I'm still seeing this issue after several months and updating everything. @AnomalousBit's workaround makes the flamegraph show up properly but doesn't solve the problem as object mode doesn't show me where rendering time is being spent.
When this was happening in December my versions were:
It's still happening after making sure everything is on the latest versions:
In case it's relevant I'm on a 2021 MacBook Pro with an M1 Pro chip running macOS Ventura 13.1 (22C65)
After doing some more digging I don't think it's a Ruby 3 thing as I can't reproduce it when I create a new app (unless I have a different issue to the OP).
What I'm seeing is 2221ms response times in my app but a flamegraph that's only ~40ms long...
22:06:33 web.1 | Completed 200 OK in 2221ms (Views: 2214.9ms | ActiveRecord: 2.2ms | Allocations: 18389)
To try and narrow it down I created a test app that's structurally almost identical to my production app but the flamegraphs display perfectly. Here's my working test app:
rails new rmp-graph-test --css=tailwind --database=postgresql --javascript=esbuild
cd rmp-graph-test
echo gem \"rack-mini-profiler\" >> Gemfile
echo gem \"stackprof\" >> Gemfile
bundle install
rails g scaffold post body:string author:string
open app/controllers/posts_controller.rb # optional, add `sleep 2` to #index action to mimic slow views
rails db:create
rails db:migrate
bin/dev
open "http://localhost:3000/posts?pp=flamegraph"
I then went through everything I could think of to compare the two apps to try and figure out the difference, either confirming it's the same or making it the same in the test app and then rebooting it:
Gemfile
and Gemfile.lock
)config/puma.rb
)config/application.rb
, config/boot.rb
and config.ru
)bin/dev
and Procfile.dev
)config/environment.rb
, config/environments/development.rb
and config/environments/production.rb
)config/initializers/mini_profiler.rb
but all others were identical too, except for config/initializers/i18n.rb
which I left out of my test app)Am I missing anything in that list? If not, what can I check next to try and narrow it down?
This in no way fixes the current issue but as another route for you and others trying to track down a performance problem, you can use the Benchmark module. https://ruby-doc.org/stdlib-2.7.1/libdoc/benchmark/rdoc/Benchmark.html
Wrap some of your code sections in puts Benchmark.measure{ ... }
blocks and it will output to the console or logs how long it took to execute. You can also nest and label each of these blocks to better your understanding. I know it's a bit more manual and doesn't produce a nice graph, but it might get you where you are going until we can hopefully get this fixed.
Some places I've worked would benefit from having specific method call times logged for good measure anyways, this would be pretty trivial using the Benchmark module. Feed these performance metrics into Grafana and enjoy your free New Relic service. 😜
Good luck!
@matt17r
What I'm seeing is 2221ms response times in my app but a flamegraph that's only ~40ms long...
I think your flamegraph is missing two seconds of your response due to your sleep(2)
call in posts_controller.rb
. My best guess is that the sleep()
(source: https://github.com/ruby/ruby/blob/50c6cabadca44b7b034eae5dcc8017154a2858bd/process.c#L5205 ) is literally putting the thread to sleep, making the thread unavailable for statistics collection by the stackprof gem.
The code below takes about 4.5 seconds on my machine and it shows up perfectly in the flamegraph using the example rmp-graph-test app you provided. Try something like this instead of using sleep()
:
values = []
100000000.times { |i| values << i }
Also, it appears the default :wall mode is working fine for me in the latest versions of ruby / rack-mini-profiler for anyone interested. Maintainers have suggested that stackprof version 0.2.23 has potentially fixed the flamegraph / threading problem mentioned throughout this thread. This issue may be closed if others can confirm?
Thanks for the suggestions and help @AnomalousBit.
I finally solved my performance problem on the weekend after taking Nate Berkopec's advice to use rbspy. It's not as convenient as RMP but I still can't get RMP to work. I posted a little write up on Sunday night.
Regarding closing this issue, I had already upgraded from stackprof 0.2.22 to 0.2.23 before opening issue 193 on the stackprof repository and I am still seeing the issue in the actual app I was having performance problems with.
On the other hand, I cannot reproduce the issue in a brand new app, despite my best efforts to make the two apps identical.
So perhaps stackprof 0.2.23 does solve this issue and my particular flamegraph problem was a coincidence or a red herring???
If stackprof 0.2.23 solves it for others I've got no objection to this issue being closed. People can always open a new issue and reference this one if they discover new information while using 0.2.23 or greater. I wish it weren't so but I'm unlikely to be motivated to spend more time investigating this now I've solved my actual performance problem.
P.S. For what it's worth, the sleep 2
in my test app was showing up in the RMP flamegraph, at least in wall mode:
The ~2,180ms that were missing from my RMP flamegraph finally showed up in my rbspy flamegraph as repeated unnecessary calls to create a password for existing users:
Confirmed addressed in latest stackprof versions. If you are seeing this, upgrade stackprof.
I'm running into this with latest RMP 3.1.1 and stackprof 0.2.25. All profiles are captured using pp=async-flamegraph. How can I debug it further?
The flamegraph seems to be incorrect/incomplete when using Ruby 3.0.0 and above.
Steps to reproduce
Gemfile
flamegraph
linkLet me know if I have missed something!
By the way, thanks for the great repo :)