mstange opened 7 years ago
For comparing markers, I'd like to see statistical information about the reported markers (probably in a new pane). For example, for the duration of GCMinor events I'd like to see the min, 25th percentile, mean, 75th percentile, 90th percentile, and max. When comparing profiles I'd like to see box-and-whisker diagrams for these stats, possibly run t-tests or other statistical tests, and so on. I can help with this.
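As a rough illustration of the stats pane idea, here is a minimal sketch. It assumes marker durations have already been extracted into a plain array of milliseconds (the real profile format is more involved); the function and field names are hypothetical, not part of any existing API.

```javascript
// Hypothetical sketch: summary statistics over marker durations (ms).
// `durations` would come from filtering a thread's markers by name
// (e.g. "GCMinor") and taking end - start for each marker.

function percentile(sorted, p) {
  // Linear interpolation on an already-sorted array.
  if (sorted.length === 0) return NaN;
  const idx = (sorted.length - 1) * p;
  const lo = Math.floor(idx);
  const hi = Math.ceil(idx);
  const frac = idx - lo;
  return sorted[lo] + (sorted[hi] - sorted[lo]) * frac;
}

function markerStats(durations) {
  const sorted = [...durations].sort((a, b) => a - b);
  const mean = sorted.reduce((sum, d) => sum + d, 0) / sorted.length;
  return {
    min: sorted[0],
    p25: percentile(sorted, 0.25),
    mean,
    p75: percentile(sorted, 0.75),
    p90: percentile(sorted, 0.9),
    max: sorted[sorted.length - 1],
  };
}
```

With stats like these per profile, the box-and-whisker comparison is then just a rendering question.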
Sounds like we need:
I feel this can be made possible by ideas we were having about improving the header timeline by only showing one thread/activity row by default. We'd need to be able to:
This feature is kind of intense so let me know if I seem to be misunderstanding anything 😅
Also, re: call stack comparison, would it be helpful/possible to display the two profiles in two columns like a diff tool, connecting/highlighting the parts that are different and using some green success color to indicate the side that's more performant?
This should also use #448
> I feel this can be made possible by ideas we were having about improving the header timeline by only showing one thread/activity row by default.
My comment was for showing one thread for the Timeline panel, not the header. So I think we need a mechanism to select which threads we are interested in comparing.
> Auto-select the most interesting thread of the profile and display only its events
I'm not sure how this would work or if it's feasible.
> Also, re: call stack comparison, would it be helpful/possible to display the two profiles in two columns like a diff tool, connecting/highlighting the parts that are different and using some green success color to indicate the side that's more performant?
The problem here is that the stacks won't always line up. A top functions view would be much easier to do this with, and would be very nice. The call tree is dependent on the order that code is called. Often performance comparisons are about changing that order to make certain things faster.
What I would possibly like to do is use the call tree and focus in on a part of it, then do the same for a second profile. Afterwards I would like to compare the timing of the top functions. I could see a diff-like approach working for that.
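The top-functions comparison described above could be prototyped along these lines. This is only a sketch: it assumes each profile has already been reduced to a map of function name to total self time in ms (the reduction itself is the hard part and is not shown), and all names here are hypothetical.

```javascript
// Hypothetical sketch of a "top functions" diff between two profiles.
// `before` and `after` map function name -> total self time (ms).
function diffTopFunctions(before, after) {
  const names = new Set([...Object.keys(before), ...Object.keys(after)]);
  const rows = [...names].map(name => ({
    name,
    before: before[name] || 0,
    after: after[name] || 0,
    // Positive delta: this function got slower in `after`.
    delta: (after[name] || 0) - (before[name] || 0),
  }));
  // Biggest absolute changes (regressions or improvements) first.
  rows.sort((a, b) => Math.abs(b.delta) - Math.abs(a.delta));
  return rows;
}
```

Unlike a call tree diff, this is insensitive to call order, which matches the point above that performance fixes often reorder calls rather than change what runs.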
@mstange had some thoughts on computing a diff call tree. I'm not sure if that's written down somewhere.
@violasong did I miss anything?
Here are some mockups - let me know if you have any feedback! Also, do any of the other panels need design work for this?
Importing a profile into another profile:
Profile comparison with difference tree:
We need to file break-out issues for this.
Suggest starting with the Call Tree comparison view and waiting on the "top thing"
Not planned for 2017 in the current scope.
Phase 1 should include:
fitzgen had a link from an older bug that may be worth reading up on:
Note that go's builtin profiling tools support this and it seems pretty neat: https://github.com/bradfitz/talk-yapc-asia-2015/blob/master/talk.md
Florian wrote an experimental script to help compare two profiles. I tweaked it to accommodate my use case, and I think it's a very good starting point that helps prototype how we could compare stacks and frames!
Here is my fork, it is run like this:
```
$OBJDIR/dist/bin/run-mozilla.sh $OBJDIR/dist/bin/xpcshell compare-profile-alex.js p1.profile p2.profile
```
You can omit the run-mozilla.sh part if you are not on Linux. The two profiles passed as arguments have to be Talos profiles fetched directly from Talos, like this Talos zip file.
This script does three things:
All the credit goes to Florian. His original script supports profiles fetched from the perf-html server rather than the ones from Talos, and accepts a folder as argument; it picks out the slowest and the fastest profile and compares them. It also contains hardcoded marker strings that you will want to modify.
This is amazing! What is the difference between the pseudo-threads "First profile" and "Only first"? Are stacks for the two profiles being kept separate in the respective pseudo-threads?
TBH, I've not reviewed exactly what is being done, but here are Florian's words about it:
> It creates a 3rd and 4th profile showing samples that appear only in the first profile or only in the second.
>
> - **First profile** is the GeckoMain thread of the first profile passed as a command-line argument.
> - **Second profile** is the same, but for the second.
> - **Only first** shows only the samples that are in the first but not in the second.
> - **Only second** shows only the samples that are in the second but not in the first.
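The "only first" / "only second" splitting could be prototyped roughly like this. This is a sketch under simplifying assumptions, not Florian's actual implementation: samples are represented here as arrays of frame names, and two samples count as the same when their entire stack matches.

```javascript
// Hypothetical sketch: keep only the samples of `a` whose exact stack
// (serialized as a frame-name path) never occurs among `b`'s samples.
function onlyIn(a, b) {
  const seen = new Set(b.map(stack => stack.join("|")));
  return a.filter(stack => !seen.has(stack.join("|")));
}
```

The real script works on the processed profile format, where stacks are interned in tables rather than stored as string arrays, but the set-difference idea is the same.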
It would be great to get quick feedback from someone with strong knowledge of the profile data structures to validate what is being done here. Also, ideas on what we could do next would be welcome. I imagine Florian and I would be happy to experiment with more ideas in this script, but anyone is welcome to hack on it as well.
@ochameau great, more people experimenting with Florian's script is what I was hoping for, as a lot of the processing can be prototyped with some script processing.
> I tweaked it to accommodate my use case, and I think it's a very good starting point that helps prototype how we could compare stacks and frames!
Florian did not have success answering his question with his past tests, as the result was too noisy and had too little overlap. Were you able to find your answer in the generated profile?
> Were you able to find your answer in the generated profile?
I used it on profiles I had already analysed manually, but it was great to see that it immediately surfaced what I suspected to be different. When comparing manually it is always hard to know whether the thing you think is different isn't hidden somewhere else in the call tree or flame chart, especially when some frames or calls are split into many pieces. With this simple diff, you are much more confident! And I'm especially confident because it matches the manual analysis, confirming that both I and the script are most likely correct ;)
Unfortunately I haven't had time to look at Florian's script yet, but I'd like to brain-dump my thoughts before my PTO.
So here is what I had in mind:
From @bgrins, this is a good use case: https://bugzilla.mozilla.org/show_bug.cgi?id=1505944#c0
From @ochameau, this is a diff he got from a modified version of Florian's script: https://perfht.ml/2UAR9X2
This is the base profile: https://perfht.ml/2UFTZtN
and this is the profile for the regression about a function called observeActivity: https://perfht.ml/2UCl8xI
(possibly not the exact same ones as the ones used for the diff)
This works by going to https://perf-html.io/compare and entering the 2 URLs in the 2 input fields. This UX obviously isn't perfect.
Here are possible next steps I can see:
@julienw since this landed, should we close the issue?
I don't think so, cf. the 2 latest comments above :-) This is something of a meta bug for more work around this topic. Until we plan to spend time on this, I don't think it's worth filing all the bugs, but maybe we could file the ones that contributors could pick up. I'll look into it!
Sometimes you want to measure the performance impact of a change. You'll usually get a profile from before the change and a profile from after the change, and then you want some way of comparing the two.
At the moment, the only way of doing that is to load the profiles in two different tabs and to repeatedly switch between the two. That's not a great experience.
It would be nice to be able to select two profiles to compare, maybe pick one thread from each of them, and then have a few ways of comparing them:
Comparing profiles is a big subject and we'll probably need to experiment with a bunch of different views until we find something that works.