"frame times may be negatively affected" is not quite the right message

Hixie commented 2 years ago

We have these checkboxes that add more instrumentation: https://github.com/flutter/devtools/blob/master/packages/devtools_app/lib/src/performance/performance_screen.dart#L389

We have a message that says that "frame times may be negatively affected" when these are on, which is true. However, I'm not sure that fully represents the issue with these flags. People might see that and think, ok, I'll ignore the numbers, but the relative sizes of work loads is still a guide as to what I should work on... which is wrong. It's possible for a widget to take 5ms with these flags off, and be the main problem that causes jank, but for these flags to cause other widgets to take 5ms as well while the problematic widget continues to only take 5ms, and now the source of the jank has been obscured.

The problem is that the frame times can be affected in non-representative ways.

Anyway I tried to submit a PR to fix this but I could not come up with wording that was accurate and comprehensible. So I'm filing this bug instead.

cc @InMatrix

InMatrix commented 2 years ago

It's hard to come up with an appropriate description without some ground-truth data. I think we need to measure the impact of those flags on frame times with some test cases on common hardware and then come up with reasonable wording.

Another direction we could consider is to only allow those flags in the debug mode if we don't have confidence in the frame times at all even their relative changes. Furthermore, we might consider measures of workload sizes other than time (e.g., # of widgets rebuilt, # of layouts performed, # of specific Skia operations, etc).

Hixie commented 2 years ago

I think we need to measure the impact of those flags on frame times with some test cases on common hardware and then come up with reasonable wording.

It basically makes the times have no relationship to the "real" times. Every widget and render object gets a constant additional time, regardless of how long it normally takes, and there's hundreds of them so the net effect is that the whole frame time becomes extended by a factor of N where N is the number of widgets/render objects, with that extra time spread uniformly (whereas performance issues are non-uniformly distributed in the original "real" data).

I've argued that we should disable these in profile builds before, but others have argued they're useful there anyway... cc @goderbauer @dnfield

dnfield commented 2 years ago

These flags can help when there are big problems, like discovering that text layout is taking up the vast majority of your frame layout time.

I'm not really sure how any of these flags could take 5ms and not add a similar amount of time to the problematic widget - it'd be helpful if we had examples where that happens. I do know when you're looking for very small and repeated things, they're not helpful at all. @jonahwilliams was saying recently that even the stock tracing we have that's not behind these flags added a lot of noise to CPU profiling when looking for dominators during build.

Hixie commented 2 years ago

The 5ms thing was just an exaggerated example, not intended to be a concrete case, sorry for the confusion.

InMatrix commented 2 years ago

Having some concrete examples and data to discuss would be most helpful at least for me to offer any suggestions from a UX standpoint. I have some perf examples from past user studies that might be useful: https://github.com/InMatrix/perf_diagnosis_demo.

jonahwilliams commented 2 years ago

calls to the timeline can easily dominate in any sort of profile. I can see how timeline calls are useful if you're looking at the timeline view itself, but if you use the (very nice I might add) cpu flamegraph it is completely worthless because everything bottoms out in timeline.

This situation could be significantly improved by improving the performance of the timeline, IIRC this was also a complaint from folks working on fuchsia on low end devices.

Also there is a sort of alternative use for the debug build widgets functionality. Rather than looking at the execution time, folks were using it to determined what widgets were built in a frame to attempt to diagnose overbuilding.

Hixie commented 2 years ago

Also there is a sort of alternative use for the debug build widgets functionality. Rather than looking at the execution time, folks were using it to determined what widgets were built in a frame to attempt to diagnose overbuilding.

That's the main use case. That, and figuring out where you have particularly deep trees and such.

jonahwilliams commented 2 years ago

Yeah, I think in that case you would probably do better building a different mechanism of reporting. You could build up a list of the same data that is being sent to the timeline today and push that to devtools (requiring some sort of new view) and it would almost certainly be faster. I bet you could even create a relatively compact encoding that didn't cost as much as runtimeType.ToString().

The disadvantage is you dont know the exact timing, but I think we've established that with these flags on the timing is useless anyways.

Hixie commented 2 years ago

Maybe. There's the trade-off that now you have two systems, which costs time in building and maintaining both, as well as explaining them, etc. And at the end of the day, it's not clear that you've really won much -- after all, if we could make this faster, we should do so in the Timeline code too, where it actually matters!

That's rather outside the scope of this issue, though, which is about making the text clearer. :-)

flutter / devtools

"frame times may be negatively affected" is not quite the right message #3530