lucko / spark-viewer

Web frontend for spark.
https://spark.lucko.me/
GNU General Public License v3.0

Option to show time taken for each method, regardless of call trace #31

Closed: Njol closed this issue 2 years ago

Njol commented 2 years ago

It's currently very hard to find performance issues in methods that are called often from many places. For example, if a method a is called from 100 different methods and accounts for just 0.2% of total time under each caller, it will be very hard to spot despite being responsible for 20% of the total time.

Thus I suggest an alternative view where methods are listed without any call traces, and the total time spent in each method is displayed (not just self time, but total time summed over all callers).

This could be calculated in the browser, so no change to the plugin or data format is required.

The implementation should be relatively straightforward. Here's what it could look like (in pseudocode, and assuming traces are stored flat, not as a tree):

time_taken = new map<method, time>
for trace in all_traces:
  seen_methods = new set<method>
  for method in trace:
    if !seen_methods.contains(method):
      seen_methods.add(method)
      time_taken[method] += trace.time
display(time_taken)
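
For reference, the pseudocode above could be sketched in TypeScript roughly as follows. This is only an illustration under assumptions: the Trace shape, the flatTotals name, and the sample numbers are hypothetical and are not spark's actual data format or code.

```typescript
// Hypothetical shape (not spark's data format): one sampled call stack
// and the time attributed to it.
interface Trace {
  methods: string[]; // frames, outermost caller first
  time: number;      // e.g. milliseconds
}

// Total time per method across all traces. A method is counted at most once
// per trace, so a recursive method appearing twice in one stack is not
// double-counted.
function flatTotals(traces: Trace[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const trace of traces) {
    const seen = new Set<string>();
    for (const method of trace.methods) {
      if (seen.has(method)) continue;
      seen.add(method);
      totals.set(method, (totals.get(method) ?? 0) + trace.time);
    }
  }
  return totals;
}

// 100 distinct callers each spend 2 ms in `a`, out of a 1000 ms profile.
const traces: Trace[] = [];
for (let i = 0; i < 100; i++) {
  traces.push({ methods: ["caller" + i, "a"], time: 2 });
}
// The remaining 800 ms is spent elsewhere, partly in a recursive method.
traces.push({ methods: ["main", "recurse", "recurse"], time: 800 });

const totals = flatTotals(traces);
console.log(totals.get("a"));       // 200, i.e. 20% of 1000 ms
console.log(totals.get("recurse")); // 800, not 1600
```

A Map is used rather than a plain object so that method names such as toString cannot collide with inherited properties.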

lucko commented 2 years ago

Merged in (see above), thanks for the suggestion!

Njol commented 2 years ago

Wow that was extremely fast, thank you!

Njol commented 2 years ago

@lucko The flat view does not handle recursion properly. If a method occurs twice in a call trace, it will be counted twice, leading to recursive methods reporting inflated times. Only the top-most occurrence of a method should be counted (or bottom-most?).
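
To illustrate the "top-most occurrence only" idea on a tree, here is a rough sketch. It assumes a hypothetical CallNode type whose time already includes its children; this is not spark-viewer's actual data structure, and the numbers are made up.

```typescript
// Hypothetical call-tree node (not spark-viewer's actual types):
// `time` is the total time of the node, including its children.
interface CallNode {
  method: string;
  time: number;
  children: CallNode[];
}

// Per-method total time, counting only the outermost (top-most) occurrence
// of a method on each path from the root.
function totalsByMethod(root: CallNode): Map<string, number> {
  const totals = new Map<string, number>();
  const onPath = new Set<string>(); // methods currently on the call path
  const visit = (node: CallNode): void => {
    const outermost = !onPath.has(node.method);
    if (outermost) {
      totals.set(node.method, (totals.get(node.method) ?? 0) + node.time);
      onPath.add(node.method);
    }
    for (const child of node.children) visit(child);
    // Only the occurrence that added the method removes it again.
    if (outermost) onPath.delete(node.method);
  };
  visit(root);
  return totals;
}

// main (100) -> f (60) -> g (50) -> f (40): f calls itself indirectly.
const tree: CallNode = {
  method: "main", time: 100, children: [{
    method: "f", time: 60, children: [{
      method: "g", time: 50, children: [{
        method: "f", time: 40, children: [],
      }],
    }],
  }],
};

console.log(totalsByMethod(tree).get("f")); // 60; summing both occurrences would give 100
```

The top-most occurrence is the one to keep for total time, because its time already contains every nested occurrence below it.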

Here's a sample profile I want to analyze: https://spark.lucko.me/NUP1Eq2Jbx
Note that Objects.equals is at 63%, but if you expand it a few times, it starts repeating.

lucko commented 2 years ago

Hmm, tricky one.

Knowing the self-time of recursive calls is useful, but as you say, it inflates the total time.
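
As a rough illustration of why self time is safe to sum over recursive occurrences while total time is not, here is a small sketch. The ProfileNode type and the numbers are hypothetical, not the viewer's real data model.

```typescript
// Hypothetical node type: `time` includes the time of all children.
interface ProfileNode {
  method: string;
  time: number;
  children: ProfileNode[];
}

// Self time of a node is its total time minus the time of its children.
// Self-time slices never overlap, so every occurrence can be summed,
// including recursive ones.
function selfTimes(root: ProfileNode): Map<string, number> {
  const self = new Map<string, number>();
  const stack: ProfileNode[] = [root];
  while (stack.length > 0) {
    const node = stack.pop()!;
    let childTime = 0;
    for (const child of node.children) {
      childTime += child.time;
      stack.push(child);
    }
    self.set(node.method, (self.get(node.method) ?? 0) + node.time - childTime);
  }
  return self;
}

// main (100) -> f (60) -> g (50) -> f (40)
const profile: ProfileNode = {
  method: "main", time: 100, children: [{
    method: "f", time: 60, children: [{
      method: "g", time: 50, children: [{
        method: "f", time: 40, children: [],
      }],
    }],
  }],
};

const selfByMethod = selfTimes(profile);
console.log(selfByMethod.get("f")); // 50 = (60 - 50) + 40
```

In this example the self times (main 40, f 50, g 10) add up to the root's 100, whereas adding the total times of both f occurrences would report 100 for f alone.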

I've added a fix in 6948a3c1339b39814b53b48d06628ad7989c325c, and also moved the preprocessing into a web worker, which should improve the load time. :)