nvdv / vprof

Visual profiler for Python
BSD 2-Clause "Simplified" License
3.95k stars 154 forks source link

Heatmap not always accumulate time spent on the right line #116

Open JulienPalard opened 11 months ago

JulienPalard commented 11 months ago
Description

The heatmap calculation works by assigning the elapsed time to the line that previously was executed.

This may be wrong with function calls.

How to reproduce

Try vprof -c h with:

from time import sleep

def i_am_fast():
    return 5

def i_am_slow():
    return sleep(i_am_fast())

i_am_slow()
Actual results

The return 5 is highlighted while it costs ~0s.

Expected results

The return sleep(i_am_fast()) should be highlighted instead as it costs ~5s.

Analysis

There's a nice way to see it:

class Trace:
    def __enter__(self):
        self.original_trace_function = sys.gettrace()
        sys.settrace(self.record_line)
        self.begin = perf_counter()
        print("Start counting at 0")
        return self

    @property
    def elapsed(self):
        return perf_counter() - self.begin

    def __exit__(self, exc_type, exc_value, exc_tb):
        sys.settrace(self.original_trace_function)
        print(f"Exiting after {self.elapsed:.3f}s")

    def record_line(self, frame, event, arg):
        if event == 'line':
            print(f"Executing {frame.f_code.co_filename.split('/')[-1]}:{frame.f_lineno} at start+{self.elapsed:.3f}s")
        if event == 'return':
            print(f"Returning from {frame.f_code.co_filename.split('/')[-1]}:{frame.f_lineno} at start+{self.elapsed:.3f}s")
        return self.record_line

with Trace():
    i_am_slow()

Giving:

Start counting at 0
Executing trace.py:10 at start+0.000s  # the sleep(i_am_fast()) line, calling i_am_fast
Executing trace.py:6 at start+0.000s  # the return 5 line
Returning from trace.py:6 at start+0.000s  # the return 5 returning, note it's fast as excepted to return 5  
Returning from trace.py:10 at start+5.000s # back at the sleep(i_am_fast()) line, but returning AFTER sleeping.
Executing trace.py:26 at start+5.001s That's the __exit__ call.
Exiting after 5.001s

The current vprof interpretation of this trace is to assign the 5s to the trace.py:6 line when receiving the trace.py:26 event, overlooking the time spent between the two (untraced) returns.

Version and platform

Debian Trixie, vprof 0.38.