benfred / py-spy

Sampling profiler for Python programs
MIT License

Confusing line numbers? #549

Open ArthurConmy opened 1 year ago

ArthurConmy commented 1 year ago

Hi there,

I'm having trouble understanding the line numbers displayed in the flamegraph produced by py-spy. When I run my script (text file here) with the following command:

py-spy record --format=speedscope -o profile.json python gpt2.py

I notice that the line numbers in the resulting flamegraph (displayed as a plaintext JSON file at the following link) appear to be shifted by 1 compared to my actual script. For example, at the 57.9 second mark, calls to .backward(), .step(), .zero_grad() and .empty_cache() are logged at lines 274, 277, 278, and 279 in the flamegraph, but they occur at lines 273, 274, 278, and 279 in my script.
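A minimal sketch for pulling the reported line numbers out of the speedscope file, so they can be compared with the script side by side. It assumes the profile follows the speedscope file format, where frames are listed under `shared.frames` with `name`, `file`, and `line` keys; the helper names `load_profile` and `frames_for_file` are made up for this example and are not part of py-spy.

```python
import json


def load_profile(path):
    """Read a speedscope JSON profile from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def frames_for_file(profile, file_suffix):
    """Return sorted, de-duplicated (line, name) pairs for one source file.

    Assumes the speedscope layout: profile["shared"]["frames"] is a list of
    dicts that may carry "name", "file", and "line" keys.
    """
    frames = profile.get("shared", {}).get("frames", [])
    pairs = {
        (frame["line"], frame.get("name", "?"))
        for frame in frames
        # Frames without a file or line (e.g. native frames) are skipped.
        if (frame.get("file") or "").endswith(file_suffix)
        and frame.get("line") is not None
    }
    return sorted(pairs)
```

For the command above, something like `frames_for_file(load_profile("profile.json"), "gpt2.py")` would list every line of gpt2.py that appears in the profile.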

I would appreciate if you could explain this discrepancy and help me understand how to properly interpret the line numbers in the flamegraph.

Thank you!

itamarst commented 1 year ago

I think this is a bug. I'm seeing a similar issue where the reported line number is clearly wrong: py-spy says it's line 91, but based on the rest of the stack it's actually line 93 (Python 3.10, Linux). Line 92 is a blank line, so semantically it's off by one.

This is from py-spy dump on a deadlocked process, so the issue isn't sampling getting things wrong.

(Austin reports the correct line number.)
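
The "semantically off by one" remark can be illustrated with CPython's own line table. A blank line produces no bytecode, so it has no entry in a code object's line table; a decoder that picks a neighbouring entry could therefore skip straight over the blank line. The sketch below is an illustration only, using the standard-library `dis.findlinestarts`. It does not reproduce py-spy's own decoding, and the sample source and the helper name `lines_with_code` are made up for the example.

```python
import dis

# Hypothetical four-line module; line 3 is intentionally blank.
SOURCE = "a = 1\nb = 2\n\nc = a + b\n"


def lines_with_code(source):
    """Return the sorted source lines that own at least one bytecode instruction."""
    code = compile(source, "<example>", "exec")
    return sorted(
        {
            line
            for _offset, line in dis.findlinestarts(code)
            # Skip entries with no source line (None on newer Pythons) and
            # the synthetic line 0 some versions attach to the module prologue.
            if line is not None and line > 0
        }
    )


print(lines_with_code(SOURCE))
```

The blank line 3 never shows up in the result, so stepping one entry back from line 4 lands on line 2, a two-line gap in the source for a one-entry slip in the table.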