plasma-umass / scalene

Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals
Apache License 2.0

Improves magic handling #844

Closed emeryberger closed 3 months ago

emeryberger commented 3 months ago

Fixes https://github.com/plasma-umass/scalene/issues/843 by providing more robust magic detection and removal.
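
For readers unfamiliar with the problem, here is a minimal sketch of one way IPython-style magics could be detected and removed. It is not the code in this pull request; `strip_magics` is a hypothetical helper. The assumptions are that a magic is a line starting with `%` or `!`, and that a line should only be rewritten when the parser itself rejects it, so that a continuation line such as ` % 3` inside a parenthesized expression is left alone. Each removed line is replaced with `pass` rather than deleted, so line numbers in the cleaned source still match the original.

```python
import ast


def strip_magics(source: str) -> str:
    """Replace magic lines with `pass`, keeping line numbers aligned.

    Hypothetical helper for illustration only; not Scalene's implementation.
    """
    lines = source.splitlines()
    while True:
        try:
            ast.parse("\n".join(lines))
            return "\n".join(lines)
        except SyntaxError as exc:
            idx = (exc.lineno or 0) - 1
            if not 0 <= idx < len(lines):
                raise
            stripped = lines[idx].lstrip()
            # Only rewrite lines that look like magics; re-raise otherwise.
            if not stripped.startswith(("%", "!")):
                raise
            indent = lines[idx][: len(lines[idx]) - len(stripped)]
            lines[idx] = indent + "pass"


if __name__ == "__main__":
    demo = "x = 1\n%timeit sum(range(10))\ny = x + 1"
    print(strip_magics(demo))
```

The loop terminates because every pass either succeeds, re-raises, or replaces one magic-looking line with a line that no longer looks like a magic.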

D-VR commented 3 months ago

Trying out this branch in the previous test environment crashes almost every time with the command below (it succeeded once).

scalene --profile-all --profile-exclude threading.py,dataclasses.py --stacks --cli --cpu --memory --json --outfile test.json --cpu-sampling-rate 0.001 test.py --- -h

Below are the two crash types I've encountered. The KeyError (key 2785) is the more common one:

scalene --profile-all --profile-exclude threading.py,dataclasses.py --stacks --cli --cpu --memory --json --outfile test.json --cpu-sampling-rate 0.001 test.py --- -h
<cmd output>
Scalene: An exception of type KeyError occurred. Arguments:
(2785,)
Traceback (most recent call last):
  File "user/venv/lib/python3.12/site-packages/scalene/scalene_profiler.py", line 2153, in run_profiler
    exit_status = profiler.profile_code(
                  ^^^^^^^^^^^^^^^^^^^^^^
  File "user/venv/lib/python3.12/site-packages/scalene/scalene_profiler.py", line 1816, in profile_code
    did_output = Scalene.output_profile(left)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "user/venv/lib/python3.12/site-packages/scalene/scalene_profiler.py", line 871, in output_profile
    json_output = Scalene.__json.output_profiles(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "user/venv/lib/python3.12/site-packages/scalene/scalene_json.py", line 513, in output_profiles
    profile_line["start_region_line"] = enclosing_regions[
                                        ^^^^^^^^^^^^^^^^^^
KeyError: 2785

scalene --profile-all --profile-exclude threading.py,dataclasses.py --stacks --cli --cpu --memory --json --outfile test.json --cpu-sampling-rate 0.001 test.py --- -h
<cmd output>
Scalene: An exception of type SyntaxError occurred. Arguments:
('invalid syntax', ('<unknown>', 1100, 3, '  def _is_in_control_flow(self, op):\n', 1100, 6))
Traceback (most recent call last):
  File "user/venv/lib/python3.12/site-packages/scalene/scalene_profiler.py", line 2153, in run_profiler
    exit_status = profiler.profile_code(
                  ^^^^^^^^^^^^^^^^^^^^^^
  File "user/venv/lib/python3.12/site-packages/scalene/scalene_profiler.py", line 1816, in profile_code
    did_output = Scalene.output_profile(left)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "user/venv/lib/python3.12/site-packages/scalene/scalene_profiler.py", line 871, in output_profile
    json_output = Scalene.__json.output_profiles(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "user/venv/lib/python3.12/site-packages/scalene/scalene_json.py", line 486, in output_profiles
    enclosing_regions = ScaleneAnalysis.find_regions(code_str)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "user/venv/lib/python3.12/site-packages/scalene/scalene_analysis.py", line 110, in find_regions
    tree = ast.parse(src)
           ^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/ast.py", line 52, in parse
    return compile(source, filename, mode, flags,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<unknown>", line 1100
    def _is_in_control_flow(self, op):
    ^^^
SyntaxError: invalid syntax

Removing the entire function leads to no crashes.
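
Both tracebacks end in the same code path in `output_profiles`: the source is parsed with `ast.parse`, and the resulting region map is then indexed by line number. The sketch below shows how both failure modes can arise in that pattern and one way they could be guarded. It is an assumption-laden illustration, not Scalene's code: `sketch_find_regions` and `safe_region` are hypothetical names, and the fallback of treating an unknown line as its own one-line region is my choice, not something confirmed by the project.

```python
import ast


def sketch_find_regions(src: str) -> dict[int, tuple[int, int]]:
    """Map each line inside a function or class to that block's (start, end).

    Hypothetical stand-in for a region finder; raises SyntaxError if the
    source does not parse.
    """
    regions: dict[int, tuple[int, int]] = {}
    # ast.walk is breadth-first, so inner blocks overwrite outer ones.
    for node in ast.walk(ast.parse(src)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            for lineno in range(node.lineno, node.end_lineno + 1):
                regions[lineno] = (node.lineno, node.end_lineno)
    return regions


def safe_region(src: str, lineno: int) -> tuple[int, int]:
    """Look up a line's region without raising KeyError or SyntaxError."""
    try:
        regions = sketch_find_regions(src)
    except SyntaxError:
        # Source could not be parsed: fall back to a one-line region.
        return (lineno, lineno)
    # Line not covered by any block (or past the end of the parsed source).
    return regions.get(lineno, (lineno, lineno))


if __name__ == "__main__":
    code = "class A:\n    def f(self):\n        return 1\nx = 2\n"
    print(safe_region(code, 3))
    print(safe_region(code, 2785))
```

A guard like this only hides the symptom. If the profiled line numbers and the parsed source disagree, the mismatch itself still needs to be tracked down.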

emeryberger commented 3 months ago

Trying a different approach: https://github.com/plasma-umass/scalene/pull/847.