plasma-umass / scalene

Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals
Apache License 2.0

Scalene missing files when profiling. #384

Open JCrane512 opened 2 years ago

JCrane512 commented 2 years ago

Describe the bug

I have a directory structure mainDir/subDir1/entry_point.py

I have another directory mainDir/subDir2/program_file.py

I am running scalene on the CLI with the command:

$ nohup scalene --outfile test_profile.json --json mainDir/subDir1/entry_point.py --Arg1 --Arg2 --Arg3

When I do this no files in subDir2 are profiled.

When I add the argument --profile_all to the previous command, the only files profiled are entry_point.py and some internal python libraries.

If I add the argument --profile_only 'mainDir' it still does not profile subDir2.

Expected behavior

Unsure. I can see the logic of why subDir2 is not included by default. However, the profile_all and profile_only options appear to be broken on scalene 1.5.6.


Additional context

Essentially, I am after some clarity on how to deal with this issue.

My current workaround to profile the whole codebase was to create a new Python file, mainDir/scalene_hack.py, and have it call entry_point.py.

All this new file scalene_hack.py does is import entry_point.py and then call entry_point.py's main function.
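A minimal sketch of what such a wrapper might look like, assuming entry_point.py exposes a main() function and lives in mainDir/subDir1. The helper name run_entry_point and its parameters are hypothetical and only serve to illustrate the workaround.

```python
# mainDir/scalene_hack.py (hypothetical wrapper; profile this file instead of entry_point.py)
import importlib
import sys
from pathlib import Path


def run_entry_point(entry_dir, module_name="entry_point", func_name="main"):
    """Import module_name from entry_dir and call its func_name()."""
    # Make the entry point's directory importable.
    sys.path.insert(0, str(Path(entry_dir).resolve()))
    module = importlib.import_module(module_name)
    # Call the entry point's main function and pass its result through.
    return getattr(module, func_name)()


# In the real wrapper, finish with a call such as:
# run_entry_point(Path(__file__).resolve().parent / "subDir1")
```

Because the wrapper sits in mainDir, both subDir1 and subDir2 are subdirectories of the profiled program's directory, which is presumably why this workaround picks up the files in subDir2.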

JCrane512 commented 2 years ago

Oh and I would like to say I have found using scalene immensely valuable at work. Thank you very much!

dskhudia commented 2 years ago

@emeryberger +1 How can scalene profile a reasonably complex project that has imports spread over several directories?

emeryberger commented 2 years ago

Just to clarify, the options are --profile-all, --profile-only, and --profile-exclude. Note that the words are hyphen-separated; with underscores, they won't work at all.

By default, Scalene only profiles code in the directory of the program being profiled (and any subdirectories), unless one or more of these options are present.
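As a rough mental model of this default, here is a small sketch of the file filtering. It is not Scalene's actual code: the function is_profiled and its parameters are hypothetical, and it assumes that --profile-exclude matches substrings of the file path. --profile-only is left out because its interaction with the default is not spelled out here.

```python
import os


def is_profiled(filename, program_dir, profile_all=False, profile_exclude=()):
    """Illustrative model of which files get profiled (not Scalene's implementation)."""
    filename = os.path.abspath(filename)
    # Assumed --profile-exclude behavior: skip paths containing any given substring.
    if any(part in filename for part in profile_exclude):
        return False
    # --profile-all: everything else is profiled, including Python's own libraries.
    if profile_all:
        return True
    # Default: only files in the profiled program's directory or its subdirectories.
    root = os.path.abspath(program_dir)
    return os.path.commonpath([filename, root]) == root
```

Under this model, with mainDir/subDir1 as the program's directory, mainDir/subDir2/program_file.py is skipped by default because it is a sibling directory, not a subdirectory.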

I hope this helps!

dskhudia commented 2 years ago

Thanks for the help on these options.

Is it supposed to work on a project installed in "editable" mode with pip? I tried it on such a project (https://github.com/mosaicml/composer) and the returned profile was very sparse (attached). I used the --reduced-profile and --profile-all options with scalene.

Let me know if you want a reproducer and I can send the instructions.

[Attached screenshots: Screen Shot 2022-04-15 at 1 13 55 PM, Screen Shot 2022-04-15 at 1 13 46 PM]

emeryberger commented 2 years ago

If you use --profile-all, it really means everything, including Python's own libraries (here, threading.py) - maybe add --profile-exclude python3.8 so it will ignore Python's libraries. (And yes, always helpful to have a repro case!)

dskhudia commented 2 years ago

Yes. I understand. I was expecting other code called by trainer.fit, for example, to show up in the profile but it doesn't. Let me get you a small repro.

dskhudia commented 2 years ago

While creating a small repro I realized that usage of --reduced-profile was hiding most of the code from being reported as most of the time was spent on GPU. With the small repro, I do see values reported at different lines. In any case, here is the repro.

git clone https://github.com/mosaicml/composer.git composer
cd composer/
python3 -m venv venv
source venv/bin/activate
pip install -e .
pip install scalene
# Run the following to make sure it works 
python3 examples/run_composer_trainer.py -f composer/yamls/models/classify_mnist_cpu.yaml --validate_every_n_epochs -1

# Profiling command for cpu training
scalene --reduced-profile --profile-exclude python3.8 --profile-all --outfile results.txt examples/run_composer_trainer.py -f composer/yamls/models/classify_mnist_cpu.yaml --validate_every_n_epochs -1

# Profiling command for gpu (if available)
# may need to update pytorch 
# pip3 install torch==1.10.1+cu113 torchvision==0.11.2+cu113 torchaudio==0.10.1+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
scalene --reduced-profile --profile-exclude python3.8 --profile-all --outfile results.txt examples/run_composer_trainer.py -f composer/yamls/models/classify_mnist.yaml --validate_every_n_epochs -1

emeryberger commented 2 years ago

Note: needed to add --datadir datadir for this to work. Working on it, thanks!

chen-liang-323 commented 11 months ago

+1 on this, looking forward to the updates

chen-liang-323 commented 11 months ago

I think --program-path is the argument we need for this; setting it to mainDir should work in this case. I can't find out which version first added this argument.
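An untested sketch of how that invocation could be assembled, reusing the original command from this issue. The helper scalene_command is hypothetical, and the idea that --program-path widens profiling to mainDir is the assumption made in this comment, not something confirmed in the thread.

```python
import shlex


def scalene_command(script, script_args=(), program_path=None,
                    outfile=None, json_output=False):
    """Build a scalene argv list; note the hyphen-separated option names."""
    cmd = ["scalene"]
    if program_path is not None:
        # Assumption: this directory (and its subdirectories) becomes the profiled tree.
        cmd += ["--program-path", program_path]
    if json_output:
        cmd.append("--json")
    if outfile is not None:
        cmd += ["--outfile", outfile]
    # The profiled script and its own arguments come last.
    return cmd + [script, *script_args]


cmd = scalene_command(
    "mainDir/subDir1/entry_point.py",
    ["--Arg1", "--Arg2", "--Arg3"],
    program_path="mainDir",
    outfile="test_profile.json",
    json_output=True,
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to subprocess.run.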