RSE-Sheffield / pando-python

Performance Profiling & Optimisation of Research Code (Python) - Short Course
http://rse.shef.ac.uk/pando-python/

Notebook profiling #23

Closed Robadob closed 7 months ago

Robadob commented 8 months ago

A student today appeared to be a Python notebook user, and they struggled with the command line.

I should investigate how well these profiling tools can work within notebooks. I think they can be triggered programmatically. Unlikely to become the focus of those episodes, but would help these participants.

I think this may have been suggested by someone at the accelerated session too.

Robadob commented 7 months ago

From Neil https://towardsdatascience.com/speed-up-jupyter-notebooks-20716cbe2025

Robadob commented 7 months ago

Function Profiling

It appears %prun -D out.prof some_func() can be used to run a function under cProfile and write the stats to out.prof.

I guess the output file would then need to be downloaded and opened standalone via snakeviz.
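For reference, roughly the same thing can be triggered programmatically with the standard library, without any magics. This is only a sketch: some_func here is a hypothetical stand-in, and the temporary output path is an assumption so the example can be run anywhere (in a notebook it could just be "out.prof").

```python
import cProfile
import os
import pstats
import tempfile


def some_func(n=200_000):
    """Hypothetical stand-in for the function being profiled."""
    return sum(i * i for i in range(n))


# Assumed location: a throwaway directory, so nothing is left in the working dir.
out_path = os.path.join(tempfile.mkdtemp(), "out.prof")

profiler = cProfile.Profile()
result = profiler.runcall(some_func)  # run some_func() under cProfile
profiler.dump_stats(out_path)         # binary stats file, readable by snakeviz

# Quick text summary in an output cell, sorted by cumulative time.
stats = pstats.Stats(out_path)
stats.sort_stats("cumulative").print_stats(5)
```

The pstats summary is plain text, so it at least gives something to look at in the notebook before the file is taken elsewhere for visualisation.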

Line Profiling

line_profiler must be installed and loaded.

!pip install line_profiler
%load_ext line_profiler

The blog then gives the example

%lprun -f estimate_pi estimate_pi()

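Not too clear at first why estimate_pi is specified twice: -f names the function to be profiled line by line, while the trailing estimate_pi() is the statement that actually gets executed.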

It goes on to mention a fancy heat map visualisation for it:

All we have to do is insert the %%heat command at the top of the cell (to use load the extension %load_ext heat after installing with !pip install py-heat-magic) and it allows us to see the completely red while loop obliquing high cost of CPU-time, clearly showing room for optimization.

Benchmarking

%timeit exists, and works much the same as using timeit via the command line.
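The standard library timeit module can also be called directly from a cell, which may be handy when the timings are needed as values rather than printed output. A minimal sketch, assuming a hypothetical some_func; the repeat and number values are arbitrary choices, whereas %timeit picks the loop count itself.

```python
import timeit


def some_func(n=10_000):
    """Hypothetical stand-in for the code being benchmarked."""
    return sum(i * i for i in range(n))


calls_per_repeat = 20

# 5 repeats; each entry is the total time (seconds) for 20 calls.
timings = timeit.repeat(some_func, repeat=5, number=calls_per_repeat)

# The minimum is usually the least noisy figure.
best_per_call = min(timings) / calls_per_repeat
print(f"best of 5: {best_per_call * 1e6:.1f} µs per call")
```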

I'll test out these approaches later in the week to write some content. Not currently planning on updating the exercises to provide notebook versions.

Robadob commented 7 months ago

I found that you can call snakeviz directly inside a notebook, bypassing the need to also call cProfile. This produces the same visualisation website inside an output cell. Filenames/line numbers are crap because they're temporary things created by the notebook, but can't do much about that.

https://coderzcolumn.com/tutorials/python/snakeviz-visualize-profiling-results-in-python