PAHdb / pyPAHdb

A Python tool to decompose astronomical PAH emission into contributing PAH subclasses.
https://www.astrochemistry.org/pahdb/
BSD 3-Clause "New" or "Revised" License

Add code profiling to identify matrix operation bottlenecks #16

Open mattjshannon opened 6 years ago

mattjshannon commented 6 years ago

It would be useful to see whether we can save some time in the matrix operations. Apparently the cProfile module from the standard library is recommended for this; I will look into it.
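As a rough illustration of what profiling with cProfile could look like, here is a minimal sketch that uses only the standard library. The `matrix_multiply` function is a hypothetical stand-in for the real matrix operations, and `profile_call` is a helper name invented for this example; neither comes from pyPAHdb.

```python
import cProfile
import io
import pstats


def matrix_multiply(a, b):
    """Naive pure-Python matrix product (hypothetical stand-in workload)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def profile_call(func, *args, top=10):
    """Run func(*args) under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args)
    finally:
        profiler.disable()
    # Collect the report in memory, sorted by cumulative time.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


if __name__ == "__main__":
    size = 60
    a = [[float(i + j) for j in range(size)] for i in range(size)]
    _, report = profile_call(matrix_multiply, a, a)
    print(report)
```

Sorting by cumulative time puts the functions that dominate the total runtime at the top of the report, which is usually the quickest way to spot a bottleneck.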

PAHdb commented 6 years ago

When running example_fits.py, the performance hit is really due to writing out the PDF with Matplotlib. Possibly there are some improvements that can be made there? On my system, example_fits.py takes 1:36 when writing the PDF and 4 seconds when not writing the PDF file ...

PAHdb commented 6 years ago

There appear to be several options for speeding up the PDF production. While there are several alternative plotting packages, they all come with additional requirements, e.g., Qt. On the other hand, several people suggest reusing the axes, etc., to speed up Matplotlib.
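A minimal sketch of the axes-reuse idea, assuming a multi-page PDF written with Matplotlib's `PdfPages`: one figure, one axes, and one line object are created up front, and only the data and title change per page. The `write_pages` function and the sample data are hypothetical and are not taken from example_fits.py; whether this actually shortens the runtime would need to be measured.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; no display needed
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages


def write_pages(target, spectra):
    """Write one PDF page per (x, y) pair, reusing one figure, axes and line."""
    fig, ax = plt.subplots()
    (line,) = ax.plot([], [])  # created once, updated for every page
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    pages = 0
    with PdfPages(target) as pdf:
        for index, (x, y) in enumerate(spectra):
            line.set_data(x, y)   # swap in the new data
            ax.relim()            # recompute data limits
            ax.autoscale_view()   # rescale the axes to the new limits
            ax.set_title(f"spectrum {index}")
            pdf.savefig(fig)
            pages += 1
    plt.close(fig)
    return pages


if __name__ == "__main__":
    buffer = io.BytesIO()
    data = [([0, 1, 2], [0, 1, 4]), ([0, 1, 2], [3, 2, 1])]
    print(write_pages(buffer, data), "pages written")
```

The point of the design is to avoid building a new figure and axes for each page; only the artists that change are updated before each `savefig` call.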

[update] It might be good to put some example output images/plots in the README and usage ...

mattjshannon commented 5 years ago

It looks like snakeviz works quite well for profiling/visualizing the results (https://jiffyclub.github.io/snakeviz/).

It's installable via pip or conda, and you basically just use it thusly...

    python -m cProfile -o profiling_results.prof example_fits.py
    snakeviz profiling_results.prof

(The second command opens the results in your browser.)
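A file written with `python -m cProfile -o` can also be read as plain text with the standard-library `pstats` module, which may be handy when a browser view is not available. The sketch below is only an illustration: `summarize_profile` is a name invented here, and the demo profiles a throwaway statement rather than example_fits.py.

```python
import cProfile
import io
import os
import pstats
import tempfile


def summarize_profile(path, top=10):
    """Return a text report of the `top` entries in a cProfile dump."""
    stream = io.StringIO()
    stats = pstats.Stats(path, stream=stream)
    # strip_dirs() shortens file paths; sort by cumulative time.
    stats.strip_dirs().sort_stats("cumulative").print_stats(top)
    return stream.getvalue()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        prof_path = os.path.join(folder, "profiling_results.prof")
        # Throwaway workload, used only to produce a profile file.
        cProfile.run("sum(range(100000))", prof_path)
        print(summarize_profile(prof_path))
```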

PAHdb commented 5 years ago

Interesting. I've installed the package and run the profiling. Unfortunately, I'm not getting the nice graphs; it looks like it doesn't like spaces in path names ... Still, the bottleneck we have is Matplotlib. You mentioned before that reusing the axes for each plot could speed up the PDF output. My current thinking is that moving away from Matplotlib might not be such a good idea, given its widespread use in the community ...