kevin218 / Eureka

Eureka! is a data reduction and analysis pipeline intended for time-series observations with JWST.
https://eurekadocs.readthedocs.io/
MIT License

Eureka! Code Profiling #308

Open jbrande opened 2 years ago

jbrande commented 2 years ago

Instrument

No response

What happened?

We need to profile Eureka's performance in order to assess major bottlenecks and hopefully speed things up. This is especially important for Stage 3 and Stage 5. I think I can do this with Python's cProfile utility, so that's where I'll be starting.
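As a starting point, here's a minimal sketch of how a stage could be wrapped with cProfile; the Eureka! entry point and its arguments below are just placeholders for whatever we actually call, not the real signature.

```python
import cProfile
import pstats

# Placeholder import/call; substitute the actual Eureka! stage entry point and arguments.
from eureka.S3_data_reduction import s3_reduce

profiler = cProfile.Profile()
profiler.enable()
s3_reduce.reduce('nirspec_template')  # assumed invocation for illustration only
profiler.disable()

profiler.dump_stats('s3_run.prof')  # binary profile, viewable in SnakeViz
pstats.Stats(profiler).sort_stats('cumulative').print_stats(20)
```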

Error traceback output

No response

What operating system are you using?

No response

What version of Python are you running?

No response

What Python packages do you have installed?

No response

Code of Conduct

jbrande commented 2 years ago

We routinely call something like 16 thousand individual methods in a single Eureka! stage, so profiling these has been a little more difficult than I originally thought. I've got a couple of profiles of S3 done and I think I'm interpreting the data properly, and I've profiled S5 and the instrument-specific tests in a couple of ways but haven't gone through them yet.

Here are the results from a "typical" S3 run on the NIRSpec test data with isplots=3. We don't necessarily need profiling to tell us that most of the S3 runtime is spent making plots, but it's always good to get this quantified, so here's a screenshot of the time breakdown: [screenshot: s3_plots_profile]

Nearly two-thirds of the total S3 runtime is spent making and saving images, mostly repeating the same operations. We might want to look into speeding up plotting somehow, possibly with faster matplotlib backends or by making smaller/lower-resolution plots. I'm not super up to date on matplotlib best practices, but it's an avenue to investigate.
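In case it's useful, here's a minimal sketch of the kind of tweaks I mean (non-interactive Agg backend, lower dpi, closing figures promptly); this isn't Eureka!'s current plotting code, just the general pattern.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; avoids GUI overhead when we only save files
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
ax.imshow(np.random.rand(256, 256), origin='lower')  # stand-in for a diagnostic image
# Lower dpi and a raster format keep the many repeated savefig calls cheaper.
fig.savefig('diagnostic.png', dpi=100)
plt.close(fig)  # free memory so thousands of figures don't accumulate
```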

Here's the breakdown when isplots=1: [screenshot: s3_noplots_profile]

Things run much more quickly, and the "bottlenecks" here are the actual reduction steps. Some of the blocks that aren't big enough to be labeled still take up quite a bit of time because they're repeated operations; here I'm thinking of the _find_and_load method in importlib._bootstrap, which takes up 32% of the ~9 second runtime.
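For anyone who wants to poke at this without SnakeViz, the import overhead shows up directly in pstats too (the profile filename below is just an assumed name for the saved run):

```python
import pstats

stats = pstats.Stats('s3_noplots.prof')  # assumed name of the saved binary profile
stats.sort_stats('cumulative')
# List which functions call into the import machinery, to see what triggers repeated imports.
stats.print_callers('_find_and_load')
```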

I can also share the binary profiles themselves; I'm viewing them through SnakeViz (https://jiffyclub.github.io/snakeviz/).

@taylorbell57, let me know if this is at all helpful or not.

jbrande commented 2 years ago

Looking at S5 emcee runs now. Most of the time is spent in likelihood.py, but a lot of this actually looks to be built-in Python/numpy stuff we probably don't have much control over. However, about 50 of the 135 seconds spent in likelihood.py goes to Model.py, either evaluating or updating the models. In update(), a good chunk of this is spent reading EPFs and parsing the coefficients.

[screenshot: s5_profile]
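To illustrate what I mean (this is just a generic caching pattern, not Eureka!'s actual update() code, and it assumes a simple name/value file format), parsing the EPF once instead of re-reading it on every likelihood call would look something like this:

```python
import functools

# Hypothetical helper: parse a parameter file once and cache the result,
# so repeated update() calls don't pay the file-read/parse cost each time.
@functools.lru_cache(maxsize=None)
def load_epf_coefficients(epf_path):
    coeffs = {}
    with open(epf_path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('#'):
                continue
            name, value = line.split()[:2]
            coeffs[name] = value  # kept as strings; real parsing may need types/priors
    return coeffs
```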

In eval(), the biggest time sink seems to be evaluating the transit and systematics models. The batman transit model takes more of this time than the polynomial systematics model, so perhaps when starry gets implemented we might be able to speed up the transit light curves a bit.
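For reference, in batman the expensive part is constructing the TransitModel (it does the setup work like the integration grid), while re-evaluating an existing model with new parameters is comparatively cheap, so reusing one model object across likelihood calls is the usual pattern. A minimal sketch with made-up parameter values, not Eureka!'s code:

```python
import numpy as np
import batman

params = batman.TransitParams()
params.t0 = 0.0           # time of inferior conjunction
params.per = 3.5          # orbital period [days]
params.rp = 0.1           # planet/star radius ratio
params.a = 8.8            # semi-major axis [stellar radii]
params.inc = 87.0         # inclination [deg]
params.ecc = 0.0
params.w = 90.0
params.u = [0.1, 0.3]     # limb-darkening coefficients
params.limb_dark = 'quadratic'

t = np.linspace(-0.05, 0.05, 2000)
model = batman.TransitModel(params, t)   # setup happens once

def transit_flux(rp):
    # Only the parameters change per likelihood call; the model object is reused.
    params.rp = rp
    return model.light_curve(params)
```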

taylorbell57 commented 2 years ago

This is super interesting, Yoni, thanks so much for looking into this! I think improving the update function will help a lot and should be pretty doable. I've been thinking for a while that we're really not doing things quickly in that function, but it's nice to see just how big an effect this has.

For S3 I definitely agree that a tonne of time is spent on plotting. The plot resolution might have some effect on this, but I wonder if that's the main issue or if there's something else we can do. It does seem to be the savefig calls that are the slowest, though, so maybe resolution really is the bigger concern. I wonder if PDF vs. PNG significantly changes things, and if there are any other optimizations like that we can do (like different backends, as you mentioned).
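A quick, rough way to check the PDF vs. PNG question directly (toy figure, not one of our actual diagnostic plots):

```python
import time
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
ax.imshow(np.random.rand(512, 512))  # stand-in for a 2D diagnostic image

# Time the same figure saved in each format; dpi mainly affects the raster output.
for fmt in ('png', 'pdf'):
    start = time.perf_counter()
    fig.savefig(f'timing_test.{fmt}', dpi=150)
    print(f'{fmt}: {time.perf_counter() - start:.3f} s')
plt.close(fig)
```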

taylorbell57 commented 2 years ago

I quite strongly suspect that a significant part of issue #223 is caused by the inefficient update functions in Stage 5, but it's possible dynesty just scales very poorly with the number of data points and/or fitted parameters.

jbrande commented 2 years ago

Yeah, getting into the dynesty profile, it's similar to the emcee profile, except that the nested-sampling prior transforms also take a good bit of time. [screenshot: s5_dynesty_profile]

Reading EPFs, setting attributes, and the transit and systematics models all still seem to be where we have room to improve.
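For anyone unfamiliar with why the prior transforms show up at all: dynesty calls a user-supplied prior_transform for every proposed point, mapping the unit hypercube to physical parameters, so anything slow in there gets multiplied by the number of likelihood calls. An illustrative transform (not Eureka!'s implementation, with made-up parameters):

```python
import numpy as np
from scipy import stats

def prior_transform(u):
    """Map unit-cube samples to physical parameters (illustrative only)."""
    x = np.empty_like(u)
    x[0] = stats.norm.ppf(u[0], loc=0.1, scale=0.01)  # rp/rs with a Gaussian prior
    x[1] = 2459000.0 + 0.1 * (u[1] - 0.5)             # t0, uniform over +/- 0.05 d
    return x
```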

jbrande commented 2 years ago

Just a short update: the proportion of time spent in each operation in S5 is basically the same between instruments for a given set of transit/systematics parameters. It's not super surprising, but it's nice to confirm. I've tested this with NIRCam and NIRSpec, and now I'll branch off into the different eclipse/phase curve/GP possibilities with MIRI.

taylorbell57 commented 2 years ago

I used this profiling information to speed up S5 with PR #434, and looking at the S3 profiling information, I don't think there is anything there that's particularly worth trying to improve. @jbrande, do you think we should keep this issue open, or do you agree that we can close it now?

kevin218 commented 1 year ago

We should use Scalene, a Python CPU/GPU/memory profiler with AI-powered optimization suggestions that is available on GitHub.