exoplanet-dev / exoplanet

Fast & scalable MCMC for all your exoplanet needs!
https://docs.exoplanet.codes
MIT License

gp.predict eating harddrive #126

Closed jradavenport closed 3 years ago

jradavenport commented 3 years ago

When following the excellent case study for modeling starspot variability, modeling GJ1243 from Kepler, the xo.optimize works basically as expected (yay)!

I then tried to generate a prediction (mean and variance) at a slightly higher time resolution, 38k data points total... this caused exoplanet to eat the remaining ~50 GB of my hard drive, and not release it until I restarted my machine. A new model was not produced...

Is this just a byproduct of using giant arrays w/ Kepler data?

My code is here. https://gist.github.com/jradavenport/790fd58e8723acd621fbd712708379c8

The offending line is:

with model:
    mu1, var1 = xo.eval_in_model(gp.predict(xnew, return_var=True), map_soln)
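For scale, a single dense cross-covariance between the training and prediction points is already enormous in float64. The training-set size below is a rough guess, not the exact number from the case study:

```python
# Back-of-envelope memory for one dense cross-covariance matrix.
# n_train is a rough guess at the Kepler light-curve length; n_pred is
# the 38k prediction points above. float64 = 8 bytes per entry.
n_train, n_pred = 33_000, 38_000
bytes_dense = n_train * n_pred * 8
print(f"{bytes_dense / 1e9:.1f} GB")  # ~10 GB for a single intermediate array
```

A few temporaries of that size, spilling into swap, is enough to fill the remaining disk.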
dfm commented 3 years ago

This isn't really a bug! Computing the variance or covariance scales poorly with both the number of data points and the number of points where you're predicting. I'd recommend two things:

  1. Only predict the mean if at all possible. If you can't, loop through the times where you want to predict in chunks and just predict a few at a time.
  2. Switch to using celerite2, since there are no GPs in exoplanet anymore! I've done some work so that the version in celerite2 should scale a bit better now (computationally and memory-wise).
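Suggestion 1 can be sketched with plain NumPy (this is not exoplanet's internals; the squared-exponential kernel, noise level, and chunk size are placeholder assumptions). The idea is to compute the predictive mean and variance one chunk at a time, so no (n_pred, n_train)-sized array ever has to exist all at once:

```python
import numpy as np

def sq_exp(x1, x2, amp=1.0, ell=1.0):
    """Squared-exponential kernel (a placeholder, not the case study's term)."""
    return amp**2 * np.exp(-0.5 * (x1[:, None] - x2[None, :]) ** 2 / ell**2)

def predict_chunked(x_train, y_train, x_new, yerr=0.1, chunk=1000):
    # Dense O(n^3) training solve, just for illustration; exoplanet's celerite
    # solver does this part in O(n). The point is the prediction loop below.
    K = sq_exp(x_train, x_train) + yerr**2 * np.eye(len(x_train))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))

    mu = np.empty(len(x_new))
    var = np.empty(len(x_new))
    for i in range(0, len(x_new), chunk):
        xs = x_new[i:i + chunk]
        Ks = sq_exp(xs, x_train)           # only (chunk, n_train) at a time
        mu[i:i + chunk] = Ks @ alpha
        v = np.linalg.solve(L, Ks.T)       # (n_train, chunk)
        var[i:i + chunk] = np.diag(sq_exp(xs, xs)) - np.einsum("ij,ij->j", v, v)
    return mu, var
```

Only the diagonal of the predictive covariance is ever formed, and the per-chunk cross-covariance is the largest temporary, so peak memory scales with chunk * n_train rather than n_pred * n_train.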