aimalz / chippr

Cosmological Hierarchical Inference with Probabilistic Photometric Redshifts
MIT License

Simplify/speed up analysis #28

Open aimalz opened 7 years ago

aimalz commented 7 years ago

Travis tests are taking a long time with the current demo notebook. I'm going to simplify the tests in the notebook and use separate, local scripts to run more meaningful tests. As with #16, I'm going to return to the same branch.

aimalz commented 7 years ago

The solution to this one might be to choose a different sampling scheme altogether. @drphilmarshall proposed using an importance sampler as an alternative to emcee. The first step is checking that the final simplex from the MMLE is appropriate.

drphilmarshall commented 7 years ago

For this to work, the multivariate Gaussian defined by the inverse hessian output from the optimizer needs to be a reasonable approximation to the posterior PDF - but you have the samples to test this.
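A minimal sketch of that check, assuming the MCMC samples and the optimizer's inverse Hessian are already available as NumPy arrays (names here are illustrative, not chippr's API):

```python
import numpy as np

def laplace_vs_samples(samples, hess_inv):
    """Compare the Laplace-approximation covariance (the optimizer's
    inverse Hessian at the MMLE point) with the empirical covariance
    of the MCMC samples.

    samples  : (n_samples, n_bins) array of posterior samples
    hess_inv : (n_bins, n_bins) inverse Hessian from the optimizer
    """
    sample_cov = np.cov(samples, rowvar=False)
    # Per-bin variance ratio; values far from 1 mean the multivariate
    # Gaussian is a poor approximation to the posterior in that bin.
    ratio = np.diag(hess_inv) / np.diag(sample_cov)
    return sample_cov, ratio
```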


aimalz commented 7 years ago

Answer: no, the optimizer's covariance is not similar to that of the MCMC samples. Does this kill the importance sampling idea?

drphilmarshall commented 7 years ago

Not necessarily, but it makes it riskier.

Is your plot showing the Laplace approximation Gaussian posterior estimate (from the optimizer inverse hessian) in cyan and the sampled posterior in blue? I will assume so and say:

If you could convince yourself and your readers that the optimizer consistently and reliably returned an inverse hessian matrix that gave a Laplace approximation to the posterior that was always 5 times broader than the true posterior, you could rescale the inverse hessian by a suitable factor (like sqrt(5) or perhaps a bit less) and then use it as a biased distribution for importance sampling. (You don't want your biased distribution to be many times broader than your target posterior.) In each of your tests, what is the ratio of the variance of the Laplace approximation to the variance of the true posterior, in each bin? (You could plot this ratio in the z bins, one line per test).
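A minimal sketch of Gaussian importance sampling with a rescaled proposal covariance, in the spirit of the suggestion above; all names (`log_posterior`, `mmle`, `hess_inv`) and the rescaling factor are illustrative stand-ins, not chippr's API:

```python
import numpy as np

def importance_sample(log_posterior, mmle, proposal_cov,
                      n_draws=5000, seed=42):
    """Gaussian importance sampling: draw from a multivariate normal
    centred on the MMLE point with a (suitably rescaled) covariance,
    then weight each draw by posterior / proposal."""
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(mmle, proposal_cov, size=n_draws)
    diff = draws - mmle
    inv_cov = np.linalg.inv(proposal_cov)
    # Unnormalized log proposal density (constants cancel in the weights).
    log_q = -0.5 * np.einsum('ij,jk,ik->i', diff, inv_cov, diff)
    log_p = np.array([log_posterior(d) for d in draws])
    log_w = log_p - log_q
    log_w -= log_w.max()              # guard against overflow
    weights = np.exp(log_w)
    weights /= weights.sum()          # self-normalized weights
    return draws, weights

# e.g. proposal_cov = hess_inv / np.sqrt(5.0) if the Laplace variance
# really were consistently ~5x too broad (the factor is illustrative)
```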

aimalz commented 7 years ago

Your assumptions about the color scheme are correct. Other important information I failed to include: in order to obtain the inverse Hessian, I had to change the optimizer algorithm from Nelder-Mead to BFGS. It claims not to have succeeded due to loss of precision but nonetheless consistently gives the same result for the parameters, the posterior probability, and the inverse Hessian.
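For reference, this is how the BFGS inverse Hessian comes out of `scipy.optimize.minimize`; the toy quadratic objective is only a stand-in for chippr's negative log-hyperposterior:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_post(theta):
    # Toy quadratic standing in for chippr's objective; its true
    # inverse Hessian is the identity matrix.
    return 0.5 * np.sum(theta ** 2)

res = minimize(neg_log_post, x0=np.full(3, 0.5), method='BFGS')
# Nelder-Mead tracks no curvature information, so only gradient-based
# methods like BFGS expose an inverse-Hessian estimate (res.hess_inv)
# usable for a Laplace approximation.
laplace_cov = res.hess_inv
```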

I'll add in plotting the variance ratio over all the bins for each test case and get back to you with the results!

drphilmarshall commented 7 years ago

Good, thanks! One last thought: I don't know the status of emcee's multiprocessing/multithreading support (or whether you are exploiting it, if it exists), but you should be able to use multiprocessing pretty effectively when importance sampling, because each sample's importance evaluation is embarrassingly parallel.
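A sketch of that parallelization, with `log_post` as a hypothetical stand-in for the expensive per-sample posterior evaluation:

```python
import multiprocessing as mp
import numpy as np

def log_post(theta):
    # Stand-in for chippr's (expensive) log-hyperposterior evaluation.
    return -0.5 * float(np.sum(np.asarray(theta) ** 2))

def evaluate_parallel(draws, n_procs=2):
    # Each proposal draw's evaluation is independent of the others, so
    # a simple Pool.map over the draws is embarrassingly parallel.
    with mp.Pool(n_procs) as pool:
        return np.array(pool.map(log_post, [d for d in draws]))
```

This only pays off when each `log_post` call is slow relative to the pickling/dispatch overhead of the pool, which echoes the emcee observation below.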

aimalz commented 7 years ago

Unfortunately, the per-bin variance ratios between different test cases are drastically different. It's not even consistently >1. The figures are now updated here.

I'm not using emcee's multithreading right now because it ends up being slower than not using it. There's a fair bit of overhead, so it only pays off when the function evaluation (the posterior probability) is very slow, which apparently isn't chippr's limiting factor. I do parallelize running the different test cases and could probably parallelize some other functions (under the likely wrong assumption that nothing bad happens when you run multiprocessing on an already multithreaded function).

drphilmarshall commented 7 years ago

Hmm - looks like the optimizer is not returning a very reliable inverse Hessian at all! I guess I am not very surprised. I notice from your figures that the per-bin hyperposterior width does look fairly predictable, though, which makes me wonder whether you could make a rough prediction of it using simple error propagation arguments and then use that to seed an importance sampler. What if you used a constant X% uncertainty to define the diagonal elements of the covariance matrix of your assumed Gaussian biased distribution? I'm not sure this will work well in the cases where the hyperposterior is very narrow - but then, I'm not sure I understand those yet. Do you?

aimalz commented 7 years ago

I'm not sure I agree that the per-bin hyperposterior width is so predictable. Most of what I see is that error bars tend to be larger at low ln[n(z)] values because there aren't many galaxies there, and error bars tend to be smaller at the highest ln[n(z)] values because there are many galaxies there. This behavior depends on knowing the shape of the true ln[n(z)] and could very well change as I scale up the number of galaxies. What other predictions did you confirm?

drphilmarshall commented 7 years ago

I was thinking that the dependence of the hyperposterior width on the number of galaxies in the bin could be captured with some simple Poisson error model. You have the observed number of galaxies in the bin, N, which could be used as an input. A scatter plot of hyperposterior width vs. sqrt(N) could help guide the design of such a model.
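Under a pure Poisson-counting model, sigma(N) ≈ sqrt(N), so the width in ln n(z) scales roughly as 1/sqrt(N). A minimal sketch of that prediction (function name hypothetical):

```python
import numpy as np

def poisson_width_ln_nz(counts):
    """Rough Poisson prediction for the per-bin hyperposterior width:
    sigma(N) ~ sqrt(N), so propagated into ln n(z) the width is
    ~ 1/sqrt(N).  Comparing this against the measured widths (e.g. in
    a scatter plot of width vs. sqrt(N)) would test the simple error
    model suggested above."""
    counts = np.asarray(counts, dtype=float)
    return 1.0 / np.sqrt(counts)
```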
