Could you give me an example of this? I ran through examples of exactly this problem and found no such underestimation. A simpler version is also one of the unit tests the code has to pass on every commit.
Hi Josh,
You can find an example in this Google Colab notebook:
https://colab.research.google.com/drive/185dr7zEhr34sGZS52F5NAmdSDf_0yhn0
Unless I did something wrong, dynesty generates a biased estimate of the covariance.
The notebook looks good. When I ran tests I also got underestimates which were consistent with yours.
As is good practice with any sampler, I tried two simple fixes: (1) increasing the number of particles ("live points", "chains") from 500 to 1000 during sampling and (2) increasing the number of slices (the default sampling strategy is slice sampling for 30 dimensions, per the docs) from 5 to 10. This should confirm whether it's just a sparse sampling problem or an auto-correlation problem (see #163), respectively, or whether something more insidious (and worrisome!) is going on.
Luckily, it turns out that doing (1) pretty much fixes things -- I find no underestimate that way. Doing (2) also improves results, but still underestimates things by ~3%. So it looks like the sampling is just non-optimal and correlated, partially driven by bad covariance estimates at runtime (which are used in slice sampling). In general, you want the number of particles to be ~D^2 to get a good covariance estimate while sampling, so this makes sense to me.
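Roughly, the two changes look like this (a sketch only, reusing the loglike, ptform, and ndim from your notebook and keeping the pfrac weighting; for ndim = 30 the ~D^2 rule suggests on the order of 900 live points, so 1000 is a reasonable target):

# (1) More live points during both the initial and batch runs
dsampler = dynesty.DynamicNestedSampler(loglike, ptform, ndim)
dsampler.run_nested(wt_kwargs={'pfrac': 1.0}, nlive_init=1000, nlive_batch=1000)

# (2) More slices per slice-sampling proposal (set when the sampler is built)
dsampler = dynesty.DynamicNestedSampler(loglike, ptform, ndim, slices=10)
dsampler.run_nested(wt_kwargs={'pfrac': 1.0})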
Hope this helps!
Increasing the number of live points from 500 to 1000 doesn't make a difference at all. The covariance is still underestimated.
I ran multiple trials with both options, copying the code from your notebook directly onto my machine, before confirming my reply. I also adjusted the bounds from [-10, 10] to [-5, 5] to check for possible issues with the larger prior box, and experimented with the standard NestedSampler. The basic logic I outlined in the response should still hold, although I'm a bit confused why you're getting different results. :/
Can you confirm you adjusted both "nlive_init" and "nlive_batch" as well as "slices"? The first two are passed to "run_nested", while the latter is set on initialization. If that all looks good, then I'm not sure what's up.
I can confirm that I adjusted both "nlive_init" and "nlive_batch" as well as "slices". In particular, I made the following adjustment to the aforementioned Google Colab notebook:
# loglike, ptform, and ndim are defined earlier in the notebook
dsampler = dynesty.DynamicNestedSampler(loglike, ptform, ndim, slices=10)  # slices: 5 -> 10
dsampler.run_nested(wt_kwargs={'pfrac': 1.0}, nlive_init=1000, nlive_batch=1000)  # live points: 500 -> 1000
dresults = dsampler.results
I also tested this on 3 different machines multiple times. Every time, the covariance is systematically underestimated by 3-4%.
I've also noticed something else that might be related to this (though this could be something really trivial and/or completely unrelated).
Why is the estimated variance in the 200-D Normal example 0.5 instead of 1.0?
I can confirm that I adjusted both "nlive_init" and "nlive_batch" as well as "slices".
Weird. I did find biases at the ~3% level in some cases, but I also found results that were clearly unbiased. I'm not sure why that's happening, but I guess the implication is that there might be systematics at the ~few % level for problems with very dense covariance matrices and strong correlations.
Why is the estimated variance in the 200-D Normal example 0.5 instead of 1.0?
This is actually as intended (the correct answer is 0.5). In that particular problem, to avoid issues associated with sampling a truly enormous prior volume I take the prior to also be standard normal. This gives a final variance of 1/(1+1) = 1/2, which is accurately recovered. Note, however, that this is an entirely illustrative toy problem; in practice, dynesty's efficiency and performance decline rapidly with dimensionality beyond most realistic, moderately-dimensional problems (~10s of parameters).
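For reference, the per-dimension arithmetic behind that 1/(1+1) value (Gaussian precisions add; both the prior and the likelihood here have unit variance):

\sigma_{\mathrm{post}}^{2} = \left( \frac{1}{\sigma_{\mathrm{prior}}^{2}} + \frac{1}{\sigma_{\mathrm{like}}^{2}} \right)^{-1} = \frac{1}{1 + 1} = \frac{1}{2}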
This is very weird. I also tested this using 50 slices and the estimated covariance is still underestimated, which doesn't make any sense.
Another question is whether this bias also appears in lower-dimensional problems (around 10 dimensions) in the presence of non-linear correlations.
Another question is whether this bias also appears in lower-dimensional problems (around 10 dimensions) in the presence of non-linear correlations.
This has been tested extensively by myself and others and found to be pretty good, especially with uniform sampling strategies. So I don't think it should be a concern, but given how weird the results are right now who knows lol.
It seems to me that for highly correlated target distributions, however simple, dynesty does not recover the "true" covariance, with no indication that something is wrong.
You can reproduce this by sampling from a highly correlated 30-dimensional normal, setting the diagonal of the covariance to 1.0 and the off-diagonal elements to 0.95.
dynesty systematically underestimates both the variance and the covariance in that case (i.e. the estimated diagonal elements are 0.92 and the off-diagonal elements are 0.87).
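A minimal sketch of that setup (the Colab notebook has the exact version; the [-10, 10] uniform prior matches the notebook, while the specific numpy implementation and the weighted-covariance check at the end are just my own illustration):

import numpy as np
import dynesty
from dynesty import utils as dyfunc

ndim = 30

# covariance: 1.0 on the diagonal, 0.95 everywhere off the diagonal
C = np.full((ndim, ndim), 0.95)
np.fill_diagonal(C, 1.0)
Cinv = np.linalg.inv(C)
lnorm = -0.5 * (ndim * np.log(2.0 * np.pi) + np.linalg.slogdet(C)[1])

def loglike(x):
    # log-density of the correlated 30-D normal
    return -0.5 * np.dot(x, np.dot(Cinv, x)) + lnorm

def ptform(u):
    # uniform prior on [-10, 10] in every dimension
    return 20.0 * u - 10.0

dsampler = dynesty.DynamicNestedSampler(loglike, ptform, ndim)
dsampler.run_nested(wt_kwargs={'pfrac': 1.0})
res = dsampler.results

# weighted posterior mean and covariance, to compare against C
weights = np.exp(res.logwt - res.logz[-1])
mean, cov = dyfunc.mean_and_cov(res.samples, weights)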