CosmoStat / jax-lensing

A JAX package for weak gravitational lensing
MIT License

Posterior convolution for ODE sampling #51

Open · EiffL opened 2 years ago

EiffL commented 2 years ago

Ok, I've done some math to try to figure out what a convolved posterior might look like.

Let's start from two Gaussians, $f(x) = \mathcal{N}(\mu_f, \sigma_f^2)$ and $g(x) = \mathcal{N}(\mu_g, \sigma_g^2)$. Then the product of these two Gaussians is (up to normalization) a Gaussian with parameters

$$\sigma_{fg}^2 = \frac{\sigma_f^2 \sigma_g^2}{\sigma_f^2 + \sigma_g^2}, \qquad \mu_{fg} = \sigma_{fg}^2 \left( \frac{\mu_f}{\sigma_f^2} + \frac{\mu_g}{\sigma_g^2} \right).$$

Now, let's say we want to convolve this Gaussian with noise of width $\sigma_{tfg}$; this is the noise we want to add to the full posterior. Instead, we can try to add noise to $f$, and see how much noise $\sigma_{tf}$ we should add to $f$ to achieve the desired amount of noise on $fg$. After some math, you find that

$$\sigma_{tf}^2 = \frac{(\sigma_{fg}^2 + \sigma_{tfg}^2)\,\sigma_g^2}{\sigma_g^2 - (\sigma_{fg}^2 + \sigma_{tfg}^2)} - \sigma_f^2.$$

So notice here that this will only work if $\sigma_g$ is large enough compared to the noise we add (the denominator above must stay positive). I tested this in this notebook: https://colab.research.google.com/drive/1-900p399fHa4hnN8bd_OmDg4mlsbrzZZ?usp=sharing

[figure from the notebook: comparison of the three posteriors]

The blue line is the original $fg$, the solid line is the analytically tempered $fg$, and the dashed line is the original $g$ multiplied by the tempered $f$ with the above amount of noise.
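
To make the bookkeeping concrete, here is a quick numerical check of the relations above (a sketch with made-up numbers and my own variable names, not code from the repo or the notebook):

```python
# Toy check of the tempering formula: solve for the noise to add to f so that
# the product (tempered f) * g matches the convolved posterior variance.
sigma_f, sigma_g = 1.0, 2.0  # widths of f and g
sigma_tfg = 0.5              # noise we want to add to the full posterior fg

# Product of the two Gaussians: 1/sigma_fg^2 = 1/sigma_f^2 + 1/sigma_g^2
var_fg = (sigma_f**2 * sigma_g**2) / (sigma_f**2 + sigma_g**2)

# Target variance after convolving fg with N(0, sigma_tfg^2)
var_target = var_fg + sigma_tfg**2
assert sigma_g**2 > var_target  # the positivity constraint noted above

# Solve 1/(sigma_f^2 + var_tf) + 1/sigma_g^2 = 1/var_target for var_tf
var_tf = var_target * sigma_g**2 / (sigma_g**2 - var_target) - sigma_f**2

# Check: the product of the tempered f and the original g has the target variance
var_check = ((sigma_f**2 + var_tf) * sigma_g**2) / (sigma_f**2 + var_tf + sigma_g**2)
print(var_check, var_target)  # the two values should match
```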

So this works, but two annoying things remain:

EiffL commented 2 years ago

Luckily, in our case the mean of the prior is zero at least... that probably makes things a little simpler.
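
Concretely, with $\mu_g = 0$ the product mean above reduces to

$$\mu_{fg} = \sigma_{fg}^2 \, \frac{\mu_f}{\sigma_f^2} = \frac{\sigma_g^2}{\sigma_f^2 + \sigma_g^2} \, \mu_f,$$

so the $\mu_g$ term drops out of the bookkeeping entirely.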

b-remy commented 2 years ago

Here is the demo when adding noise to the two distributions (assuming zero means): https://colab.research.google.com/drive/193mj8kGPNrZTwHwheT_gWkURK6-duTOp?usp=sharing

and the formula I derived: [equation image]
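
Since the equation image above is not legible here, this is one way the zero-mean, two-noise version can be set up numerically (the particular split between the two noises below is my assumption for illustration, not necessarily the derived formula):

```python
# Zero-mean case, adding noise to both f and g. Any split (var_tf, var_tg)
# satisfying 1/(sigma_f^2 + var_tf) + 1/(sigma_g^2 + var_tg) = 1/var_target
# reproduces the target variance; here we fix var_tg and solve for var_tf.
sigma_f, sigma_g = 1.0, 2.0
sigma_tfg = 0.5  # desired noise on the product

var_fg = (sigma_f**2 * sigma_g**2) / (sigma_f**2 + sigma_g**2)
var_target = var_fg + sigma_tfg**2

var_tg = 0.3  # assumed split; any value keeping var_tf >= 0 works
var_tf = var_target * (sigma_g**2 + var_tg) / (sigma_g**2 + var_tg - var_target) - sigma_f**2

# With zero means, the product of the two tempered Gaussians has the target variance
num = (sigma_f**2 + var_tf) * (sigma_g**2 + var_tg)
den = (sigma_f**2 + var_tf) + (sigma_g**2 + var_tg)
print(num / den, var_target)  # the two values should match
```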

EiffL commented 2 years ago

Sooooo, I've done some additional experimenting, and made a few observations:

  1. Averaging samples from the HMC posterior stopped before zero temperature doesn't converge to the correct posterior mean. Here is an example with `min_temp=0.03`: [figure]

  2. If we use HMC sampling with only the prior term, it looks like taking the mean of the samples behaves correctly and converges to zero. [figure]

=> This would mean we need the HMC sampling to go all the way to zero temperature, because if we stop before that, we are not actually sampling the expected convolved posterior.
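
A toy 1-D illustration of observation 1 (my own construction, not the repo's sampler): because the tempering noise sits on $f$ while the prior $g$ stays zero-mean, the tempered product mean is pulled toward zero, and it only matches the true posterior mean in the zero-noise limit.

```python
# Mean of the tempered product N(mu_f, sigma_f^2 + var_tf) * N(0, sigma_g^2)
# versus the true posterior mean, as the tempering noise var_tf shrinks.
mu_f, sigma_f = 1.0, 1.0  # "likelihood" factor f
sigma_g = 2.0             # zero-mean prior g

mu_post = mu_f * sigma_g**2 / (sigma_f**2 + sigma_g**2)  # true posterior mean

for var_tf in [1.0, 0.1, 0.03, 1e-4, 0.0]:
    mu_temp = mu_f * sigma_g**2 / (sigma_f**2 + var_tf + sigma_g**2)
    print(var_tf, mu_temp, mu_post)
# mu_temp -> mu_post only as var_tf -> 0, consistent with the observation above.
```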

EiffL commented 2 years ago

Small question @b-remy: what is the smallest temperature we can reach with the full non-Gaussian prior?

b-remy commented 2 years ago

It seems that we can reach any temperature we want at the end of the annealing; at least the annealing trace returns the input min_temp (e.g. 1e-4, 1e-5, or 1e-6) if we let the chain run for a sufficiently long time.

However, reaching a very low temperature does not seem to be enough to perfectly recover the Wiener estimate. For instance, here are the results where, in the first experiment, we reach a temperature of 1e-4, and in the second, 1e-6. Both posterior means agree while not perfectly matching the Wiener estimate.

[figures: posterior means from the two experiments, each computed from 50 samples]

@EiffL, you may notice that the power spectrum is much closer to what we observed last week, which is due to:
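
For reference, the Wiener estimate being compared against here can be written per Fourier mode as follows (a generic sketch assuming a diagonal signal power spectrum `P_k` and noise power `N_k` on a periodic map; the names are mine, and this is not the repo's implementation):

```python
import jax.numpy as jnp

def wiener_filter(data_map, P_k, N_k):
    """Apply x_hat(k) = P(k) / (P(k) + N(k)) * d(k), mode by mode."""
    d_k = jnp.fft.fft2(data_map)
    x_k = d_k * P_k / (P_k + N_k)
    return jnp.fft.ifft2(x_k).real
```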

EiffL commented 2 years ago

So, what is the smallest temperature we can reach with the full non-Gaussian prior?

b-remy commented 2 years ago

The smallest temperature I managed to reach when fine-tuning one chain is 0.007. I tried the same parameters with a batch size of 10 and also got pretty good samples, both visually and in terms of their power spectra. I don't have the plot yet because the notebook is still running, but the average map turns out to reach better metrics: rmse = 2.24 × 10⁻² and r = 0.65.

I will update here with the results for 20 samples.
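
For clarity, here are the two metrics quoted above as I understand them (assumed definitions, following the usual conventions):

```python
import jax.numpy as jnp

def rmse(x, x_true):
    """Root-mean-square error between the reconstructed and true maps."""
    return jnp.sqrt(jnp.mean((x - x_true) ** 2))

def pearson_r(x, x_true):
    """Pearson correlation coefficient between the two maps."""
    a, b = x.ravel(), x_true.ravel()
    a = a - a.mean()
    b = b - b.mean()
    return jnp.sum(a * b) / jnp.sqrt(jnp.sum(a**2) * jnp.sum(b**2))
```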

b-remy commented 2 years ago

Well, same metrics with 20 samples; here are the plots:

[three result screenshots]

EiffL commented 2 years ago

That looks really, really good, no? :-)

b-remy commented 2 years ago

Yes, pretty good :-), but as seen in both the Gaussian and full-prior examples, we still cannot "perfectly" sample the whole posterior.

Maybe we will want to smooth the very small scales as you suggested, assuming that the SNR there is way too low to contain any information. That's what I'm thinking about currently.
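
One possible implementation of that small-scale smoothing is a Gaussian cut in Fourier space (a sketch; the smoothing scale `sigma_pix` is a free parameter, and none of these names come from the repo):

```python
import jax.numpy as jnp

def smooth_small_scales(kappa_map, sigma_pix):
    """Suppress modes above ~1/sigma_pix with a Gaussian kernel in Fourier space."""
    ny, nx = kappa_map.shape
    kx = jnp.fft.fftfreq(nx)  # frequencies in cycles per pixel
    ky = jnp.fft.fftfreq(ny)
    k2 = kx[None, :] ** 2 + ky[:, None] ** 2
    # Fourier transform of a Gaussian of width sigma_pix (in pixels)
    kernel = jnp.exp(-2.0 * (jnp.pi * sigma_pix) ** 2 * k2)
    return jnp.fft.ifft2(jnp.fft.fft2(kappa_map) * kernel).real
```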

EiffL commented 2 years ago

OK cool, then we don't necessarily need to do anything. We can just use that sampling strategy, apply it in the Gaussian case, and state that we can recover the expected posterior mean down to some given scale.