CosmoStat / autometacal

Metacalibration and shape measurement by automatic differentiation
MIT License

Revisiting the noise response #20

Open EiffL opened 3 years ago

EiffL commented 3 years ago

I'm opening this issue to document our exploration of the noise contribution to the Jacobian.

So, the issue is this, we know from previous metacal papers that the presence of noise in the image can introduce not only a scatter, but more annoyingly a bias in the computation of the Jacobian. This bias is a function of the SNR of the object, and also depends on the exact shape measurement method used.

The standard practice is to add an extra noise component in the metacal image, which will receive a negative shear response, in the hope of compensating for the introduced bias. This comes at the price of higher noise in the metacal image, and is also not guaranteed (as far as I know) to completely cancel the noise response.

So, let's think about this... We are interested in the quantity de/dg(I + n) = de/dI(I + n) dI/dg(I + n).

The first term is just how the shear measurement responds to the input noisy image; no shear is introduced on the noise at that stage. The second term is how the noisy image responds to an introduced shear. That second term is linear in the image, so you can split it into the true-image and noise components.
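
To make the two terms concrete, here is a minimal JAX sketch (not the autometacal implementation): `shear_image` is a crude resampling stand-in for the real metacal shearing operation, and `measure_e` is a fixed-weight moments estimator. The only point is that the full response de/dg splits exactly into the de/dI and dI/dg terms via the chain rule.

```python
import jax
import jax.numpy as jnp
from jax.scipy.ndimage import map_coordinates

N = 33
c = (N - 1) / 2.
x, y = jnp.meshgrid(jnp.arange(N) - c, jnp.arange(N) - c, indexing='ij')

def shear_image(image, g):
    """Resample `image` under a small shear g = (g1, g2).

    Crude stand-in for the metacal operation; the important property
    here is that it is linear in `image` for a fixed g."""
    g1, g2 = g
    ainv = jnp.linalg.inv(jnp.array([[1. + g1, g2], [g2, 1. - g1]]))
    coords = jnp.einsum('ij,jab->iab', ainv, jnp.stack([x, y])) + c
    return map_coordinates(image, [coords[0], coords[1]], order=1)

def measure_e(image):
    """Fixed-weight second-moments ellipticity (e1, e2)."""
    w = jnp.exp(-(x**2 + y**2) / (2. * 3.**2))   # fixed Gaussian weight
    norm = jnp.sum(w * image)
    qxx = jnp.sum(w * image * x**2) / norm
    qyy = jnp.sum(w * image * y**2) / norm
    qxy = jnp.sum(w * image * x * y) / norm
    t = qxx + qyy
    return jnp.array([(qxx - qyy) / t, 2. * qxy / t])

# Toy elliptical Gaussian galaxy plus noise.
gal = jnp.exp(-(x**2 / 8. + y**2 / 12.))
noise = 0.01 * jax.random.normal(jax.random.PRNGKey(0), (N, N))
img = gal + noise
g0 = jnp.zeros(2)

# Full response de/dg of the noisy image, computed in one go...
R = jax.jacfwd(lambda g: measure_e(shear_image(img, g)))(g0)

# ...and the same quantity split into the two chain-rule terms.
de_dI = jax.jacrev(measure_e)(shear_image(img, g0))   # first term, shape (2, N, N)
dI_dg = jax.jacfwd(shear_image, argnums=1)(img, g0)   # second term, shape (N, N, 2)
R_split = jnp.einsum('eij,ijg->eg', de_dI, dI_dg)

# The two computations agree: same Jacobian, just assembled differently.
print(jnp.allclose(R, R_split, atol=1e-4))  # True
```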

Here are a few illustrations. First, the second term: the response of the image to the addition of a shear. Because that part is linear, I can separate what happens to the noise component and to the galaxy image: [three image panels] These images are for a high signal-to-noise galaxy; from top to bottom: noise, galaxy, galaxy + noise. The columns are the metacal image, dI/dg1, and dI/dg2.

So we see that indeed the noise has a non-zero contribution in this part of the derivative.
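
Continuing the sketch above (same toy `shear_image`, `gal`, `noise`, `img` and `g0`), the linearity of that term can be checked directly:

```python
# dI/dg computed separately for the galaxy and the noise field...
dIdg_gal = jax.jacfwd(shear_image, argnums=1)(gal, g0)
dIdg_noise = jax.jacfwd(shear_image, argnums=1)(noise, g0)

# ...adds up, pixel by pixel, to the response of the noisy image,
# which is why the noise contribution can be isolated at all.
dIdg_total = jax.jacfwd(shear_image, argnums=1)(img, g0)
print(jnp.allclose(dIdg_total, dIdg_gal + dIdg_noise, atol=1e-4))  # True
```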

And now we can also look at what de/dI looks like. Here I'm using a simple moments method applied to the noisy metacal image: [image] Left is for e1, right is for e2. The interesting thing to notice here is that the noise affects this response a little but is essentially averaged out, so this term is pretty smooth.

Visually we can kind of understand what might happen: the noise contribution in dI/dg might not average out when contracted with the de/dI pattern, because the noise response dI/dg(n) is not isotropic.

EiffL commented 3 years ago

The next thing we can look at is the residuals in R between a noiseless image and images with many different noise realisations. Here are the residuals for a given fixed galaxy image: [two figures]

We see a significant scatter in the response induced by the noise, but a fairly small bias. In this instance the bias on R00 is 0.0018 and on R11 is 0.0022, while the fiducial values are R00 = 0.351 and R11 = 0.316, so of the order of 1%. But this is very image-specific.
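
The experiment behind these residuals can be sketched with the same toy `shear_image` / `measure_e` from the first comment: measure R once on the noiseless galaxy, then on many noisy versions, and look at the residual distribution (the numbers quoted above are of course from the real measurement, not this toy).

```python
# Reusing shear_image, measure_e, gal and g0 from the earlier sketch.
def response(image):
    return jax.jacfwd(lambda g: measure_e(shear_image(image, g)))(g0)

R_true = response(gal)                      # response of the noiseless image

def residual(key):
    n = 0.01 * jax.random.normal(key, gal.shape)
    return response(gal + n) - R_true       # noise-induced residual on R

keys = jax.random.split(jax.random.PRNGKey(1), 500)
res = jax.vmap(residual)(keys)              # shape (500, 2, 2)

print(res.mean(axis=0))                     # noise-induced bias on the response
print(res.std(axis=0))                      # noise-induced scatter
```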

EiffL commented 3 years ago

Now, here is where things might start to get fun :-) To debias the gradients, you don't necessarily have to degrade the entire image in the first place. We can look at each term of the gradient, i.e. de/dI and dI/dg, and I would wager that it's dI/dg(n) that causes problems. So we can try to correct the gradients at this level alone, which might introduce more variance in the Jacobian but remove this bias, without adding additional noise to the image itself!
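
A hedged sketch of one literal reading of this idea, using the toy functions from the first comment and assuming the noise realisation n is known (true in simulations; in data it would have to be estimated or probed somehow): subtract the shear response of the noise from the dI/dg term before contracting, leaving the measured image untouched.

```python
# Reusing shear_image, measure_e, gal and g0 from the earlier sketch.
n = 0.01 * jax.random.normal(jax.random.PRNGKey(2), gal.shape)
img = gal + n

de_dI = jax.jacrev(measure_e)(shear_image(img, g0))        # shape (2, N, N)

# Shear response of the noisy image and of the (assumed known) noise field.
dIdg_img = jax.jacfwd(shear_image, argnums=1)(img, g0)     # shape (N, N, 2)
dIdg_noise = jax.jacfwd(shear_image, argnums=1)(n, g0)

# Correct the gradient term alone, then contract; no extra noise is added
# to the image that is actually measured.
R_corrected = jnp.einsum('eij,ijg->eg', de_dI, dIdg_img - dIdg_noise)
```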

esheldon commented 3 years ago

Nice plots @EiffL !

I did try some alternative methods that would add less noise. See Appendix A in Sheldon & Huff for a description of two.

In addition to those methods I also tried a method using deeper data. This was for metacal, not metadetect.

  1. Take deep fields and add noise to them so there are two versions: a noisy one that matches some shallow data, and a less noisy one.
  2. Run detection on the noisy versions so that detection and deblending are like in the shallow data.
  3. Using those detections, measure responses for both the noisy and low-noise versions.
  4. Compute a correction to the responses by comparing the noisy and low-noise measurements.

I don't have numbers handy, but I recall it worked fairly well. I put off writing this up because I didn't want to introduce a dependence on deep fields if not necessary, and because there are a lot of issues involved in making it work. For example, one would also need to generate versions of the deep fields with seeing tuned to the seeing of specific shallow data, etc.
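
A toy version of that procedure, reusing the functions from the first comment; steps 1-2 (detection and deblending on the noisy versions) are skipped here, and the additive form of the correction is an assumption rather than the actual recipe.

```python
# Reusing shear_image, measure_e, gal and g0 from the earlier sketch.
def response(image):
    return jax.jacfwd(lambda g: measure_e(shear_image(image, g)))(g0)

k1, k2 = jax.random.split(jax.random.PRNGKey(3))
deep_lownoise = gal + 0.002 * jax.random.normal(k1, gal.shape)        # deep-field noise level
deep_noisy = deep_lownoise + 0.01 * jax.random.normal(k2, gal.shape)  # degraded to shallow depth

# Step 3: responses measured on both versions of the same object.
R_noisy = response(deep_noisy)
R_lownoise = response(deep_lownoise)

# Step 4: correction inferred from the deep fields, to be applied to
# responses measured on the shallow survey data.
R_correction = R_lownoise - R_noisy
```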

EiffL commented 3 years ago

Thanks @esheldon! I experimented some more and double-checked the appendix in your paper.

So, as a reminder, I'm considering the response on a noisy image like so: de/dg(I + n) = de/dI(I + n) [ dI/dg(I) + dI/dg(n) ], and assuming biases arise because we have a non-zero dI/dg(n) (a non-zero response Jacobian of the noise image with respect to shear). So the question is how to evaluate the bias term de/dI(I + n) dI/dg(n).

I think I've seen two interesting effects.

  1. I was naively expecting <de/dI(I+n1) dI/dg(n2)> = <de/dI(I+n1) dI/dg(n1)>, i.e. that this part of the Jacobian didn't depend on having the same noise realisation in the derivative of the shear measurement and in the derivative of the noise image. If that had been the case, we could have estimated this term by running an extra noise simulation just to get dI/dg(n2), and used it to cancel the term de/dI(I+n1) dI/dg(n1). Instead, my small tests seem to indicate that <de/dI(I+n1) dI/dg(n2)> = 0 if n1 != n2.

  2. If it is indeed the case that <de/dI(I+n1) dI/dg(n2)> = 0, it would mean that extra noise added to the image does not correlate against another noise component. So we can add a noise n2 with a small sigma_2 (it doesn't need to match sigma_1) and compute <de/dI(I+n1+n2) dI/d(-g)(n2)>. We know this term should scale with the noise level, so if we rescale it by sigma_1/sigma_2, i.e. compute (sigma_1/sigma_2) <de/dI(I+n1+n2) dI/d(-g)(n2)>, we should be very close to estimating the compensation factor for n1. I think this is pretty close to the A2.2 strategy in Sheldon & Huff, but maybe with an explicit computation of most of the A factor. If it holds, essentially this would be a way to do the "standard" noise correction while only adding a fraction of the image noise instead of exactly the same amount. So we could probably get away with a very small increment in the overall noise level in the image.

I only got time to do a quick test yesterday, but here are some illustrations. What I'm doing here is first estimating the bias correction term that comes from the addition of noise at the same level as the image noise, i.e. de/dI(I+n1+n2) dI/dg(n2). Then I do the same but only add noise at 0.1x the input noise standard deviation, compute the same term de/dI(I+n1+n2) dI/dg(n2), and multiply it by 10. For a few noise realisations on a single image, this is what I get: [figure] where the x axis is the traditional bias term coming from n2 at the same noise level as the input image, and the y axis is the rescaled bias term coming from introducing a smaller noise level in the image.

Still pretty dodgy results... but it might indicate that we don't have to add noise with the same variance as the input image. Because we control this new noise term, we can use it to probe the noise response, and then rescale that response to cancel that of the original noise in the image... Again, not sure about this, I'll do more tests, just wanted to document.
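
For reference, a sketch of that rescaled test with the toy functions from earlier: sigma1 is the image noise level, sigma2 the small probe noise, and the (sigma1/sigma2) rescaling is exactly the step whose validity is being tested above.

```python
# Reusing shear_image, measure_e, gal and g0 from the earlier sketch.
sigma1, sigma2 = 0.01, 0.001                     # probe noise is 0.1x the image noise
k1, k2 = jax.random.split(jax.random.PRNGKey(4))
n1 = sigma1 * jax.random.normal(k1, gal.shape)   # noise already in the image
n2 = sigma2 * jax.random.normal(k2, gal.shape)   # small, controlled probe noise

img = gal + n1 + n2                              # the probe noise is added to the image

de_dI = jax.jacrev(measure_e)(shear_image(img, g0))

# Response of the probe noise to a negative shear, as in the standard correction.
dIdg_n2 = jax.jacfwd(lambda g: shear_image(n2, -g))(g0)

# Bias term probed with n2, rescaled up to the level of the original noise n1.
bias_estimate = (sigma1 / sigma2) * jnp.einsum('eij,ijg->eg', de_dI, dIdg_n2)
```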

beckermr commented 3 years ago

@aguinot just mentioned this thread to me. I have a student working on metadetect with deep fields as a summer project. I will report back as he makes progress.