Investigate correlated noise from metacal and strategies to whiten it

rmjarvis commented 8 years ago

The naive way to write metacal and whiten the noise doesn't work.

Erin found a strategy that does, but when I adapt it to running on a non-square wcs, it still doesn't work if the wcs has any shear component.

This issue is for investigating how to do this correctly. Possibly involving edits to GalSim code. Or possibly not.

rmjarvis commented 8 years ago

@esheldon @rmandelb Here is a view that shows the code I've written to test several different strategies to whiten the noise induced by metacal.

https://github.com/GalSim-developers/GalSim/compare/master...%23720

So far, the only real successful strategy is Strategy #3, which is the one Erin mentioned in his talk. However, so far, I haven't gotten that strategy working when the WCS has a shear component. I did get it to work if the WCS has a rotation though.

Also, it's written as a unit test, but I've turned off the asserts for the moment, so strategies 1 and 2 just report as failures, rather than aborting.

I'm keep working on trying to figure out how to get strategy 3 to work with arbitrary wcs. It seems like that's the first order of business, since it's what we need to run metacal on real data. But Rachel, if you have any ideas for how to make strategies 1 or 2 working, please feel free to chime in.

rmandelb commented 8 years ago

Is that a bug-fix in wcs.py? (the g2 to g1 replacement) If so, should we cherry-pick that elsewhere, so as to get it in place before this issue is resolved?

rmjarvis commented 8 years ago

It's an error in the doc. Not a bug.

rmandelb commented 8 years ago

Oy. Right.

rmandelb commented 8 years ago

I just looked through and have questions based on the comments:

1) Strategy 1 - I'm surprised you're finding symmetrizing works even with a wrong description of the correlation fcn. But I'm also surprised that this entire thing is necessary, because I would think that imposing symmetry on the noise (not whitening) is all that's needed for MetaCal to work. Do you understand this? Yesterday I was relieved because I thought you meant neither symmetrization nor whitening was working, but now I am confused again. For multiple reasons. :)

2) Also strategy 1 - can we use gsparams to force a higher maxk and see if that fixes the issue? It would be nice if the fix was as easy as that... but I'm guessing you may have already tried this?

rmjarvis commented 8 years ago

I think it's probably a red herring that symmetrize works to the precision I'm testing. It doesn't work for non-square pixels. So it may just be a spurious result for some of the choices I made that symmetrize happened to pass the test.

rmjarvis commented 8 years ago

can we use gsparams to force a higher maxk and see if that fixes the issue? It would be nice if the fix was as easy as that... but I'm guessing you may have already tried this?

No. The problem is slightly different than I described. The problem is that for a convolution, we take the maxk to be the minimum of any of the component maxk values, the idea being that if it's effectively zero above that k for one of them, then it is also zero above that k for the product of all the items.

However, in the metacal case, above some k, the PSF is effectively zero. But Deconvolve(psf) is effectively infinity. So the product is near unity. So the real maxk to use is from the rest of the items aside from these two. Namely the "profile" of the noise correlation function, which is an autoconvolution of a pixel. That has a much higher maxk, so the value comes out badly wrong when you use the maxk of the PSF.

rmandelb commented 8 years ago

Ah. So if we want to handle this internally, we would need to have a way to tell it to basically choose maxk differently. Do you think this is worth pursuing? Do you plan to try it or should I play around with it sometime, or did you already try it and it's hopeless? :)

rmjarvis commented 8 years ago

Well, I don't think it's completely hopeless, but here's why it's hard: The normal evaluation of the autocorrelation of a Pixel uses the real-space convolution, since there are hard edges. This is really fast, since a Box convolved with a Box is about the easiest thing for the real-space convolution to do.

And really really hard to do in Fourier space. You need to go to really high k.

Adding on the psf convolution and deconvolution and the shear doesn't really change how high k you need to go to. But it does mean that the real-space convolution option is out the window. So it's hard to get this right even if you could easily force the value of maxk to be something appropriate.

So I feel like to make progress on that we might need some new clever insight that I haven't thought of. Maybe some kind of first-order perturbation to the known result of the box autocorrelation. It should be close to that, so maybe the delta from that answer is something that doesn't require high k? I haven't really tried to work anything out in this direction though.

esheldon commented 8 years ago

I think the pixel is not being treated correctly

psf = galsim.Gaussian(fwhm=0.9).shear(g1=0.05, g2=0.03)
psf_target = psf.dilate(1. + 2.*dg)

the psf is never pixelized, yet it is used to deconvolve the image. Here is a condensed version

obs_image = galsim.Image(im_size,im_size, init_value=0, wcs=wcs)
obs_image.addNoise(galsim.GaussianNoise(rng=rng, sigma=math.sqrt(noise_var)))
ii = galsim.InterpolatedImage(obs_image, pad_factor=1)
sheared_obj = galsim.Convolve(ii, galsim.Deconvolve(psf)).shear(shear)
final_obj = galsim.Convolve(psf_target, sheared_obj)
final_image = final_obj.drawImage(obs_image.copy(), method='no_pixel')

I think this means the pixel is getting sheared, which isn't what happens the way Eric and I have been doing metacal. We would do the following simulation and metacal steps

pixel     = wcs.toWorld(galsim.Pixel(scale=1))
pixel_inv = galsim.Deconvolve(pixel)

psfobj    = galsim.Gaussian(fwhm=0.9).shear(g1=0.05, g2=0.03)
psf_image = psfobj.drawImage(...., wcs=wcs)

psf_ii       = galsim.InterpolatedImage(psf_image)
psf_ii_nopix = galsim.Convolve(psf_ii, pixel_inv)

# dilate the psf and put the pixel back
psf_target_obj = psf_ii_nopix.dilate(1.0 + 2.0*dg)
psf_target     = galsim.Convolve(psf_target_obj, pixel)

# deconvolve the pixelized psf and shear
sheared_obj    = galsim.Convolve(ii, galsim.Deconvolve(psf_ii)).shear(shear)

# galaxy gets convolved by the pixelized target psf
final_obj   = galsim.Convolve(psf_target, sheared_obj)
final_image = final_obj.drawImage(obs_image.copy(), method='no_pixel')

esheldon commented 8 years ago

It would be nice to see another feature here in the test. This is when we shear the psf, but not the galaxy, to get a correction for the psf anisotropy.

It would be nice to see scenario 4 work for this case as well.

I can send pull requests for this sort of thing if desired.

rmjarvis commented 8 years ago

I think what I have there now is not wrong for the purpose of testing the correlated noise, in the sense that ideally the code there should also work for making the final noise uncorrelated (but doesn't except for strategy 4)

But I accept the critique that this isn't actually what metacal does. We could add another test function that works with a pixelated version of the PSF (rather than a pristine one) and handles the pixels the way you do in the real metacal algorithm. It would indeed be useful to check that strategy 4 still works in that case.

Likewise with a dimage/dpsf test.

If you want, would you like to be added as a GalSim collaborator, so you could just push to this branch? That's probably easier than pull requests if you're willing to work on this with me. :)

esheldon commented 8 years ago

OK, I accept

rmjarvis commented 8 years ago

Great! Invite sent.

rmjarvis commented 8 years ago

Giving an update here, since this might be of more general interest. (Erin and I have been discussing this offline.)

@esheldon discovered a problem when he converted the test script to match what he really does in metacal, where he computes a pixel-free PSF, dilates that for the target PSF, and then convolves by the real pixel again.

When we run this version with a fairly complex WCS, the noise in the image blows up. Like variance = 1.e6, rather than <1 in the previous version. It only happened if the WCS is complex enough. For things with ~squarish pixels, there was no problem. But above a certain threshold, the noise would suddenly explode.

I've tracked down the problem to small values being inverted by SBDeconvolve. The values were small, but not precisely zero, and then because the target PSF gets expanded by (1+2g), the small kvalues in the original didn't precisely match the places where the target PSF is near zero. So it doesn't get nulled out in the product.

My solution (993044f1eb13d577e27d6f48fed23b5059ed8387), which seems to work is to have SBDeconvolve calculate a minimum value to trust based on the GSParams kvalue_accuracy (= 1.e-5 by default). min_acc_kval = kvalue_accuracy * flux_adaptee Then anything closer to zero than this doesn't go to 1/kval, but rather goes to 1/min_acc_kval.

I think this is valid, since these k-values are actually uncertain at the level of min_acc_kval, so we don't actually know it to better than this. It seems appropriate to use the minimum allowed inverse rather than let it produce values of ~1.e9 that you can get by just doing 1/kval.

esheldon commented 8 years ago

Great, thanks for this. I'll give this a run on a simulation with static, complex wcs.

rmandelb commented 8 years ago

Thanks for the update, Mike. I have been wondering! Your solution makes sense to me.

rmjarvis commented 8 years ago

Oh, this doesn't fix the problem with whiten and symmetrize not working. That's still an issue. It just fixes a case where the "do the same thing to a noise field, rotate it 90 degrees, and add it to the image" strategy didn't work.

rmjarvis commented 8 years ago

Actually, whiten technically works, but only because it adds so much noise that the asymmetry in the correlations are small relative to the dc term. The symmetrize tests still fail when the wcs and psf are not symmetric.

I think I understand why strategy 1 doesn't work. This is where the noise model is kept track of automatically. The estimate of the noise correlation function gets too small a maxk from the convolution by the PSF. I don't have an easy fix to this, but at least I think I understand why it fails.

For strategy 2, I compute the correlation function from a noise field directly and then use that to symmetrize. I don't yet understand why this one fails. That might be my next task on this issue. I think we should be able to make that one work.

However, I think strategy 4 will still be preferred in practice, since it actually adds less noise than either strategy 1 or 2. And it's faster than strategy 2. So unless we can get strategy 1 working (which is the fastest method), there doesn't seem to be any reason to switch.

esheldon commented 8 years ago

With the new code I'm still seeing bias in my sim, that I do not see when the wcs is simple. My wcs is

dudx:  0.12
dudy:  1.10
dvdx: -0.915
dvdy: -0.04

I put this into test_metacal.py, rescaling by 0.26 so the overall scale is similar to yours.

With the default PSF size the test still passes. But I had smaller psfs (and larger) in my sim so I tried varying that.

If I set the fwhm=0.79 the test fails. So it seems there might be some additional instability for smaller psfs.

esheldon commented 8 years ago

I tried with fwhm=0.8 and it did not fail, so below 0.79 (~3.04 pixels) seems to be a problem for this wcs

rmjarvis commented 8 years ago

Probably because the PSF is no longer Nyquist sampled when fwhm = 0.7. So an InterpolatedImage is not going to be able to give you an accurate approximation of the true PSF. I don't think this points to a bug in GalSim anymore. The noise is larger in that case, but not blowing up the way it used to. So I think this is probably just a limitation of the metacal method on undersampled PSFs.

esheldon commented 8 years ago

It was 0.79 not 0.7 (I had a typo in my original comment, fixed it now).

What you say sounds plausible. There might still be room for improvement though. If I use square pixels at the same scale, the test "passes" even for fwhm=0.70

esheldon commented 8 years ago

Another anecdote: as noted, I've run the same simulation with square pixels and it had no bias.

rmjarvis commented 8 years ago

Switch from the definite seed to seed = 0 and run it 20 times. You'll see that sometimes is passes and sometimes (more often) it doesn't. And when it doesn't, the sense of the anisotropy keeps changing. So this is just a noisy test. Not a systematic error.

When I did this (with fwhm=0.70), I got 9 failures with plus pattern diff relative to dc = + + - - 7 failures with plus pattern diff relative to dc = - - + + 4 passes

Interestingly, the cross pattern always passed.

esheldon commented 8 years ago

The off-diagonal terms are 10% so if square passes at 0.70, we might expect the non-square to fail at 0.8 ish (not nyquist along one dimension)

esheldon commented 8 years ago

I did a sim using real wcs from DES, including actual variations seen in the data. I don't see any obvious problems.

rmjarvis commented 8 years ago

Great. :) I'm still investigating the other strategies, since I feel like they should work, and the fact that they don't I think is a bug still in GalSim. But at least we have one strategy that seems to work well if the pixels aren't too distorted.

rmjarvis commented 8 years ago

@esheldon I just pushed a bit of code that might work as a more general prescription for how to pick a good target PSF. It looks for where the existing (sheared) PSF gets small, and then makes sure the target PSF is even smaller there. It works for the extreme WCS and small PSF test that you had put in.

I haven't experimented too much with what would be good choices of the different parameters (small_kval, smaller_kval), so there might be some research to do to make this more reliable. I also had it make a round target, rather than keeping the same ellipticity. Not sure whether this is better or worse in practice.

esheldon commented 8 years ago

Not sure if this is related to this branch or not, but I'm getting errors like this running on some real data

python: src/SBInterpolatedImage.cpp:569: void galsim::SBInterpolatedImage::SBInterpolatedImageImpl::calculateStepK(double) const: Assertion `std::abs(flux - fluxTot) < 1.e-3 * std::abs(fluxTot)' failed.

and this results in an abort rather than an exception being raised, so the program crashes outright, which is a bit problematic.

rmjarvis commented 8 years ago

Can you provide a concise example that causes this to happen? This indicates a bug in how the flux is being calculated somewhere, since this shouldn't ever fail.

esheldon commented 8 years ago

Its buried pretty deep, it will take time to find a concise example. I never saw this in the sims I was running, but I'll try. The short answer is this happens in the metacal steps drawing the sheared reconvolved image.

esheldon commented 8 years ago

It turns out the image was entirely zeros.

I add a noise image to cancel the correlated noise. In these cases the weight map was entirely zero, so no noise was added, so it was an empty image.

I am now throwing out these images, not measuring them. But should the interpolator handle this gracefully?

rmjarvis commented 8 years ago

I made two changes in response to this. 439b55f

I changed the < to <= which should work even the total flux is zero.
I also raise a RuntimeError if the total flux in an image is exactly zero, since I think that will usually be a sign of some kind of user error.

rmjarvis commented 6 years ago

Closing this, since we do have an effective strategy for the original problem. We still don't completely understand why some of the other methods don't work as well as expected, but I don't think there is any pressing need to figure this out.

GalSim-developers / GalSim

Investigate correlated noise from metacal and strategies to whiten it #720