Closed by segasai 1 month ago
Thanks. I confirm that the original code is a bug, i.e. we agree on the math of what should be done and that the current code does not do that. Since this implementation is a little different from the one in #2372, I'd like to study it a bit more and run some test tiles to check for corner cases (e.g. entirely masked spectra).
Ideally I'd like to also add some unit tests that would have caught the original bug, e.g. that a "coaddition" of a single resolution matrix should be the same as its input, and verifying what we expect the impact to be of a masked pixel on 1 out of N exposures, and the same pixel masked on N out of N exposures.
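The first of those identities is easy to pin down outside the desispec API. As a plain-numpy sketch (array shapes are illustrative assumptions, not the desispec data model): the ivar-weighted coadd of a single resolution matrix must return that matrix unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
ndiag, npix = 5, 20
# one exposure's resolution data (banded storage: ndiag x npix) and
# its per-pixel ivar weights -- shapes are assumptions for illustration
R = rng.uniform(size=(1, ndiag, npix))
w = rng.uniform(0.1, 1.0, size=(1, npix))

# ivar-weighted coadd of the resolution rows: sum_i w_i R_i / sum_i w_i
Rcoadd = np.sum(w[:, None, :] * R, axis=0) / np.sum(w, axis=0)

# with a single input spectrum this must reduce to the input itself
```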
In issue #2372 you documented the impact on chi2; I'd also like to study the impact on final redshifts to decide what to do about Kibo (e.g. a coadds+redshifts-only supplemental run). But that doesn't have to be a blocking factor for wrapping this up.
Thanks for looking into that.
As far as I tested, this implementation should give the same result as the previous one; I just felt it was a bit clearer. But for sure it's worth double-checking. I agree this bug is also worth a regression test. The simplest one is indeed one masked pixel in one exposure.
FWIW these are my runs https://data.desi.lbl.gov/desi/users/koposov/coadd_test/output/ on random healpix in kibo using this patch, so the redshifts can be directly compared to kibo. (This used an earlier version of the patch, but AFAICS it should be exactly equal to what was there before.)
I've added a test that trips the current desispec but passes after the PR (coaddition of one spectrum with one masked pixel).
I've added a final (IMO) test that verifies that if one has a 'true' spectrum M, observations with 'random' resolution matrices R_i, random covariance matrices, and extracted spectra S_i (obeying S_i = R_i M), then after running coadd the relationship S_coadded = R_coadded M still holds exactly. This test also obviously fails with existing desispec. With this test I am actually very confident that the new code is correct.
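The relationship being tested can be written down in a few lines of plain numpy (dense resolution matrices for illustration only; the real ones are banded, and these variable names are my own): per-pixel ivar weighting of fluxes, with the same weights applied to the rows of each R_i, preserves S = R M exactly.

```python
import numpy as np

rng = np.random.default_rng(2)
nspec, npix = 4, 30
M = rng.uniform(size=npix)                       # 'true' spectrum
R = rng.uniform(size=(nspec, npix, npix))        # per-exposure resolution matrices (dense for illustration)
S = np.einsum('ijk,k->ij', R, M)                 # noiseless extractions S_i = R_i M
w = rng.uniform(0.1, 1.0, size=(nspec, npix))    # per-pixel ivar weights

# ivar-weighted flux coadd, and the same weights applied to each ROW of R_i
Scoadd = np.sum(w * S, axis=0) / np.sum(w, axis=0)
Rcoadd = np.sum(w[:, :, None] * R, axis=0) / np.sum(w, axis=0)[:, None]

# the identity S_coadd = R_coadd M then holds exactly
```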
Thanks, this all looks good. In the spirit of not making a major change just before the weekend, I will wait until Monday to merge this, but I don't see any problems. In the meantime, two more tests to request to really drive this home:
- `test_coadd_single_masked` with mpix=0

I'm not anticipating problems with those.
Let's discuss at the Tuesday data telecon about implications for Kibo. I'm not worried from a quick spot check, but I'd like to quantify that and study some rrdetails chi2 vs. z curves.
I've added the tests. Actually one of the tests did uncover an issue with the code. See b2f71bb29bb2c1d6833af89c19ece972314c4294
While adding those, I thought I'd also add a test of cosmic-ray masking, but an intuitive test that I wrote currently fails completely. Here I put a large cosmic in the i-th pixel of the i-th spectrum, and I was expecting that after stacking I would get the same result as if I had one fewer spectrum (with all those cosmic pixels removed); but this fails, and the cosmic masking fails to identify the cosmics. I don't quite understand the logic of the cosmic masking in coadd(), so I don't know if I'm missing something obvious or not. The cosmic test is below:
```python
def test_coadd_cosmic(self):
    """
    Test coadd with cosmic rays
    """
    nspec, nwave = 1000, 30
    s1 = self._random_spectra(nspec, nwave)
    rng = np.random.default_rng(432)
    model0 = rng.uniform(0, 1, size=nwave)
    ivar0 = rng.uniform(size=(nspec, nwave))
    flux0 = model0 + 1. / np.sqrt(ivar0) * rng.normal(size=(nspec, nwave))
    resol0 = rng.uniform(0, 1, size=s1.resolution_data['b'].shape)
    s1.flux['b'] = flux0 * 1
    s1.ivar['b'] = ivar0 * 1
    s1.resolution_data['b'] = resol0 * 1
    COSMIC_VAL = 1e9
    xgrid, ygrid = np.meshgrid(np.arange(nwave), np.arange(nspec))
    mask = ygrid == (xgrid % nwave)
    # make the i-th pixel in the i-th spectrum a cosmic
    s1.flux['b'][mask] = COSMIC_VAL
    s1.fibermap['TARGETID'] = [10] * nspec
    coadd(s1, cosmics_nsig=5)
    # prepare nspec-1 spectra with the cosmic-affected pixels excised
    s2 = self._random_spectra(nspec - 1, nwave)
    for i in range(nwave):
        s2.flux['b'][:, i] = flux0[~mask[:, i], i] * 1
        s2.ivar['b'][:, i] = ivar0[~mask[:, i], i] * 1
        s2.resolution_data['b'][:, :, i] = resol0[~mask[:, i], :, i] * 1
    coadd(s2, cosmics_nsig=5)
    print(s1.flux['b'], s2.flux['b'])
    self.assertTrue(np.allclose(s1.flux['b'], s2.flux['b']))
    self.assertTrue(np.allclose(s1.ivar['b'], s2.ivar['b']))
    self.assertTrue(np.allclose(s1.resolution_data['b'],
                                s2.resolution_data['b']))
```
Just a few more comments after looking at the code.
coadd_cameras is used extensively by fastspecfit to write out coadded spectra and models, and also by the GQP working group. @stephjuneau
And @adambolton I think did a full coadd across cameras for iron to load into SPARCL. This may need to be reconsidered.
Hi @segasai: confirming what @moustakas and @weaverba137 mentioned above that coadd_across_cameras is very important for GQP and beyond. Any change/streamlining should at least be backward compatible to avoid breaking workflows. Thanks!
@stephjuneau I wasn't talking about breaking any interfaces here. I was just pointing out that the function seems to suffer from the same issue that is the topic of this PR. But indeed I wasn't aware that this function is widely used outside the main pipeline.
I've committed the fix to coadd_cameras() as well, which now passes tests similar to those for coadd(). From what I can see, other than the resolution-matrix coaddition issue, there was another pre-existing issue with coadd_cameras().
Documenting the impact on redshifts, based on the 59 test healpix in /dvs_ro/cfs/cdirs/desi/users/koposov/coadd_test/output compared to kibo:
Note to self, scratch notebook: desi/users/sjbailey/dev/resmat_coadd/KiboImpact.ipynb
I think I implemented what was discussed today, i.e. I believe the code now returns correct results (according to what was discussed) if a given pixel is masked in every exposure in the stack: in that case the ivar/resolution matrices will still be the original ones, but the mask will be propagated. I also made sure the code paths in coadd() and coadd_cameras() are identical whether or not a mask exists, to avoid possible bugs.
Outstanding issues:
I can now confirm two issues with the cosmic masking. Running this code, which should in theory mask ~2 pixels, leads to 95 masked pixels:
```python
rng = np.random.default_rng(133)
npix, nspec = 1000, 3
wave = np.arange(npix)
ivar = rng.uniform(0, 1,
                   size=(nspec, npix)) * (1 + 100 * np.arange(nspec)[:, None])
model0 = 0.01 * (wave - wave**2 / npix) + rng.uniform(size=npix)
flux = model0 + rng.normal(size=(nspec, npix)) / np.sqrt(ivar)
# two injected cosmics
flux[2, 100] = 1e9
flux[1, 110] = -1e9
mask = np.zeros((nspec, npix), dtype=int)
ivarjj_masked = ivar * 1
cosmics_nsig = 4
desispec.coaddition._mask_cosmics(wave, flux, ivar, mask, np.arange(nspec),
                                  ivarjj_masked, 1, cosmics_nsig)
```
And the reason is that this line is not correct https://github.com/desihub/desispec/blob/a6a5d28190dd3d7ef05b7075d09bb677f4b5d427/py/desispec/coaddition.py#L499
here the weighted mean of the gradient in a given pixel is calculated:
But it should be normalized by the ivar sum in a given pixel, not the ivar sum over all the spectra and pixels. TBH I think this is potentially a more dangerous issue than the resolution one, although I can't quite figure out in which circumstances it is most likely to bite; I think the case of vastly different S/N among the spectra in the stack is the most problematic.
A smaller issue is that the existing code will always mask the cosmic and the pixel to its right. I am not sure that's intentional; if it is, one would probably want to mask the pixels on both the left and the right of the cosmic.
(I will continue posting here to avoid forgetting/for the record)
With the cosmic-ray fix, a noticeably smaller number of pixels is masked. Before (main/bright/10032):

```
INFO:coaddition.py:507:_mask_cosmics: masking 5 values for targetid=39633237068220964
INFO:coaddition.py:507:_mask_cosmics: masking 2 values for targetid=39633241010866687
INFO:coaddition.py:507:_mask_cosmics: masking 78 values for targetid=39633233100410381
INFO:coaddition.py:507:_mask_cosmics: masking 1 values for targetid=39633237068218724
INFO:coaddition.py:507:_mask_cosmics: masking 1 values for targetid=39633237068222675
INFO:coaddition.py:507:_mask_cosmics: masking 4 values for targetid=39633233104601226
INFO:coaddition.py:507:_mask_cosmics: masking 2 values for targetid=39633237072413699
INFO:coaddition.py:507:_mask_cosmics: masking 3 values for targetid=39633229124207324
INFO:coaddition.py:507:_mask_cosmics: masking 42 values for targetid=39633241006672092
INFO:coaddition.py:507:_mask_cosmics: masking 5 values for targetid=39633241010864364
INFO:coaddition.py:507:_mask_cosmics: masking 26 values for targetid=39633241010864979
INFO:coaddition.py:507:_mask_cosmics: masking 6 values for targetid=39633237068220964
INFO:coaddition.py:507:_mask_cosmics: masking 3 values for targetid=39633241010866687
INFO:coaddition.py:507:_mask_cosmics: masking 17 values for targetid=39633233100410381
```
After:

```
INFO:coaddition.py:511:_mask_cosmics: masking 2 values for targetid=39633241010864979
INFO:coaddition.py:511:_mask_cosmics: masking 1 values for targetid=39633233100410381
INFO:coaddition.py:511:_mask_cosmics: masking 2 values for targetid=39633237072413699
INFO:coaddition.py:511:_mask_cosmics: masking 1 values for targetid=39633241006672092
INFO:coaddition.py:511:_mask_cosmics: masking 2 values for targetid=39633241010864979
INFO:coaddition.py:511:_mask_cosmics: masking 2 values for targetid=39633233100410381
INFO:coaddition.py:511:_mask_cosmics: masking 1 values for targetid=39633241006672092
```
which brings up another question about the existing masking criterion. Currently it is

$$\chi^2/(nspec-1) > nsig^2$$

(where nsig is cosmics_nsig=4). I am not quite sure that makes sense, given that if anything $\chi^2/(nspec-1)$ asymptotically follows $N(1, \sqrt{2/(nspec-1)})$. Hence the masking is not aggressive enough, becomes even less aggressive for large nspec, and does not actually correspond to 4 sigma.
IMO the right behaviour is something like

```python
chi2 = np.sum(gradivar * deltagrad**2, axis=0)
cosmic_bad = chi2 > scipy.stats.chi2(nspec - 1).ppf(scipy.stats.norm.cdf(cosmics_nsig))
```

Doing that will lower the threshold for masking from what it is now (possibly giving a number of masked pixels more similar to before). It would be great to hear opinions on whether people think this is the right way to go, given it will definitely affect pixels and variances.
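To see the size of the change, here is a quick comparison sketch (assuming the per-pixel chi2 defined above) of the current effective threshold, nsig^2 * (nspec-1), against the proposed chi-squared quantile:

```python
import scipy.stats

cosmics_nsig = 4
for nspec in (2, 5, 30):
    df = nspec - 1
    # current criterion chi2/(nspec-1) > nsig^2, expressed as a chi2 threshold
    current = cosmics_nsig**2 * df
    # proposed: the chi2 quantile matching a one-sided nsig Gaussian tail
    proposed = scipy.stats.chi2(df).ppf(scipy.stats.norm.cdf(cosmics_nsig))
    print(nspec, current, proposed)
```

For large nspec the proposed quantile grows roughly like df + O(sqrt(df)), so it falls far below the current nsig^2 * df threshold, which is the sense in which the current criterion is under-aggressive for large stacks.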
(Update:) Also the current code will never mask more than one cosmic per pixel https://github.com/desihub/desispec/blob/4eae355fbae93e5a138c6ca91074d961c2558286/py/desispec/coaddition.py#L559 , which I believe is not correct.
My latest commits include two tests for the issues @segasai identified that we need to fix:
I've committed a masker that can deal with more than one cosmic per pixel. It essentially does iterative sigma clipping. That passes the tests, but my only worry is variable objects: with the new code we continue clipping until the chi^2 becomes low enough or we have fewer than 3 pixels in the stack. One could potentially stop earlier and do no more than 2 iterations of clipping. Other than that, I think we have dealt with every outstanding issue here.
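The iterative-clipping idea can be sketched for a single wavelength column as follows (a hypothetical standalone helper under the chi-squared-quantile criterion discussed above, not the desispec implementation):

```python
import numpy as np
import scipy.stats

def clip_column(vals, ivar, cosmics_nsig=4.0, min_left=3):
    """Iteratively clip the most discrepant value in one wavelength column
    until the ivar-weighted chi^2 is acceptable or too few values remain."""
    good = ivar > 0
    while good.sum() >= min_left:
        mean = np.sum(ivar[good] * vals[good]) / np.sum(ivar[good])
        chi2 = np.sum(ivar[good] * (vals[good] - mean) ** 2)
        thresh = scipy.stats.chi2(good.sum() - 1).ppf(
            scipy.stats.norm.cdf(cosmics_nsig))
        if chi2 <= thresh:
            break
        # clip the single largest chi^2 contributor and iterate,
        # so multiple cosmics in the same pixel are caught one by one
        contrib = ivar * (vals - mean) ** 2 * good
        good[np.argmax(contrib)] = False
    return good
```

With two cosmics of opposite sign in the same column, a single-pass masker (like the old code) removes at most one of them, while the iterative version removes both.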
UPDATE on variable objects: looking at sv1/dark/5059 RRL with targetid=39628511287183022 with 30 observations:

```
INFO:coaddition.py:571:_mask_cosmics: masking 47 wavelengths in 30 spectra for targetid=39628511287183022
INFO:coaddition.py:574:_mask_cosmics: masking 6 wavelengths with more than 1 mask per pixel for targetid=39628511287183022
INFO:coaddition.py:571:_mask_cosmics: masking 38 wavelengths in 30 spectra for targetid=39628511287183022
INFO:coaddition.py:574:_mask_cosmics: masking 8 wavelengths with more than 1 mask per pixel for targetid=39628511287183022
INFO:coaddition.py:574:_mask_cosmics: masking 14 wavelengths with more than 1 mask per pixel for targetid=39628511287183022
```

It doesn't seem too bad, so maybe things can be kept as they are now.
The impact of the cosmic-ray fixes is particularly strong on spectra with sharp gradients. Here's a comparison of old and new stacks; the old stacks tended to mark some of the edges as cosmics:
I assume the same thing is present with narrow emission lines
There was one more bug in the original implementation (and the new one): if there was ever a fully blank spectrum (ivar=0) in the stack, then the cosmic masking of any spectrum after it was incorrect, because of this skipping: https://github.com/desihub/desispec/blob/4eae355fbae93e5a138c6ca91074d961c2558286/py/desispec/coaddition.py#L523 When spectra are collected for masking, the ivar=0 ones are skipped, but when the masking is applied, that 'off-by-N' https://github.com/desihub/desispec/blob/4eae355fbae93e5a138c6ca91074d961c2558286/py/desispec/coaddition.py#L561 is ignored, leading to masking of the wrong pixels. Encountered when running on sv1/dark/5059. The last commit fixes that and adds a test.
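A tiny illustration of the off-by-N (made-up arrays, not the desispec code): once ivar=0 spectra are dropped, a row index into the compacted stack must be mapped back through the kept-index list before masking the original array.

```python
import numpy as np

nspec = 4
ivar = np.ones((nspec, 6))
ivar[1] = 0.0                                # a fully blank (ivar=0) spectrum

# original indices of the spectra that survive the ivar>0 cut
keep = np.nonzero(ivar.sum(axis=1) > 0)[0]   # array([0, 2, 3])

# say the clipper flags row j=2 of the *compacted* stack;
# masking spectrum 2 of the full stack would be wrong (off by one for
# every spectrum after the skipped one) -- map back through `keep` instead
j = 2
orig = keep[j]                               # correct original index: 3
```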
Pixel differences for a Lya-forest object, showing the impact of the cosmic masking there. In basically all the pixels with a non-zero difference between new/old, the masking was happening in the previous iteration of the code and does not happen now.
This is the actual PR fixing #2372