desihub / desispec

DESI spectral pipeline
BSD 3-Clause "New" or "Revised" License

Add row-by-row extraction and CTE correction modules #2144

Closed by schlafly 8 months ago

schlafly commented 9 months ago

This PR adds two components to the pipeline.

These are connected because the CTE correction routine uses the row-by-row extraction to model images. Spectroperfectionism would be better but is slower; the modeling occurs in a loop ~5-10 times for each CTE-affected frame, so speed was desirable. I also experimented with the simpler aperture qproc.extract routine, which was plenty fast but did a poor job of preventing light from neighboring spectra from contaminating the target spectra. This was a particular problem for bright tiles, where we want good sky spectra (faint) but the target spectra are rather bright. We settled on row-by-row extraction as a compromise.

The main CTE correction algorithm looks like this:

The real challenges here are getting a good model of what CTE does to an image, and correcting for it in a way that doesn't correlate the read noise. With Nicolas Regnault's help we have a pretty decent model (though we could study this more...), and the row-by-row extraction modeling handles the second part well enough.
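The model-then-correct loop described above can be sketched with a toy 1D forward model. Everything here is illustrative: apply_cte_toy stands in for the real charge-trap model and the fixed-point update stands in for the row-by-row-extraction-based image model; none of it is desispec API.

```python
import numpy as np

def apply_cte_toy(img, frac=0.2):
    # toy CTE: each pixel leaks a fixed fraction of its charge into the
    # next pixel along the readout direction (charge is conserved)
    out = img.astype(float).copy()
    out[1:] += frac * img[:-1]
    out[:-1] -= frac * img[:-1]
    return out

def correct_iteratively(observed, niter=5, frac=0.2):
    # fixed-point iteration: forward-model the CTE trailing of the current
    # guess and use the residual to update the guess
    guess = observed.astype(float).copy()
    for _ in range(niter):
        guess = observed - (apply_cte_toy(guess, frac) - guess)
    return guess

truth = np.zeros(10)
truth[4] = 100.0
observed = apply_cte_toy(truth)          # charge trailed into pixel 5
corrected = correct_iteratively(observed, niter=20)
```

Because the toy forward model is a small perturbation of the identity, the iteration converges geometrically; that is why a handful of iterations per frame suffices.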

Here are some QA comparisons. The following PDFs have interleaved daily and this-PR-on-z1-only reductions of a lot of exposures from 20230825 - 20230906; the first page in each pair is daily, the second is the new reduction:

https://data.desi.lbl.gov/desi/users/schlafly/misc/schlafly-cte-5-tileqa.pdf
https://data.desi.lbl.gov/desi/users/schlafly/misc/schlafly-cte-5-skyzfiber.pdf

To drill down more, look at the nightqa in https://data.desi.lbl.gov/desi/spectro/redux/schlafly-cte-5/nightqa/

But here's at least one comparison plot to show that when things are bad, the new code makes a big improvement (only z1 has the new code active):

Daily: [image]
New: [image]

And finally, one plot to show that we're not increasing the noise:

[image]

This just gives the median absolute deviation, converted to a standard deviation, of the sky fibers in all of the exposures I've run, as a function of exposure index, for z1, only in the CTE-affected region. The new code is slightly better than the old code. I think this understates the improvement---most of the issues in daily are on sky lines or individual fibers, which the clipping does a good job of removing---but it at least says that things aren't worse.
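For reference, the MAD-to-standard-deviation conversion used in a plot like this is the standard robust estimator; this is a generic sketch, not the pipeline's actual QA code.

```python
import numpy as np

def mad_sigma(x):
    # median absolute deviation, scaled so that it estimates the Gaussian
    # standard deviation (sigma ~ 1.4826 * MAD for normally distributed
    # data); robust against sky-line and bad-fiber outliers
    med = np.median(x)
    return 1.4826 * np.median(np.abs(x - med))

rng = np.random.default_rng(42)
sky = rng.normal(0.0, 2.0, size=100000)
sky[:100] = 50.0   # a few gross outliers barely perturb the estimate
```

On this example mad_sigma stays close to the true 2.0 while a plain standard deviation is pulled well above it by the outliers, which is why the robust version is the natural choice for sky-fiber noise comparisons.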

The optimal row-by-row extraction routine is intended to be the usual thing:

Maybe a few things worth discussing in the extraction:

The CTE correction module does a few different CTE-related things.

Applying CTE to images:

Correcting CTE:

Fitting CTE parameters:

Metadata:

schlafly commented 9 months ago

Maybe one more bit of code that could be useful but isn't in the PR proper... this is the hack I used to call my CTE correction code on existing preproc files.

I don't think we really want this---we should build it into the preproc code---but this is the easiest way to run the current code on an existing preproc image.

# (imports added for completeness)
import desispec.io
import desispec.io.util
from desispec.io import read_fibermap
from desispec import correct_cte

def process(x):
    # read an existing daily preproc image; skip non-science exposures
    im = desispec.io.read_image('/global/homes/s/schlafly/desiproject/spectro/redux/daily/preproc/'+x)
    if im.meta['OBSTYPE'] != 'SCIENCE':
        return
    # the fibermap travels in the same preproc file
    fmap = read_fibermap('/global/homes/s/schlafly/desiproject/spectro/redux/daily/preproc/'+x)
    desispec.io.util.addkeys(fmap.meta, im.meta, skipkeys=('BZERO', 'BSCALE'))
    im.fibermap = fmap
    # iteratively correct CTE using the forward model, then write out
    newim = correct_cte.correct_image_via_model(im, niter=5)
    desispec.io.write_image(x, newim)
schlafly commented 9 months ago

Here are a couple of example images showing how the trap parameters have evolved with time, on z1 and z3.

Note that when the trap amplitudes are small (e.g., less than ~50), the existing corrections are adequate and we probably don't need to turn on this special mode. Moreover, the uncertainty in the leakiness isn't a big deal as the amplitude gets lower, since the trap is just not trapping as many electrons and the correction becomes quite small.
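The amplitude-based decision described here could be sketched as follows; the ~50 electron threshold is taken from the discussion and is illustrative, not a tuned pipeline constant, and the function name is hypothetical.

```python
def needs_special_cte_mode(trap_amplitude, threshold=50.0):
    # heuristic from the discussion: below roughly 50 electrons of trap
    # amplitude the existing corrections are adequate, so the special
    # correction mode can stay off (threshold is illustrative)
    return trap_amplitude >= threshold
```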

[image: z1 trap parameter evolution]
[image: z3 trap parameter evolution]

Stephen expressed some concern about the single point on z1 with a highly outlying amplitude of ~40. I haven't looked at that one case, but mostly I'm amazed at how well this worked with the simple scheme of just grabbing one 1 s flat and the previous image for each night. As a reminder, a lot can go wrong with the cals. It may make sense to add the fit parameters to nightly QA, or to compare them with the previous night's values, to help us identify when something has gone wrong with the nightly cals. I don't know what the usual mechanisms are there.

There is also room to be concerned about the higher chi^2 / dof on pre-wildfire nights on z1. The suggestion is that the CTE looked different then and the existing model can't fit it well. I don't think we should apply the correction to that data, and I can certainly imagine that we'll need to develop new prescriptions in the future to address such cases. But the existing large CTE effects on z1, z3, and r6 that are poorly corrected are all well modeled by this parameterization.

schlafly commented 9 months ago

Stephen asked me to diagnose the one night where z1 shows dramatically lower CTE than usual, night index ~550 or so. It turns out that that night is 20230822, the night we experimented with different voltage settings on z1. We ended up not keeping the new voltages because they increased the read noise, but they did dramatically improve the CTE. That is, this point is not an outlier after all, but a reflection of a real change in the CTE in that exposure.

As part of diagnosing this I computed appropriate CTE model parameters for all of the traps I know about through today and stored them here: https://data.desi.lbl.gov/desi/users/schlafly/cteparams/cteparamsall.fits These are pretty fast to compute; doing all nights from 2022/1 through today took O(15 min) on one interactive node. There's a corresponding PDF here: https://data.desi.lbl.gov/desi/users/schlafly/cteparams/cteparamtrends.pdf

It looks like at least one of the outliers on z5 is due to a particularly vertical CR. I should be able to do better with the masking there.

julienguy commented 8 months ago

Test report: I ran the CTE parameter fit on the night 20231111 for z1 and z3.

export SPECPROD=daily

then in python

from desispec.correct_cte import fit_cte_night
fit_cte_night(20231111,"z1")
# {'C': {'543:2057': (array([139.49198384,   0.20785951]), 0.5409279496512583)}}
fit_cte_night(20231111,"z3")
# {'C': {'1668:2057': (array([91.33659997,  0.22315425]), 1.0845442585069123)}}
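The returned structure above appears to map amplifier -> "start:stop" column range -> (parameter array, chi^2/dof), with the parameter array holding the trap amplitude and leakiness. The unpacking below is a sketch inferred from the printed output, not from documented desispec API.

```python
import numpy as np

# structure inferred from the printed fit_cte_night output above
result = {'C': {'543:2057': (np.array([139.49198384, 0.20785951]),
                             0.5409279496512583)}}

for amp_name, traps in result.items():
    for colrange, (params, chi2dof) in traps.items():
        amplitude, leakiness = params
        start, stop = (int(v) for v in colrange.split(':'))
        print(f"amp {amp_name}, cols {start}-{stop}: "
              f"amplitude={amplitude:.1f}, leakiness={leakiness:.3f}, "
              f"chi2/dof={chi2dof:.2f}")
```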

I edited the library code to add those parameters in order to run it, then wrote a script with the following lines and ran it on exposure NIGHT=20231111 EXPID=00204371, a 1 s LED exposure.

    # (imports added for completeness; args comes from the script's argparse setup)
    from desispec.io import read_image, read_fibermap, write_image
    from desispec.io.util import addkeys
    from desispec.correct_cte import correct_image_via_model

    im = read_image(args.infile)
    fmap = read_fibermap(args.infile)
    addkeys(fmap.meta, im.meta, skipkeys=('BZERO', 'BSCALE'))
    im.fibermap = fmap
    newim = correct_image_via_model(im, niter=args.niter)
    write_image(args.outfile, newim)

z1 before: [preproc-z1-00204371-before]
z1 after: [preproc-z1-00204371-after]
z3 before: [preproc-z3-00204371-before]
z3 after: [preproc-z3-00204371-after]

The correction is almost perfect for z1 and pretty good for z3. We may need to add more calibration data for the modeling.

schlafly commented 8 months ago

I have added some simple unit tests; take a look. They were a bit harder to write than I wanted: it doesn't look like it's easy to make a simple specter PSF object, so I used the test PSF object that specter itself uses. That object is rather more detailed than I wanted for my unit tests---I would have been happy with straight traces, sum(profile) = 1, and no negative pixels in the profile, for example. Those details make sanity checks less easy.

Of particular note is that the 1D profile in specter contains "tails", which are usually zero in real DESI data. I spent a while not realizing that my model and extraction were inconsistent because I had used tails=True (the default) in onedprofile_gh and tails=False (the default) in extract, and so I was failing these tests and trying to figure out how I had screwed up so badly. This doesn't matter in real data since the tails are zero there, but it does in the test data. I don't think there are any real bugs here. It would be nice if the defaults for extract(...) and onedprofile_gh(...) were the same---that would certainly minimize surprise---but on the other hand it feels like the default for onedprofile_gh should be to do everything and for extract(...) to do the efficient thing, so I haven't changed the behavior here.

That said, here's what the checks do:

Does that sound adequate?

julienguy commented 8 months ago

The test looks great. I would suggest encapsulating the functions in a class inheriting from unittest.TestCase, and adding the convenience block

if __name__ == '__main__':
    unittest.main()      

as is done, for instance, in desispec/test/test_coadd.py
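Concretely, the suggested structure would look something like the sketch below; the class and test names are illustrative placeholders, not the real desispec tests, and the loader calls at the bottom show what unittest.main() does under the hood.

```python
import unittest

class TestRowByRowExtract(unittest.TestCase):
    # class and test names are illustrative placeholders
    def test_flux_conservation(self):
        # a real test would extract a known image and compare total flux
        self.assertAlmostEqual(sum([0.5, 0.5]), 1.0)

# running via the loader (equivalent to unittest.main() in a __main__ block)
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestRowByRowExtract)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```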

Please also remove the hardcoded set of CTE params from the code and add them as an argument to the main function correct_image_via_model. I will then merge the PR and start from there to integrate this code into the pipeline processing.

weaverba137 commented 8 months ago

@julienguy, @schlafly, I'm actually going to suggest the opposite: do not add the extra if __name__... boilerplate to the test. That is a leftover from the days of python setup.py test. The current best practice is to run:

pytest py/desispec/test/test_rowbyrowextract.py

You can still optionally wrap the tests in unittest.TestCase though for consistency with other tests in the package.

schlafly commented 8 months ago

In my own testing I ran just this module with Ben's invocation, and the unit tests are getting run as I've written it: https://github.com/desihub/desispec/actions/runs/7463894705/job/20309610108?pr=2144#step:5:50 That is, I can wrap it for consistency, but it's not clear to me that that buys anything.

I have a version locally that I haven't pushed yet that allows the arguments to be passed by hand but falls back to the dictionary, which we can delete later. If I totally delete it I won't be able to run nights anymore.

schlafly commented 8 months ago

I went ahead and removed the hard-coded dictionary; I had forgotten that my current mechanism for running nights involved hacking the preproc files, so my objection wasn't relevant.

julienguy commented 8 months ago

OK, thanks Ben. pytest py/desispec/test/test_rowbyrowextract.py is working fine. I see the hardcoded params are removed. I am going to merge after sorting out the documentation error.

sbailey commented 8 months ago

I think the failing documentation test is because the new modules haven't been added to doc/api.rst. I think the procedure is to run the following from the top-level desispec directory:

desi_api_file --overwrite

and then commit the updated doc/api.rst file.