spacetelescope / jwst

Python library for science observations from the James Webb Space Telescope
https://jwst-pipeline.readthedocs.io/en/latest/

Implementation of 1/f correction #8517

Closed stscijgbot-jp closed 3 hours ago

stscijgbot-jp commented 3 months ago

Issue JP-3639 was created on JIRA by David Law:

Various other tickets exist describing the need for 1/f correction, e.g., https://jira.stsci.edu/browse/JP-2576 discussing the rationale for doing so.  Likewise, https://outerspace.stsci.edu/pages/viewpage.action?spaceKey=JWSTCC&title=JWST+Pipeline+1-over-f+Noise+Removal contains notes from various meetings discussing potential algorithms.

This ticket focuses on establishing the detailed method of implementation in the pipeline, and is kept separate to more clearly distinguish discussion of such implementation details from discussion of the core algorithms wrapped by the step.

The new '1/f correction' step in the pipeline should be immediately prior to the current ramp_fitting step in calwebb_detector1.  The reason for this is that it is best to correct 1/f noise at the group stage in order to reliably correct the 1/f signal both in blank regions and underneath sources.

This 1/f correction would be performed on the source-free science pixels in the detector using some TBD algorithm.  Regardless of the details of this algorithm, it will be necessary to mask out sources in the scene, measure the 1/f signal, and apply the correction to the group-stage data prior to ramp fitting.  This step should be capable of using such a mask if one is passed in to the step, and should otherwise attempt to compute such a mask on the fly.  In order to do so, the 1/f step would need to call ramp_fitting to produce a 'preliminary' rate image with which to do the masking.  As such, ramp fitting could effectively be done twice: once within the 1/f step to generate the on-the-fly mask, and then again in the regular ramp_fitting stage using the 1/f-corrected ramp.  If doing so, the ramp fitting call within the 1/f step should probably use the C-based implementation (or a comparably fast one) by default.
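A minimal sketch of that flow, using hypothetical helper names (fast_ramp_fit, make_scene_mask, and subtract_one_over_f are placeholders, not pipeline functions):

```python
# Illustrative only: build an on-the-fly mask from a preliminary ramp fit when
# no mask is supplied, correct the groups, and leave the final ramp fit to the
# regular ramp_fitting step.
def one_over_f_correction(ramp_model, scene_mask=None):
    if scene_mask is None:
        preliminary_rate = fast_ramp_fit(ramp_model)    # e.g. the C-based fitter
        scene_mask = make_scene_mask(preliminary_rate)  # hypothetical helper
    return subtract_one_over_f(ramp_model, scene_mask)  # TBD algorithm
```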

Note that I've identified all four instruments in this ticket.  For NIRCam and NIRISS, this is because the work is most relevant to them.  NIRSpec already has a solution in NSClean, but the proposal here is to move that step into the infrastructure proposed here to have a single location for 1/f correction in the pipeline.  MIRI does not experience 1/f noise, but nonetheless has vertical striping with some similarities to 1/f noise (resulting from variations in the detector dark) that may be possible to correct in a similar manner and place in the pipeline.

Additional details:

stscijgbot-jp commented 2 months ago

Comment by David Law on JIRA:

I've attached a few things to this ticket:

stscijgbot-jp commented 2 months ago

Comment by Tyler Pauly on JIRA:

The mask in the powerpoint appears to be excluding some 1/f striping, though I checked out your notebook and found the crowded field mask looks fairly clean/robust. Should we expect a fine-tuning of the cutoff sigma by exposure/exposure type/instrument, or should a more robust cutoff/selection algorithm be considered? Or is this behavior a best reasonable effort for now?

stscijgbot-jp commented 2 months ago

Comment by David Law on JIRA:

I'd just expose the various sigma cuts as parameters for now (one in the initial mask creation, and two in the Willott 1/f routine itself) so that we can try tweaking them.  I think we should move ahead with this algorithm as 'good enough' to start with; any more robust algorithms can be considered in the future.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

I have a first working draft of the new step here: #8669

The new step is called clean_noise, but I can change that if some other name is preferred.

For the framework and integration, I moved the nsclean calling structure and library over to the clean_noise module and aliased the existing nsclean step (called in spec2) to the clean_noise implementation instead.  As implemented, the step can be run on either ramp or rate data.  The nsclean and clean_noise steps currently have different defaults: I have left the nsclean defaults equivalent to the current values, and I set the clean_noise defaults to more generally useful/safe values across all instruments.

The basic process is now as follows (a condensed sketch follows the list):

- If the input is ramp data, make a draft rate (single_mask=True) or rateints (single_mask=False) file.
- Create a scene mask from the rate data:
  - If mask_spectral_regions is set and the input is NIRSpec data, run assign_wcs and msaflagopen on the rate data if needed, then mask any known science areas or failed-open MSA shutters.
  - Iteratively sigma clip the data to get a center value (mean or median) and sigma.
  - If fit_histogram is set, compute a histogram from 4-sigma clipped values and fit a Gaussian to it to refine the center and sigma values.
  - Mask data less than 3 * sigma from the center as bad values.
  - Mask data more than n_sigma * sigma from the center as signal (not background).
- Iterate over each integration and group in the data:
  - For ramp data, make a diff image (current group minus previous group) to correct.  For rate data, the image is corrected directly.
  - Fit and remove a background level, using the scene mask to identify background pixels:
    - If background_method = None, no background is removed (default for nsclean).
    - If background_method = 'median', the background data are further clipped, and the background value is a simple median of the remaining values.
    - If background_method = 'model', the background data are also further clipped, and the background value is a 2D image fit to the remaining values, smoothed with a 5x5 median filter kernel.
    - For either the 'median' or 'model' method, the background-subtracted data are clipped again to n_sigma * sigma, with sigma recomputed from the remaining background pixels.
  - Fit and remove the 1/f noise in the background-subtracted image:
    - If fit_method = 'fft' and the input is NIRSpec, the nsclean library is called to fit and remove the noise in frequency space.
    - If fit_method = 'median', the noise is fit with a simple median along the detector slow axis.
  - Restore the background level to the cleaned, background-subtracted image.
  - For ramp data, add the cleaned diff back to a cleaned version of the previous group.
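A condensed, illustrative sketch of this process (not the pipeline code), showing only the simple 'median' background and noise options; data is assumed to be group data with shape (nints, ngroups, ny, nx), rate is the draft rate image, and n_sigma = 5.0 is an illustrative value:

```python
import numpy as np
from astropy.stats import sigma_clipped_stats

# Scene mask from the rate image: clipped center/sigma, then flag low outliers
# as bad and high outliers as signal.  True = usable background pixel.
_, center, sigma = sigma_clipped_stats(rate, sigma=3.0)
n_sigma = 5.0
background_mask = np.isfinite(rate)
background_mask &= rate > center - 3.0 * sigma
background_mask &= rate < center + n_sigma * sigma

# Clean each group difference, then rebuild the cleaned ramp.
cleaned = data.astype(float)
for i in range(data.shape[0]):                 # integrations
    for j in range(1, data.shape[1]):          # groups; first group left as-is here
        diff = data[i, j].astype(float) - data[i, j - 1]
        background = np.median(diff[background_mask])       # 'median' background
        sub = diff - background
        sub_masked = np.where(background_mask, sub, np.nan)
        # 'median' noise fit: one value per row here; the axis to use depends
        # on the detector's slow/fast axis orientation.
        noise = np.nanmedian(sub_masked, axis=1, keepdims=True)
        noise = np.where(np.isfinite(noise), noise, 0.0)
        cleaned[i, j] = cleaned[i, j - 1] + (sub - noise) + background
```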

I've tested this implementation on the examples in David's notebook, with similar results, and on a few other examples from the 1/f outerspace page.  I set the defaults for all parameters to values that look reasonable to me across most of these examples, although large diffuse fields will likely need background_method = model. 

I'll start work now on developing some unit and regression tests for the new step, but in the meantime it would be helpful to have feedback on the basic process, the parameters I've chosen to expose, their default values, and whether there are any more options you'd like me to implement.

I'd also like to know if we want to add a clean_noise step to spec2 and/or image2 for non-NIRSpec instruments.  In testing, cleaning the rate image directly looks to my eye like it is often as effective as cleaning the ramp data, and it might be nice for users to have an option that doesn't require re-running detector1.

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Great, thanks Melanie Clarke !  I'll take a look.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Thanks David Law!  Do you have any MIRI data that shows its version of vertical striping?  I have not yet checked to see if the new step works for MIRI.

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Melanie Clarke Interesting question; if we can address MIRI as well that'll be a win, but the MIRI situation is quite different.  In this case the vertical stripes aren't from 1/f noise, but from temporal variation in the dark current.

I just tested the step quickly on MIRI imaging data jw01536016001_02101_00001_mirimage and jw01536016001_02103_00001_mirimage, and there are a few things we'd need to do differently.  First, the routine would need to be applied to the other axis; i.e., slowaxis=2, but the routine should be passed slowaxis=1 to get it right (an easy hack).  More problematically, the MIRI imager has a metering structure covering a large part of the detector area, along with coronagraphs that let a different amount of light through, and a knife-edge structure that sticks out into the main imaging area.  We probably wouldn't want to apply any destriping in the coronagraph region at all, and in the main imaging region we'd need to mask out the knife edge so that it doesn't bias the statistics.  At the moment that isn't masked in any obvious way; potentially the DQ mask could be changed, although there's no obvious bit name that's applicable.  It may also be possible to just hard-code applying the algorithm to columns 360 and above and hard-code three rectangular regions to ignore.  Clunky, but possibly effective.
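For illustration, a hard-coded mask along those lines might look like the sketch below; the column cut matches the value quoted above, but the three rectangles are placeholders, not measured regions:

```python
import numpy as np

# MIRI imager detector is 1024 rows x 1032 columns.
ny, nx = 1024, 1032
mask = np.zeros((ny, nx), dtype=bool)   # True = usable for destriping
mask[:, 360:] = True                    # restrict to the main imaging area

# Placeholder rectangles (ymin, ymax, xmin, xmax) to exclude, e.g. around the
# knife-edge structure; the real regions would need to be measured.
for ymin, ymax, xmin, xmax in [(0, 280, 360, 420), (700, 1024, 360, 460), (400, 460, 360, 390)]:
    mask[ymin:ymax, xmin:xmax] = False
```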

For the MIRI MRS detectors though I don't think a routine like this will work- the striping runs along the same axis as the dispersion, meaning all of the IFU slices will look like high-valued columns and will have weird effects around the curvature of the trace.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Interesting, thanks.  I'll take a look and see if there are easy ways we could include some of the MIRI modes. If not, I'll make sure they are explicitly excluded.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

Thanks, Melanie Clarke.  The basic approach looks generally good as you describe it; I will take a more detailed look at the code.  A few comments for now:

For the background, I would favor a significantly larger filter.  Median filters are expensive, so a convolution probably isn't a good idea.  Here is a nearly equivalent approach for, e.g., a 32x32 filter: apply a nanmedian or equivalent in adjacent but disjoint 32x32 blocks, and then interpolate within the results: pixel (32, 32) will be at the corner of four of these, so it will get equal weights of four medians, while pixels closer to (0, 0) than (16, 16) will have the value of the lower-left box median.  This should be computationally cheaper than the current implementation (since the cost of the median is of order n), and should avoid having the background model incorporate too much noise or 1/f signal.  The size of this region could be a tunable parameter.
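A rough sketch of that block-median-plus-interpolation background, assuming masked (source) pixels are set to NaN and the box size divides the array evenly (the function name is illustrative):

```python
import numpy as np

def block_median_background(image, box=32):
    """Median in disjoint box x box blocks, bilinearly interpolated back up."""
    ny, nx = image.shape
    coarse = np.nanmedian(image.reshape(ny // box, box, nx // box, box), axis=(1, 3))

    # Block centers are the sample points; np.interp clamps outside the outer
    # centers, so corner pixels keep the value of the nearest block median.
    yc = (np.arange(ny // box) + 0.5) * box
    xc = (np.arange(nx // box) + 0.5) * box

    interp_rows = np.empty((ny, nx // box))
    for j in range(nx // box):
        interp_rows[:, j] = np.interp(np.arange(ny), yc, coarse[:, j])
    background = np.empty((ny, nx))
    for i in range(ny):
        background[i] = np.interp(np.arange(nx), xc, interp_rows[i])
    return background
```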

I think it's worth having both NSClean and the median method available for all arrays.  Using NSClean should be very straightforward with the new subarray version: it simply takes pixel values and a mask and spits out corrected values.

For fitting at the group vs. ramp stage: doing this at the ramp stage will generally look good.  However, the weights of each group as they contribute to the final ramp will depend on the flux in a given pixel.  A fit to 1/f that is good for low-flux pixels is, almost by definition, not so good for intermediate-flux pixels (jumps also mess things up).  This introduces noise to the image that will be invisible if looking at the background pixels, which is an insidious kind of noise.  That is the reason to only support this at the group stage.

A question: is the model background computed on the rate images or on the group data?  I think it should be much better to do this on the provisional rate images, where you can get good S/N estimates of the background.  You may well be doing it on the rate images here, but I wasn't sure.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Thanks for reviewing, Timothy Brandt

For 1 - I'm not sure I understand your suggestion.  Can you mock up some pseudocode or point me to an implementation you like?

For 2 – you mean offer NSClean-style FFT noise fitting for the other instruments? I can look into what would be needed for that.

For 4 – the scene mask is computed on the rate image.  The model background is fit to the group diffs, to remove any overall background level in the diff before fitting the residual noise.  It's not clear to me how a background computed from the rate image could be applied to correct a group diff?

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Allowing NSClean for NIRCam is indeed pretty easy with your changes!  I just made a small modification not to flip the array for detector orientation and sent the diff image to the subarray implementation.

I've attached a quick image of what that looks like with example file jw01345001001_10201_00001_nrca3, which is the best case scenario of a sparse field and plenty of background.  With default frequency tuning parameters, NSClean leaves the same kind of artifacts around bright sources that we see when applying NSClean to NIRSpec MOS data: high frequency noise over the heavily masked region, where there isn't enough background data to get a good fit.  It also adds dark halos around the bright sources, presumably because the source flux is also not masked enough.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

Melanie Clarke I am attaching a notebook (see the updated version) with a demonstration of my suggestion for fitting the background.  I think we should fit the background from the rate image and then just scale to the group difference by the time difference between the groups, unless the background really does fluctuate a lot on the group-to-group time scale.  I also put in a suggestion on expanding the mask, which I think would help a lot with the under-flagging of the scene in the vicinity of bright objects.  Let me know if you would like to discuss any of this.
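For the mask expansion, a minimal sketch (the iteration count is a placeholder, and source_mask is assumed to be True where pixels are flagged as signal):

```python
from scipy.ndimage import binary_dilation

# Grow the source mask by a few pixels so the faint wings of bright objects
# are also excluded from the background and 1/f fits.
expanded_source_mask = binary_dilation(source_mask, iterations=3)
background_mask = ~expanded_source_mask
```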

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

One more thing while I think of it.  I couldn't tell at a glance from the code, but when you feed NIRCam data into NSClean, are you using it in full array mode or subarray mode?  I am looking at the routine now and need to ask Bernie how it works with the four reference channels.  It seems to me like the code assumes a single reference output, which can't be right.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

Update from Bernie: the current version of NSClean for full frame arrays is a bit of a hastily-written compromise and can probably be improved; the handling of the Fourier vectors isn't really that well motivated as implemented.  I'll play around with that over the coming days.  The implementation for subarrays is more theoretically motivated, so I will look primarily at the full-frame version.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Excellent, thanks for the notebook.  I'll take a look.

In general, NSClean has an even harder time with over-masked regions than under-masked ones, so expanding the mask for NIRCam images would likely help with the dark halo issue, but would make the high-frequency scratches over the masked region worse.  I did check what it looked like if I reduced the n_sigma, to flag more signal, and saw exactly that: halos are better, scratches are worse.

For the NSCleaned NIRCam data, I fed it to the subarray implementation, temporarily borrowed from the version on your updated branch.  

It would be nice to unify the full frame and subarray handling if we can!

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

Ok, thanks.  NSClean in subarray mode should really operate channel-by-channel, with alternating channels fed in in reverse order.  Then there is the issue with Bernie for full frame images.  I will keep myself busy trying to iron these things out.  

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Looking good to me- out of the box it seems to be working pretty well on many different input data sets, and those for which it has trouble can be improved using the optional parameters.  Individual instrument teams will likely end up with preferred settings, which can be provided via param ref files.  A few thoughts:

I'd be a little leery of estimating the background for a given group from the rate file, given the number of different read patterns with group averages, dropped frames, etc.  Estimating a background from the group differences will be noisier, but avoids the risk of systematic issues that might only show up for certain instruments/modes/read patterns.  I know exact timing of all of these patterns has been a continuing source of issues and discussion.  Thoughts, Timothy Brandt?

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Thanks for testing, David. A couple quick responses...

For the log issues - I pushed an update yesterday that should address that.  I hid the parameter logging and added messages about calling the other steps.  Let me know if you want further revisions there.

For the name, is the vertical MIRI striping also appropriately called 1/f noise?  I haven't yet had a chance to look into supporting it, but if we want to add that here, we should make sure the name covers it.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

David Law I'm a bit confused: if the background is not steady, then the basic assumption inherent in constructing a rate file is broken, and using the optimal up-the-ramp weights is a bad idea.  If the background is steady, then we can just scale the background from the rate file by the difference between the mean group times; this is essentially what the ramp fitting step does.  

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Melanie Clarke I've been going back and forth on your second point in my own mind as well.  The MIRI striping looks like 1/f noise, but is a physically different phenomenon.  After bouncing it off Michael Regan, another possibility would be to call it 'clean_flicker'.  Generally flicker noise and 1/f noise are the same thing, but that way we avoid the specific term and the history that goes with it.

If we do correct the MIRI data, we'd want to apply it only when EXP_TYPE='MIR_IMAGE'.  In subarray imaging mode it may work fairly easily, but for SUBARRAY=FULL we'd have to introduce some kind of logic to deal with the metering structure.

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Timothy Brandt I'm also assuming that the background is steady; my concern is just that we could end up with a multiplicative offset in how we're scaling it, because of differences in how the various readout patterns are defined.  I know we've had some grief defining all of the related ways of measuring time (see, e.g., https://jwst-docs.stsci.edu/accessing-jwst-data/jwst-science-data-overview/jwst-time-definitions), so I'm a little wary of accidentally introducing some systematic that would only become obvious in some cases.  So long as we make sure to test that it's working as expected for a variety of different patterns, I think we'd be ok.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

I like clean_flicker_noise, for what it's worth.

I can implement both background correction schemes and make it a user option, so we can test the impact on the data.

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Let's go with clean_flicker_noise then.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

An update: I will be making significant modifications to the full frame version of NSClean, hopefully today.  I think it will be an improvement for all use cases, though it will probably be somewhat slower than it is now (by a factor of a few, but still faster than it was a month ago).  I would like someone on the NIRSpec team to review these proposed changes when they are ready, since they will significantly modify the current behavior.  I would also welcome comments now.  Provisionally, I intend to make these changes in my current subarray speedup branch:

https://github.com/t-brandt/jwst/tree/nsclean_subarray_speedup

An outline of my proposed change: the current version of nsclean treats a row along the detector slow axis as a time series.  That is not correct: only nsclean_subarray has the correct timing and Fourier modes.  We can run NSClean channel-by-channel on a full array, and that would be correct.  However, the channels have very similar noise patterns (with the pattern reversed in alternating channels), and areas where there are few available pixels to fit will be problematic.  We can compromise by using the other channels to help fit the noise pattern in a given channel, with either equal or unequal weights for the other channels.  If they all have equal weights, then we are fitting one pattern for the entire array: the same for each channel.  I think this will largely fix the problem with poor fits in regions where most pixels are masked.  It should also do a significantly better job of capturing 1/f noise, as the S/N will be higher and the noise model should be much closer to correct.
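A minimal sketch of the equal-weights idea (not the NSClean implementation), assuming a 2048x2048 full frame with four 512-column outputs read in alternating directions and NaNs marking masked pixels:

```python
import numpy as np

nchan = 4
ny, nx = image.shape
chan_width = nx // nchan

# Flip alternating channels so their read order matches, then treat each
# channel as a time series (pixel order approximates read order).
series = []
for c in range(nchan):
    chan = image[:, c * chan_width:(c + 1) * chan_width]
    if c % 2 == 1:
        chan = chan[:, ::-1]
    series.append(chan.ravel())

# One shared noise estimate per read sample, fit to all four channels at once.
shared = np.nanmedian(np.array(series), axis=0)

# Map the shared pattern back onto each channel, undoing the flips.
pattern = np.empty_like(image)
for c in range(nchan):
    chan_pattern = shared.reshape(ny, chan_width)
    if c % 2 == 1:
        chan_pattern = chan_pattern[:, ::-1]
    pattern[:, c * chan_width:(c + 1) * chan_width] = chan_pattern
```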

This will require a number of modifications to nsclean.  Most of them will look minor.  But as it involves a change in behavior, I would like to mention it here now.  

 

stscijgbot-jp commented 1 month ago

Comment by Christian Hayes on JIRA:

Timothy Brandt we're happy to take a look at the NSClean changes possibly later next week or early the following week.  Let me know when they are ready to start looking at.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Timothy Brandt - thanks for your notebook, it very clearly explains your background modeling suggestion.

It turns out, though, that the issue is just that I was lazy in my understanding and description of the implemented method. I copied the implementation from image1overf, via David's notebook, without digging into it.

The model fit implemented is the photutils Background2D method, which already does almost exactly what you suggest.  Stats are computed in large boxes over the image (currently 34x34) to make a low-resolution model, which is then interpolated back to full resolution.  There is a 5x5 median filter applied, but it is applied to the low-resolution image, to smooth it out even more.  Testing on your synthetic data, the background from this method (without the 5x5 filter) is very similar to your suggestion.

I can make the initial box size a configurable parameter, defaulting to 32x32, but I think the rest of the implementation for background modeling should be okay as is - let me know if you disagree.
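For reference, a minimal sketch of that photutils call, with the box size shown as the proposed configurable parameter (values here are illustrative):

```python
from astropy.stats import SigmaClip
from photutils.background import Background2D, MedianBackground

bkg = Background2D(
    image,                          # 2D image to model (e.g. a group diff)
    box_size=(32, 32),              # proposed configurable box size
    filter_size=(5, 5),             # median filter applied to the low-res mesh
    mask=~background_mask,          # True = pixel excluded (source or bad)
    sigma_clip=SigmaClip(sigma=3.0),
    bkg_estimator=MedianBackground(),
)
background_image = bkg.background   # full-resolution 2D background model
```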

I'll start work now on implementing the additional option to project the background values from a fit to the rate image, instead of computing them directly from the group diffs.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

An update, a suggestion, and a comment for now.  

First the update: I will push my NSClean changes later today.  They seem to work well: I use all channels together to fit the higher-frequency noise components, and then use each individually to fit only lower-frequency noise.  I think this will help a lot with overfitting near sources.  I have also implemented an expansion of the mask within my routine, though this could be moved outside it too.

The suggestion: for the background, I suggest flat fielding the rate image, computing the background as Melanie Clarke described, undoing the flat field, and then subtracting this from the rate images.  I think this should mostly disentangle pixel response and illumination variations from 1/f noise.

The comment: I noticed in some IRS^2 data from NRS2 that there is a little bit of what looks like a picture frame around the outside of the detector where 1/f noise is a little less.  See the attached screengrab from the NSClean JDAT notebook (Screenshot 2024-07-29 at 9.14.42 AM.png).  The color scale and aliasing exaggerate the issue; it does not look nearly this bad if you inspect the image in ds9 (though the effect remains).  This is somewhat problematic for the 1/f correction, because the noise is different in different places.  I don't know if this has been noticed before.  It seems to be related to the reset value of a pixel: each pixel appears to couple slightly differently to the 1/f noise.

 

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Timothy Brandt Yes, we've noticed the picture frame effect from thermal instability on some of the very deep NIRSpec MOS data that we've taken for the GARDEN program.  Michael Regan and Eddie Bergeron have been working on a correction that we'll look into incorporating into the pipeline if possible once the details are hammered out.

I'd potentially be concerned about incorporating flatfielding into the background estimation though.  For NIRCam that might work ok, but for NIRSpec IFU at least the flatfield is NaN outside of the slice footprints, so we'd lose all of the regions of the detector that we most wanted to use for estimating 1/f.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

David Law Yikes–I didn't realize that flatfields would be littered with NaNs.  Best not to do that then.  

I have redone the full frame NSClean algorithm here:

https://github.com/t-brandt/jwst/tree/nsclean_subarray_speedup

It isn't really ready for a PR yet; there are some things that really should go into the mask calculation that I should refactor.  This is a legacy from the current implementation, which also mixes the mask creation between various places.  I would be curious for people to try it, though, and tell me if it seems to work.  I can write up a more detailed document outlining the changes and motivations if desired.  Briefly, only the very lowest 1/f frequencies can be fit channel-by-channel without overfitting, while combining the four output channels gives you access to a wider range of frequencies.  Proper use of the reference pixels from something like SIRS reduces read noise by about 30% over not using them at all, while a 1/f correction, assuming ~half of the array is available for the correction, can get you another 20% or so, with a much larger reduction in the visually obvious low-frequency noise.  Computational cost should be a few seconds per full frame image, I am guessing 3s-10s depending on the computer; this would apply to each group.

Melanie Clarke, I am very curious if the proposed full frame correction (via do_correction, with the default parameters) solves the various issues you were seeing with the 1/f correction: does it perform visibly better than the median approach?

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Thanks Timothy Brandt - I'll check it out!

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

David Law Timothy Brandt -

I added an option to compute the background from the rate image instead of the group diffs.  Set background_from_rate = True to use; it uses the group diffs by default. 

If set, the preliminary background subtracted from each diff is rate background * input_model.meta.exposure.group_time. I think this is the right integration time to use for the group diffs, but please let me know if I've got it wrong, or if there are known exceptions.
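In code form, the scaling amounts to the following (rate_background here is a placeholder for the 2D background fit to the rate image):

```python
# Convert the rate-image background (DN/s) to a per-diff background (DN)
# using the exposure's group time.
background_for_diff = rate_background * input_model.meta.exposure.group_time
```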

If single_mask is False, a rateints is created instead of a rate file, and the rate background is computed separately for each integration.  Otherwise, the same rate background is used for all integrations.

In testing, results look pretty close to the ones from directly computing background from diffs for the cases I checked.

To make testing easier, I also added some options to save the modeled background (save_background = True), as subtracted from the diffs prior to noise fitting, and/or the final fit noise removed from each group (save_noise = True).

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Timothy Brandt - I hacked your NSClean changes in on top of mine and tested the same NIRCam data set – image attached.  The overfitting artifacts look much better, compared to the older version!  

The differences between the median and fft cleaning methods are now pretty subtle to my eye, at least for this image. There are numerical differences, but visually, they look very similar.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Attaching one more example, for the non-ideal case - an image with significant diffuse emission and a very large bright region.  Here, your new FFT version does noticeably worse than both the median and the older FFT version at channel boundaries, but much better than the old version at the over-masked bright region.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

Thanks, Melanie Clarke!  For the non-ideal example where there are artifacts at the amplifier boundaries, do these effects remain if you set subchannelmean to False?  This was one thing I wasn't feeling great about; it helped marginally in some cases but did feel a little risky.  It looks to me like the background model somehow wasn't getting the background quite right.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

It's a little better with subchannelmean = False, but the channel boundaries are still clearly visible.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

Ok, thanks Melanie Clarke.  Is it possible to show uncleaned minus background only for this background fit?  The background fit is very important in this case, so I am curious how well it is doing.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Attaching the cleaned image with subchannelmean=False, starting from the rate file for simplicity – results are very similar starting from the jump file. 

Top row:

Bottom row:

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

Thanks, Melanie Clarke, very helpful.  This looks like a case where the background model isn't good enough and the count rate is too high to do a decent job with the intermediate frequencies.  I might be able to finagle something, but I suspect that it's probably best to accept that the background-subtracted image has too many biases and not enough S/N to apply the FFT algorithm rather than to try to finesse that algorithm any more.  As an aside, I wonder what systematics the various approaches would show if the image were rotated by 90 degrees with respect to the channels.  

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

Here is a glass-is-half-full way of looking at Melanie Clarke's recent results.  The FFT algorithm removes more 1/f noise than the median-based one.  The price of this is that the pixels that form the basis of the correction need to represent mostly read noise; there should be almost no background signal remaining (it will be added back in later).  If there is significant background structure in the image that forms the basis for the 1/f correction, the FFT algorithm will introduce artifacts.  These artifacts will be lower in the median-based algorithm, since it does not attempt to extract as much from the data.  

So, if the background is being modeled well and you are able to get a decent 1/f correction, the FFT and median algorithms should give very similar results.  This is an important check.  A user should almost always run both 1/f corrections, and if the results are difficult to distinguish by eye, the correction is probably improving the scientific quality of the data without adding too many systematics.  In this case, the FFT-based result is likely better, as it will remove more noise.  If the results with median and FFT are visibly different, it means that the science scene is making it difficult to disentangle 1/f noise.  In this case, (a) any 1/f correction beyond the use of the four rows of reference pixels along each edge should be used with extreme caution, and (b) if the user still wants to try to suppress 1/f, the median-based correction is to be preferred.  Further progress will likely require a combination of better modeling of the scene (to create a higher-fidelity background) and more complex analyses, like my idea to use the fact that the scene is convolved with the instrumental PSF while 1/f noise is not.
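For example, a user could run the step twice and difference the results (illustrative only; the import path and file name are placeholders, and fit_method follows the parameter name used in this thread):

```python
from jwst.clean_flicker_noise import CleanFlickerNoiseStep

fft_result = CleanFlickerNoiseStep.call("jw_example_jump.fits", fit_method="fft")
med_result = CleanFlickerNoiseStep.call("jw_example_jump.fits", fit_method="median")

# Strong scene-correlated structure in this difference suggests neither
# correction should be trusted beyond the reference-pixel step.
difference = fft_result.data - med_result.data
```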

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

That's one way of looking at it!  The glass-half-empty way is that the FFT algorithm is at significant risk of overfitting the noise in the data, and should be used with caution, if at all.  If the median and FFT corrections are indistinguishable, then it's probably still safer to use the median correction.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

I can also provide a tunable parameter for the aggressiveness of the FFT-based 1/f correction.  Maybe I should do that, with aggressiveness=100 being close to the current version and aggressiveness=0 doing something similar to the median approach.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Back to MIRI imaging, David Law...

It looks like there are NON_SCIENCE flags that get added at the flat correction stage for MIRI, which block out the irrelevant light on the detector pretty well. With that data masked, the median fit does a decent job on the vertical striping with otherwise default parameters. 

I pushed up some changes for you to try: if mask_science_regions = True, it will do a draft flat correction for MIRI imaging only, retrieve just the DQ plane, and use NON_SCIENCE flags from it to add to the scene mask.  (For NIRSpec users following along, the mask_science_regions parameter now replaces the mask_spectral_regions parameter in the clean_flicker_noise step, but not in the nsclean step.)
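A small sketch of how the flat DQ flags fold into the scene mask (illustrative only; the dqflags import path may differ between jwst versions, and flat_dq / background_mask are placeholders for the flat DQ plane and the scene mask):

```python
from jwst.datamodels import dqflags

# Pixels flagged NON_SCIENCE in the flat DQ (metering structure, coronagraph
# regions, etc.) are excluded from the background and striping fits.
non_science = dqflags.pixel["NON_SCIENCE"]
blocked = (flat_dq & non_science) != 0
background_mask &= ~blocked
```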

Without borrowing the flat DQ, you can either make your own mask outside the pipeline, or play with the background modeling parameters to try to remove the non-science scene structure.  Taking the background box size way down (8x8 instead of 32x32) also does a reasonable job of removing most of the vertical striping.

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Tests that I've now tried:

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

Ok, I made an "aggressiveness" parameter for NSClean as an argument to "do_correction."  It can be anywhere from a minimum of zero (in which case it will function similarly to the median-based correction) to a maximum of 11, in which case it will do something close to the optimal 1/f correction if we have a good background model and the background is read noise limited.  I think this one knob should be sufficient for most users.  If the results show strong, clearly visible differences when tuning this parameter, it almost certainly means that the background model isn't great.  

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Thanks so much for testing, David Law!

For background = None: I kept this for backward compatibility with the original nsclean step, which expects that the fit data in the rate file should be corrected to zero.  I can make it an allowed option in the nsclean step in the spec2 pipeline and not allow it in the clean_flicker_noise step in detector1.

For mask_science_regions: for NIRSpec MOS, using this option requires that an MSA metadata file is available, so a WCS can be assigned to the rate file.  It wasn't clear to me if this would normally be available with the input data in detector1 - we may need to be cautious about turning this option on by default, if we ever plan to run this step in ops.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

For jw02731001001_02101_00003_nrcblong - I see what you see as well, the background_from_rate=True version looks a little stripier over the bright regions than the background_from_rate=False version.  Looking at the saved backgrounds, average computed values look comparable, so it doesn't look like a scaling issue from the readout timing.

I'm attaching an image of the background-subtracted data for this image, for the first diff - it looks like the background correction from the rate image is just poorer than the background correction from the diff, so more signal is flagged out of the data for the noise fit, and the median value has less data to work with.

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Gotcha; certainly if it's just estimating the 1/f from a few pixels that wouldn't be very reliable.  Inclined to remove the option, but let's see how the testing goes for a little while.

Re background=None, maybe this option only makes sense for NIRSpec where the computation might be done using only regions of the detector that should be zero.  Unclear if even that's really necessary though- again, let's see how testing goes before deciding.

Whether or not we eventually run this step by default in ops for MOS is unclear, but good to keep in mind the MSA metadata issue.

I've also attached flicker_results.png from my email yesterday for posterity.  As per email, the MIRI imaging correction is really nice.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Sounds good – I'll leave all the options and defaults as is until the testing dust settles.

stscijgbot-jp commented 1 month ago

Comment by Christian Hayes on JIRA:

Kayli Glidic and I were taking a look at the new step in the PR and saw that for subarray data there appears to be some ringing (see attached: jw02053003001_04102_00001_nrs2_method_fft.png) that gets added to the data at either end of the subarray when using fit_method="fft".  I'm seeing this when the clean_flicker_noise step is run on both ramps and rates, so it seems like it might be coming from a change to the fft method itself?  Melanie Clarke Timothy Brandt, I haven't quite followed which changes made it into the fft method, but is this a result of the new subarray speed up changes?

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Christian Hayes - thanks for testing!  I saw the same ringing effect for NSClean with subarrays. 

The difference from the previous nsclean implementation comes from a small change in how masking works in the new version.  The old one forced the outer edges of the subarray to be included in the fit, even if they contained no real data.  The new one masks those pixels, since they are DO_NOT_USE in the input.  It seems like the current subarray fft clean method requires those pixels to anchor the correction at the edges.

Timothy Brandt is working on fixing the NSClean implementation for subarrays in JP-3654.  That branch is not quite ready for merging, so this PR, for clean_flicker_noise, does not include his updates.  But in testing, the new subarray method fixes the ringing issues for subarrays without further mask manipulation or edge padding.

I'm assuming the JP-3654 work for subarrays will get merged before this one does, so I haven't worked out an interim solution for the subarray edge handling.  Let me know if you'd like me to do that now - I can at least add back in the pixels at the edges to replicate the old behavior.