spacetelescope / jwst

Python library for science observations from the James Webb Space Telescope
https://jwst-pipeline.readthedocs.io/en/latest/

Implementation of 1/f correction #8517

Closed stscijgbot-jp closed 3 hours ago

stscijgbot-jp commented 3 months ago

Issue JP-3639 was created on JIRA by David Law:

Various other tickets exist describing the need for 1/f correction, e.g., https://jira.stsci.edu/browse/JP-2576 discussing the rationale for doing so.  Likewise, https://outerspace.stsci.edu/pages/viewpage.action?spaceKey=JWSTCC&title=JWST+Pipeline+1-over-f+Noise+Removal contains notes from various meetings discussing potential algorithms.

This ticket focuses on establishing the detailed method of implementation in the pipeline, and is kept separate to more clearly distinguish discussion of such implementation details from discussion of the core algorithms wrapped by the step.

The new '1/f correction' step in the pipeline should be immediately prior to the current ramp_fitting step in calwebb_detector1.  The reason for this is that it is best to correct 1/f noise at the group stage in order to reliably correct the 1/f signal both in blank regions and underneath sources.

This 1/f correction would be performed on the source-free science pixels in the detector using some TBD algorithm.  Regardless of the details of this algorithm, it will be necessary to mask out sources in the scene, measure the 1/f signal, and apply the correction to the group-stage data prior to ramp fitting.  This step should be capable of using such a mask if one is passed in to the step, and should otherwise attempt to compute such a mask on the fly.  In order to do so, the 1/f step would need to call ramp_fitting to produce a 'preliminary' rate image with which to do the masking.  As such, ramp fitting could effectively be done twice: once within the 1/f step to generate the on-the-fly mask, and then again in the regular ramp_fitting stage using the 1/f-corrected ramp.  If doing so, the ramp fitting call within the 1/f step should probably use the C-based implementation (or a comparably fast one) by default.
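A minimal sketch of that flow, using hypothetical helper names (fast_ramp_fit, make_scene_mask, and subtract_one_over_f are placeholders, not pipeline functions):

```python
# Illustrative only: build an on-the-fly mask from a preliminary ramp fit when
# no mask is supplied, correct the groups, and leave the final ramp fit to the
# regular ramp_fitting step.
def one_over_f_correction(ramp_model, scene_mask=None):
    if scene_mask is None:
        preliminary_rate = fast_ramp_fit(ramp_model)    # e.g. the C-based fitter
        scene_mask = make_scene_mask(preliminary_rate)  # hypothetical helper
    return subtract_one_over_f(ramp_model, scene_mask)  # TBD algorithm
```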

Note that I've identified all four instruments in this ticket.  For NIRCam and NIRISS, this is because the work is most relevant to them.  NIRSpec already has a solution in NSClean, but the proposal here is to move that step into the infrastructure proposed here to have a single location for 1/f correction in the pipeline.  MIRI does not experience 1/f noise, but nonetheless has vertical striping with some similarities to 1/f noise (resulting from variations in the detector dark) that may be possible to correct in a similar manner and place in the pipeline.

Additional details:

stscijgbot-jp commented 2 months ago

Comment by David Law on JIRA:

I've attached a few things to this ticket:

stscijgbot-jp commented 2 months ago

Comment by Tyler Pauly on JIRA:

The mask in the powerpoint appears to be excluding some 1/f striping, though I checked out your notebook and found the crowded field mask looks fairly clean/robust. Should we expect a fine-tuning of the cutoff sigma by exposure/exposure type/instrument, or should a more robust cutoff/selection algorithm be considered? Or is this behavior a best reasonable effort for now?

stscijgbot-jp commented 2 months ago

Comment by David Law on JIRA:

I'd just expose the various sigma cuts as parameters for now (one in the initial mask creation, and two in the Willott 1/f routine itself) so that we can try tweaking them.  I think we should move ahead with this algorithm as 'good enough' to start with; any more robust algorithms can be considered in the future.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

I have a first working draft of the new step here: #8669

The new step is called clean_noise, but I can change that if some other name is preferred.

For the framework and integration, I moved the nsclean calling structure and library over to the clean_noise module and aliased the existing nsclean step (called in spec2) to the clean_noise implementation instead.  As implemented, the step can be run on either ramp or rate data.  The nsclean and clean_noise steps currently have different defaults: I have left the nsclean defaults equivalent to the current values, and I set the clean_noise defaults to more generally useful/safe values across all instruments.

The basic process is now as follows (a condensed sketch follows the list):

- If the input is ramp data, make a draft rate (single_mask=True) or rateints (single_mask=False) file.
- Create a scene mask from the rate data:
  - If mask_spectral_regions is set and the input is NIRSpec data, run assign_wcs and msaflagopen on the rate data if needed, then mask any known science areas or failed-open MSA shutters.
  - Iteratively sigma clip the data to get a center value (mean or median) and sigma.
  - If fit_histogram is set, compute a histogram from 4-sigma clipped values and fit a Gaussian to it to refine the center and sigma values.
  - Mask data less than 3 * sigma from the center as bad values.
  - Mask data more than n_sigma * sigma from the center as signal (not background).
- Iterate over each integration and group in the data:
  - For ramp data, make a diff image (current group minus previous group) to correct.  For rate data, the image is corrected directly.
  - Fit and remove a background level, using the scene mask to identify background pixels:
    - If background_method = None, no background is removed (default for nsclean).
    - If background_method = 'median', the background data are further clipped, and the background value is a simple median of the remaining values.
    - If background_method = 'model', the background data are also further clipped, and the background value is a 2D image fit to the remaining values, smoothed with a 5x5 median filter kernel.
    - For either the 'median' or 'model' method, the background-subtracted data are clipped again to n_sigma * sigma, with sigma recomputed from the remaining background pixels.
  - Fit and remove the 1/f noise in the background-subtracted image:
    - If fit_method = 'fft' and the input is NIRSpec, the nsclean library is called to fit and remove the noise in frequency space.
    - If fit_method = 'median', the noise is fit with a simple median along the detector slow axis.
  - Restore the background level to the cleaned, background-subtracted image.
  - For ramp data, add the cleaned diff back to a cleaned version of the previous group.
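A condensed, illustrative sketch of this process (not the pipeline code), showing only the simple 'median' background and noise options; data is assumed to be group data with shape (nints, ngroups, ny, nx), rate is the draft rate image, and n_sigma = 5.0 is an illustrative value:

```python
import numpy as np
from astropy.stats import sigma_clipped_stats

# Scene mask from the rate image: clipped center/sigma, then flag low outliers
# as bad and high outliers as signal.  True = usable background pixel.
_, center, sigma = sigma_clipped_stats(rate, sigma=3.0)
n_sigma = 5.0
background_mask = np.isfinite(rate)
background_mask &= rate > center - 3.0 * sigma
background_mask &= rate < center + n_sigma * sigma

# Clean each group difference, then rebuild the cleaned ramp.
cleaned = data.astype(float)
for i in range(data.shape[0]):                 # integrations
    for j in range(1, data.shape[1]):          # groups; first group left as-is here
        diff = data[i, j].astype(float) - data[i, j - 1]
        background = np.median(diff[background_mask])       # 'median' background
        sub = diff - background
        sub_masked = np.where(background_mask, sub, np.nan)
        # 'median' noise fit: one value per row here; the axis to use depends
        # on the detector's slow/fast axis orientation.
        noise = np.nanmedian(sub_masked, axis=1, keepdims=True)
        noise = np.where(np.isfinite(noise), noise, 0.0)
        cleaned[i, j] = cleaned[i, j - 1] + (sub - noise) + background
```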

I've tested this implementation on the examples in David's notebook, with similar results, and on a few other examples from the 1/f outerspace page.  I set the defaults for all parameters to values that look reasonable to me across most of these examples, although large diffuse fields will likely need background_method = model. 

I'll start work now on developing some unit and regression tests for the new step, but in the meantime it would be helpful to have feedback on the basic process, the parameters I've chosen to expose, their default values, and whether there are any more options you'd like me to implement.

I'd also like to know if we want to add a clean_noise step to spec2 and/or image2 for non-NIRSpec instruments.  In testing, cleaning the rate image directly looks to my eye like it is often as effective as cleaning the ramp data, and it might be nice for users to have an option that doesn't require re-running detector1.

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Great, thanks Melanie Clarke !  I'll take a look.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Thanks David Law!  Do you have any MIRI data that shows its version of vertical striping?  I have not yet checked to see if the new step works for MIRI.

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Melanie Clarke Interesting question; if we can address MIRI as well that'll be a win, but the MIRI situation is quite different.  In this case the vertical stripes aren't from 1/f noise, but from temporal variation in the dark current.

I just tested the step quickly on MIRI imaging data jw01536016001_02101_00001_mirimage and jw01536016001_02103_00001_mirimage, and there are a few things we'd need to do differently.  First, the routine would need to be applied to the other axis; i.e., slowaxis=2, but the routine should be passed slowaxis=1 to get it right (an easy hack).  More problematically, the MIRI imager has a metering structure covering a large part of the detector area, along with coronagraphs that let a different amount of light through, and a knife-edge structure that sticks out into the main imaging area.  We probably wouldn't want to apply any destriping in the coronagraph region at all, and in the main imaging region we'd need to mask out the knife edge so that it doesn't bias the statistics.  At the moment that isn't masked in any obvious way; potentially the DQ mask could be changed, although there's no obvious bit name that's applicable.  It may also be possible to just hard-code applying the algorithm to columns 360 and above and hard-code three rectangular regions to ignore.  Clunky, but possibly effective.
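For illustration, a hard-coded mask along those lines might look like the sketch below; the column cut matches the value quoted above, but the three rectangles are placeholders, not measured regions:

```python
import numpy as np

# MIRI imager detector is 1024 rows x 1032 columns.
ny, nx = 1024, 1032
mask = np.zeros((ny, nx), dtype=bool)   # True = usable for destriping
mask[:, 360:] = True                    # restrict to the main imaging area

# Placeholder rectangles (ymin, ymax, xmin, xmax) to exclude, e.g. around the
# knife-edge structure; the real regions would need to be measured.
for ymin, ymax, xmin, xmax in [(0, 280, 360, 420), (700, 1024, 360, 460), (400, 460, 360, 390)]:
    mask[ymin:ymax, xmin:xmax] = False
```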

For the MIRI MRS detectors though I don't think a routine like this will work- the striping runs along the same axis as the dispersion, meaning all of the IFU slices will look like high-valued columns and will have weird effects around the curvature of the trace.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Interesting, thanks.  I'll take a look and see if there are easy ways we could include some of the MIRI modes. If not, I'll make sure they are explicitly excluded.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

Thanks, Melanie Clarke.  The basic approach looks generally good as you describe it; I will take a more detailed look at the code.  A few comments for now:

For the background, I would favor a significantly larger filter.  Median filters are expensive, so a convolution probably isn't a good idea.  Here is a nearly equivalent approach for, e.g., a 32x32 filter: apply a nanmedian or equivalent in adjacent but disjoint 32x32 blocks, and then interpolate within the results: pixel (32, 32) will be at the corner of four of these, so it will get equal weights of four medians, while pixels closer to (0, 0) than (16, 16) will have the value of the lower-left box median.  This should be computationally cheaper than the current implementation (since the cost of the median is of order n), and should avoid having the background model incorporate too much noise or 1/f signal.  The size of this region could be a tunable parameter.
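A rough sketch of that block-median-plus-interpolation background, assuming masked (source) pixels are set to NaN and the box size divides the array evenly (the function name is illustrative):

```python
import numpy as np

def block_median_background(image, box=32):
    """Median in disjoint box x box blocks, bilinearly interpolated back up."""
    ny, nx = image.shape
    coarse = np.nanmedian(image.reshape(ny // box, box, nx // box, box), axis=(1, 3))

    # Block centers are the sample points; np.interp clamps outside the outer
    # centers, so corner pixels keep the value of the nearest block median.
    yc = (np.arange(ny // box) + 0.5) * box
    xc = (np.arange(nx // box) + 0.5) * box

    interp_rows = np.empty((ny, nx // box))
    for j in range(nx // box):
        interp_rows[:, j] = np.interp(np.arange(ny), yc, coarse[:, j])
    background = np.empty((ny, nx))
    for i in range(ny):
        background[i] = np.interp(np.arange(nx), xc, interp_rows[i])
    return background
```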

I think it's worth having both NSClean and the median method available for all arrays.  Using NSClean should be very straightforward with the new subarray version: it simply takes pixel values and a mask and spits out corrected values.

For fitting at the group vs. ramp stage: doing this at the ramp stage will generally look good.  However, the weights of each group as they contribute to the final ramp will depend on the flux in a given pixel.  A fit to 1/f that is good for low-flux pixels is, almost by definition, not so good for intermediate-flux pixels (jumps also mess things up).  This introduces noise to the image that will be invisible if looking at the background pixels, which is an insidious kind of noise.  That is the reason to only support this at the group stage.

A question: is the model background computed on the rate images or on the group data?  I think it should be much better to do this on the provisional rate images, where you can get good S/N estimates of the background.  You may well be doing it on the rate images here, but I wasn't sure.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Thanks for reviewing, Timothy Brandt

For 1 - I'm not sure I understand your suggestion.  Can you mock up some pseudocode or point me to an implementation you like?

For 2 – you mean offer NSClean-style FFT noise fitting for the other instruments? I can look into what would be needed for that.

For 4 – the scene mask is computed on the rate image.  The model background is fit to the group diffs, to remove any overall background level in the diff before fitting the residual noise.  It's not clear to me how a background computed from the rate image could be applied to correct a group diff?

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Allowing NSClean for NIRCam is indeed pretty easy with your changes!  I just made a small modification not to flip the array for detector orientation and sent the diff image to the subarray implementation.

I've attached a quick image of what that looks like with example file jw01345001001_10201_00001_nrca3, which is the best case scenario of a sparse field and plenty of background.  With default frequency tuning parameters, NSClean leaves the same kind of artifacts around bright sources that we see when applying NSClean to NIRSpec MOS data: high frequency noise over the heavily masked region, where there isn't enough background data to get a good fit.  It also adds dark halos around the bright sources, presumably because the source flux is also not masked enough.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

Melanie Clarke I am attaching a notebook (see the updated version) with a demonstration of my suggestion for fitting the background.  I think we should fit the background from the rate image and then just scale to the group difference by the time difference between the groups, unless the background really does fluctuate a lot on the group-to-group time scale.  I also put in a suggestion on expanding the mask, which I think would help a lot with the under-flagging of the scene in the vicinity of bright objects.  Let me know if you would like to discuss any of this.
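For the mask expansion, a minimal sketch (the iteration count is a placeholder, and source_mask is assumed to be True where pixels are flagged as signal):

```python
from scipy.ndimage import binary_dilation

# Grow the source mask by a few pixels so the faint wings of bright objects
# are also excluded from the background and 1/f fits.
expanded_source_mask = binary_dilation(source_mask, iterations=3)
background_mask = ~expanded_source_mask
```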

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

One more thing while I think of it.  I couldn't tell at a glance from the code, but when you feed NIRCam data into NSClean, are you using it in full array mode or subarray mode?  I am looking at the routine now and need to ask Bernie how it works with the four reference channels.  It seems to me like the code assumes a single reference output, which can't be right.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

Update from Bernie: the current version of NSClean for full frame arrays is a bit of a hastily-written compromise and can probably be improved; the handling of the Fourier vectors isn't really that well motivated as implemented.  I'll play around with that over the coming days.  The implementation for subarrays is more theoretically motivated, so I will look primarily at the full-frame version.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Excellent, thanks for the notebook.  I'll take a look.

In general, NSClean has an even harder time with over-masked regions than under-masked ones, so expanding the mask for NIRCam images would likely help with the dark halo issue, but would make the high-frequency scratches over the masked region worse.  I did check what it looked like if I reduced the n_sigma, to flag more signal, and saw exactly that: halos are better, scratches are worse.

For the NSCleaned NIRCam data, I fed it to the subarray implementation, temporarily borrowed from the version on your updated branch.  

It would be nice to unify the full frame and subarray handling if we can!

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

Ok, thanks.  NSClean in subarray mode should really operate channel-by-channel, with alternating channels fed in in reverse order.  Then there is the issue with Bernie for full frame images.  I will keep myself busy trying to iron these things out.  

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Looking good to me- out of the box it seems to be working pretty well on many different input data sets, and those for which it has trouble can be improved using the optional parameters.  Individual instrument teams will likely end up with preferred settings, which can be provided via param ref files.  A few thoughts:

I'd be a little leery of estimating the background for a given group from the rate file, given the number of different read patterns with group averages, dropped frames, etc.  Estimating a background from the group differences will be noisier, but avoids the risk of systematic issues that might only show up for certain instruments/modes/read patterns.  I know exact timing of all of these patterns has been a continuing source of issues and discussion.  Thoughts, Timothy Brandt?

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Thanks for testing, David. A couple quick responses...

For the log issues - I pushed an update yesterday that should address that.  I hid the parameter logging and added messages about calling the other steps.  Let me know if you want further revisions there.

For the name, is the vertical MIRI striping also appropriately called 1/f noise?  I haven't yet had a chance to look into supporting it, but if we want to add that here, we should make sure the name covers it.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

David Law I'm a bit confused: if the background is not steady, then the basic assumption inherent in constructing a rate file is broken, and using the optimal up-the-ramp weights is a bad idea.  If the background is steady, then we can just scale the background from the rate file by the difference between the mean group times; this is essentially what the ramp fitting step does.  

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Melanie Clarke I've been going back and forth on your second point in my own mind as well.  The MIRI striping looks like 1/f noise, but is a physically different phenomenon.  After bouncing it off Michael Regan, another possibility would be to call it 'clean_flicker'.  Generally flicker noise and 1/f noise are the same thing, but that way we avoid the specific term and the history that goes with it.

If we do correct the MIRI data, we'd want to apply it only when EXP_TYPE='MIR_IMAGE'.  In subarray imaging mode it may work fairly easily, but for SUBARRAY=FULL we'd have to introduce some kind of logic to deal with the metering structure.

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Timothy Brandt I'm also assuming that the background is steady; my concern is just that we could end up with a multiplicative offset in how we're scaling it, because of differences in how the various readout patterns are defined.  I know we've had some grief defining all of the related ways of measuring time (see, e.g., https://jwst-docs.stsci.edu/accessing-jwst-data/jwst-science-data-overview/jwst-time-definitions), so I'm a little wary of accidentally introducing some systematic that would only become obvious in some cases.  So long as we make sure to test that it's working as expected for a variety of different patterns, I think we'd be ok.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

I like clean_flicker_noise, for what it's worth.

I can implement both background correction schemes and make it a user option, so we can test the impact on the data.

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Let's go with clean_flicker_noise then.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

An update: I will be making significant modifications to the full frame version of NSClean, hopefully today.  I think it will be an improvement for all use cases, though it will probably be somewhat slower than it is now (by a factor of a few, but still faster than it was a month ago).  I would like someone on the NIRSpec team to review these proposed changes when they are ready, since they will significantly modify the current behavior.  I would also welcome comments now.  Provisionally, I intend to make these changes in my current subarray speedup branch:

https://github.com/t-brandt/jwst/tree/nsclean_subarray_speedup

An outline of my proposed change: the current version of nsclean treats a row along the detector slow axis as a time series.  That is not correct: only nsclean_subarray has the correct timing and Fourier modes.  We can run NSClean channel-by-channel on a full array, and that would be correct.  However, the channels have very similar noise patterns (with the pattern reversed in alternating channels), and areas where there are few available pixels to fit will be problematic.  We can compromise by using the other channels to help fit the noise pattern in a given channel, with either equal or unequal weights for the other channels.  If they all have equal weights, then we are fitting one pattern for the entire array: the same for each channel.  I think this will largely fix the problem with poor fits in regions where most pixels are masked.  It should also do a significantly better job of capturing 1/f noise, as the S/N will be higher and the noise model should be much closer to correct.
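A minimal sketch of the equal-weights idea (not the NSClean implementation), assuming a 2048x2048 full frame with four 512-column outputs read in alternating directions and NaNs marking masked pixels:

```python
import numpy as np

nchan = 4
ny, nx = image.shape
chan_width = nx // nchan

# Flip alternating channels so their read order matches, then treat each
# channel as a time series (pixel order approximates read order).
series = []
for c in range(nchan):
    chan = image[:, c * chan_width:(c + 1) * chan_width]
    if c % 2 == 1:
        chan = chan[:, ::-1]
    series.append(chan.ravel())

# One shared noise estimate per read sample, fit to all four channels at once.
shared = np.nanmedian(np.array(series), axis=0)

# Map the shared pattern back onto each channel, undoing the flips.
pattern = np.empty_like(image)
for c in range(nchan):
    chan_pattern = shared.reshape(ny, chan_width)
    if c % 2 == 1:
        chan_pattern = chan_pattern[:, ::-1]
    pattern[:, c * chan_width:(c + 1) * chan_width] = chan_pattern
```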

This will require a number of modifications to nsclean.  Most of them will look minor.  But as it involves a change in behavior, I would like to mention it here now.  

 

stscijgbot-jp commented 1 month ago

Comment by Christian Hayes on JIRA:

Timothy Brandt we're happy to take a look at the NSClean changes possibly later next week or early the following week.  Let me know when they are ready to start looking at.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Timothy Brandt - thanks for your notebook, it very clearly explains your background modeling suggestion.

It turns out, though, that the issue is just that I was lazy in my understanding and description of the implemented method. I copied the implementation from image1overf, via David's notebook, without digging into it.

The model fit implemented is the photutils Background2D method, which already does almost exactly what you suggest.  Stats are computed in large boxes over the image (currently 34x34) to make a low-resolution model, which is then interpolated back to full resolution.  There is a 5x5 median filter applied, but it is applied to the low-resolution image, to smooth it out even more.  Testing on your synthetic data, the background from this method (without the 5x5 filter) is very similar to your suggestion.

I can make the initial box size a configurable parameter, defaulting to 32x32, but I think the rest of the implementation for background modeling should be okay as is - let me know if you disagree.
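For reference, a minimal sketch of that photutils call, with the box size shown as the proposed configurable parameter (values here are illustrative):

```python
from astropy.stats import SigmaClip
from photutils.background import Background2D, MedianBackground

bkg = Background2D(
    image,                          # 2D image to model (e.g. a group diff)
    box_size=(32, 32),              # proposed configurable box size
    filter_size=(5, 5),             # median filter applied to the low-res mesh
    mask=~background_mask,          # True = pixel excluded (source or bad)
    sigma_clip=SigmaClip(sigma=3.0),
    bkg_estimator=MedianBackground(),
)
background_image = bkg.background   # full-resolution 2D background model
```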

I'll start work now on implementing the additional option to project the background values from a fit to the rate image, instead of computing them directly from the group diffs.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

An update, a suggestion, and a comment for now.  

First the update: I will push my NSClean changes later today.  They seem to work well: I use all channels together to fit the higher-frequency noise components, and then use each individually to fit only lower-frequency noise.  I think this will help a lot with overfitting near sources.  I have also implemented an expansion of the mask within my routine, though this could be moved outside it too.

The suggestion: for the background, I suggest flat fielding the rate image, computing the background as Melanie Clarke described, undoing the flat field, and then subtracting this from the rate images.  I think this should mostly disentangle pixel response and illumination variations from 1/f noise.

The comment: I noticed in some IRS^2 data from NRS2 that there is a little bit of what looks like a picture frame around the outside of the detector where 1/f noise is a little less.  See the attached screengrab from the NSClean JDAT notebook (Screenshot 2024-07-29 at 9.14.42 AM.png).  The color scale and aliasing exaggerate the issue; it does not look nearly this bad if you inspect the image in ds9 (though the effect remains).  This is somewhat problematic for the 1/f correction, because the noise is different in different places.  I don't know if this has been noticed before.  It seems to be related to the reset value of a pixel: each pixel appears to couple slightly differently to the 1/f noise.

 

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Timothy Brandt Yes, we've noticed the picture frame effect from thermal instability on some of the very deep NIRSpec MOS data that we've taken for the GARDEN program.  Michael Regan and Eddie Bergeron have been working on a correction that we'll look into incorporating into the pipeline if possible once the details are hammered out.

I'd potentially be concerned about incorporating flatfielding into the background estimation though.  For NIRCam that might work ok, but for NIRSpec IFU at least the flatfield is NaN outside of the slice footprints, so we'd lose all of the regions of the detector that we most wanted to use for estimating 1/f.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

David Law Yikes–I didn't realize that flatfields would be littered with NaNs.  Best not to do that then.  

I have redone the full frame NSClean algorithm here:

https://github.com/t-brandt/jwst/tree/nsclean_subarray_speedup

It isn't really ready for a PR yet; there are some things that really should go into the mask calculation that I should refactor.  This is a legacy from the current implementation, which also mixes the mask creation between various places.  I would be curious for people to try it, though, and tell me if it seems to work.  I can write up a more detailed document outlining the changes and motivations if desired.  Briefly, only the very lowest 1/f frequencies can be fit channel-by-channel without overfitting, while combining the four output channels gives you access to a wider range of frequencies.  Proper use of the reference pixels from something like SIRS reduces read noise by about 30% over not using them at all, while a 1/f correction, assuming ~half of the array is available for the correction, can get you another 20% or so, with a much larger reduction in the visually obvious low-frequency noise.  Computational cost should be a few seconds per full frame image, I am guessing 3s-10s depending on the computer; this would apply to each group.

Melanie Clarke, I am very curious if the proposed full frame correction (via do_correction, with the default parameters) solves the various issues you were seeing with the 1/f correction: does it perform visibly better than the median approach?

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Thanks Timothy Brandt - I'll check it out!

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

David Law Timothy Brandt -

I added an option to compute the background from the rate image instead of the group diffs.  Set background_from_rate = True to use; it uses the group diffs by default. 

If set, the preliminary background subtracted from each diff is rate background * input_model.meta.exposure.group_time. I think this is the right integration time to use for the group diffs, but please let me know if I've got it wrong, or if there are known exceptions.
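In code form, the scaling amounts to the following (rate_background here is a placeholder for the 2D background fit to the rate image):

```python
# Convert the rate-image background (DN/s) to a per-diff background (DN)
# using the exposure's group time.
background_for_diff = rate_background * input_model.meta.exposure.group_time
```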

If single_mask is False, a rateints is created instead of a rate file, and the rate background is computed separately for each integration.  Otherwise, the same rate background is used for all integrations.

In testing, results look pretty close to the ones from directly computing background from diffs for the cases I checked.

To make testing easier, I also added some options to save the modeled background (save_background = True), as subtracted from the diffs prior to noise fitting, and/or the final fit noise removed from each group (save_noise = True).

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Timothy Brandt - I hacked your NSClean changes in on top of mine and tested the same NIRCam data set – image attached.  The overfitting artifacts look much better, compared to the older version!  

The differences between the median and fft cleaning methods are now pretty subtle to my eye, at least for this image. There are numerical differences, but visually, they look very similar.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Attaching one more example, for the non-ideal case - an image with significant diffuse emission and a very large bright region.  Here, your new FFT version does noticeably worse than both the median and the older FFT version at channel boundaries, but much better than the old version at the over-masked bright region.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

Thanks, Melanie Clarke!  For the non-ideal example where there are artifacts at the amplifier boundaries, do these effects remain if you set subchannelmean to False?  This was one thing I wasn't feeling great about; it helped marginally in some cases but did feel a little risky.  It looks to me like the background model somehow wasn't getting the background quite right.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

It's a little better with subchannelmean = False, but the channel boundaries are still clearly visible.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

Ok, thanks Melanie Clarke.  Is it possible to show uncleaned minus background only for this background fit?  The background fit is very important in this case, so I am curious how well it is doing.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Attaching the cleaned image with subchannelmean=False, starting from the rate file for simplicity – results are very similar starting from the jump file. 

Top row:

Bottom row:

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

Thanks, Melanie Clarke, very helpful.  This looks like a case where the background model isn't good enough and the count rate is too high to do a decent job with the intermediate frequencies.  I might be able to finagle something, but I suspect that it's probably best to accept that the background-subtracted image has too many biases and not enough S/N to apply the FFT algorithm rather than to try to finesse that algorithm any more.  As an aside, I wonder what systematics the various approaches would show if the image were rotated by 90 degrees with respect to the channels.  

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

Here is a glass-is-half-full way of looking at Melanie Clarke's recent results.  The FFT algorithm removes more 1/f noise than the median-based one.  The price of this is that the pixels that form the basis of the correction need to represent mostly read noise; there should be almost no background signal remaining (it will be added back in later).  If there is significant background structure in the image that forms the basis for the 1/f correction, the FFT algorithm will introduce artifacts.  These artifacts will be lower in the median-based algorithm, since it does not attempt to extract as much from the data.  

So, if the background is being modeled well and you are able to get a decent 1/f correction, the FFT and median algorithms should give very similar results.  This is an important check.  A user should almost always run both 1/f corrections, and if the results are difficult to distinguish by eye, the correction is probably improving the scientific quality of the data without adding too many systematics.  In this case, the FFT-based result is likely better, as it will remove more noise.  If the results with median and FFT are visibly different, it means that the science scene is making it difficult to disentangle 1/f noise.  In this case, (a) any 1/f correction beyond the use of the four rows of reference pixels along each edge should be used with extreme caution, and (b) if the user still wants to try to suppress 1/f, the median-based correction is to be preferred.  Further progress will likely require a combination of better modeling of the scene (to create a higher-fidelity background) and more complex analyses, like my idea to use the fact that the scene is convolved with the instrumental PSF while 1/f noise is not.
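For example, a user could run the step twice and difference the results (illustrative only; the import path and file name are placeholders, and fit_method follows the parameter name used in this thread):

```python
from jwst.clean_flicker_noise import CleanFlickerNoiseStep

fft_result = CleanFlickerNoiseStep.call("jw_example_jump.fits", fit_method="fft")
med_result = CleanFlickerNoiseStep.call("jw_example_jump.fits", fit_method="median")

# Strong scene-correlated structure in this difference suggests neither
# correction should be trusted beyond the reference-pixel step.
difference = fft_result.data - med_result.data
```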

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

That's one way of looking at it!  The glass-half-empty way is that the FFT algorithm is at significant risk of overfitting the noise in the data, and should be used with caution, if at all.  If the median and FFT corrections are indistinguishable, then it's probably still safer to use the median correction.

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

I can also provide a tunable parameter for the aggressiveness of the FFT-based 1/f correction.  Maybe I should do that, with aggressiveness=100 being close to the current version and aggressiveness=0 doing something similar to the median approach.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Back to MIRI imaging, David Law...

It looks like there are NON_SCIENCE flags that get added at the flat correction stage for MIRI, which block out the irrelevant light on the detector pretty well. With that data masked, the median fit does a decent job on the vertical striping with otherwise default parameters. 

I pushed up some changes for you to try: if mask_science_regions = True, it will do a draft flat correction for MIRI imaging only, retrieve just the DQ plane, and use NON_SCIENCE flags from it to add to the scene mask.  (For NIRSpec users following along, the mask_science_regions parameter now replaces the mask_spectral_regions parameter in the clean_flicker_noise step, but not in the nsclean step.)
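A small sketch of how the flat DQ flags fold into the scene mask (illustrative only; the dqflags import path may differ between jwst versions, and flat_dq / background_mask are placeholders for the flat DQ plane and the scene mask):

```python
from jwst.datamodels import dqflags

# Pixels flagged NON_SCIENCE in the flat DQ (metering structure, coronagraph
# regions, etc.) are excluded from the background and striping fits.
non_science = dqflags.pixel["NON_SCIENCE"]
blocked = (flat_dq & non_science) != 0
background_mask &= ~blocked
```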

Without borrowing the flat DQ, you can either make your own mask outside the pipeline, or play with the background modeling parameters to try to remove the non-science scene structure.  Taking the background box size way down (8x8 instead of 32x32) also does a reasonable job of removing most of the vertical striping.

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Tests that I've now tried:

stscijgbot-jp commented 1 month ago

Comment by Timothy Brandt on JIRA:

Ok, I made an "aggressiveness" parameter for NSClean as an argument to "do_correction."  It can be anywhere from a minimum of zero (in which case it will function similarly to the median-based correction) to a maximum of 11, in which case it will do something close to the optimal 1/f correction if we have a good background model and the background is read noise limited.  I think this one knob should be sufficient for most users.  If the results show strong, clearly visible differences when tuning this parameter, it almost certainly means that the background model isn't great.  

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Thanks so much for testing, David Law!

For background = None: I kept this for backward compatibility with the original nsclean step, which expects that the fit data in the rate file should be corrected to zero.  I can make it an allowed option in the nsclean step in the spec2 pipeline and not allow it in the clean_flicker_noise step in detector1.

For mask_science_regions: for NIRSpec MOS, using this option requires that an MSA metadata file is available, so a WCS can be assigned to the rate file.  It wasn't clear to me if this would normally be available with the input data in detector1 - we may need to be cautious about turning this option on by default, if we ever plan to run this step in ops.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

For jw02731001001_02101_00003_nrcblong - I see what you see as well, the background_from_rate=True version looks a little stripier over the bright regions than the background_from_rate=False version.  Looking at the saved backgrounds, average computed values look comparable, so it doesn't look like a scaling issue from the readout timing.

I'm attaching an image of the background-subtracted data for this image, for the first diff - it looks like the background correction from the rate image is just poorer than the background correction from the diff, so more signal is flagged out of the data for the noise fit, and the median value has less data to work with.

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Gotcha; certainly if it's just estimating the 1/f from a few pixels that wouldn't be very reliable.  Inclined to remove the option, but let's see how the testing goes for a little while.

Re background=None, maybe this option only makes sense for NIRSpec where the computation might be done using only regions of the detector that should be zero.  Unclear if even that's really necessary though- again, let's see how testing goes before deciding.

Whether or not we eventually run this step by default in ops for MOS is unclear, but good to keep in mind the MSA metadata issue.

I've also attached flicker_results.png from my email yesterday for posterity.  As per email, the MIRI imaging correction is really nice.

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Sounds good – I'll leave all the options and defaults as is until the testing dust settles.

stscijgbot-jp commented 1 month ago

Comment by Christian Hayes on JIRA:

Kayli Glidic and I were taking a look at the new step in the PR and saw that for subarray data there appears to be some ringing (see attached: jw02053003001_04102_00001_nrs2_method_fft.png) that gets added to the data at either end of the subarray when using fit_method="fft".  I'm seeing this when the clean_flicker_noise step is run on both ramps and rates, so it seems like it might be coming from a change to the fft method itself?  Melanie Clarke Timothy Brandt, I haven't quite followed which changes made it into the fft method, but is this a result of the new subarray speed up changes?

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Christian Hayes - thanks for testing!  I saw the same ringing effect for NSClean with subarrays. 

The difference from the previous nsclean implementation comes from a small change in how masking works in the new version.  The old one forced the outer edges of the subarray to be included in the fit, even if they contained no real data.  The new one masks those pixels, since they are DO_NOT_USE in the input.  It seems like the current subarray fft clean method requires those pixels to anchor the correction at the edges.

Timothy Brandt is working on fixing the NSClean implementation for subarrays in JP-3654.  That branch is not quite ready for merging, so this PR, for clean_flicker_noise, does not include his updates.  But in testing, the new subarray method fixes the ringing issues for subarrays without further mask manipulation or edge padding.

I'm assuming the JP-3654 work for subarrays will get merged before this one does, so I haven't worked out an interim solution for the subarray edge handling.  Let me know if you'd like me to do that now - I can at least add back in the pixels at the edges to replicate the old behavior.