Closed stscijgbot-jp closed 1 month ago
Tagging Melanie Clarke , Bryan Hilbert , and Rachel Plesha to be aware and chime in if necessary.
Comment by Howard Bushouse on JIRA:
Instead of having the jump step add saturation flagging to group N-1 when NFRAMES>1, perhaps it would be cleaner and more obvious to have that done already in the saturation step?
Thanks Howard Bushouse ; I agree that makes more sense.
Adding a crude ds9 plot just to make the point for completeness; hacking this change into the pipeline as a post-hook indicates that it should clean up spectra substantially. spec.png left panel is the spectrum in a sky region in the old cube, right is the new cube with this saturation flag change. Previously ultra-bright artifacts messed up the ability to see the spectrum in default scaling, whereas the fixed version is substantially improved. Still some gains to be had, but this should hit the tallest poles.
Also tagging Jane Morrison to be aware of this.
Comment by Howard Bushouse on JIRA:
The description given above for the desired change suggests that any time NFRAMES>1 the group before the one that actually crosses the defined saturation threshold should also be flagged as saturated. Just wondering if there would be any advantage to making this dependent on the exact value of NFRAMES. For example, I can imagine that if NFRAMES is only a few (2-4), then a hit in one of the latter frames could have a much larger impact on the frame-averaged group value, than if NFRAMES were much larger (say > 8). Thoughts?
Comment by Kevin Volk on JIRA:
This needs more discussion if it is to be a general change that applies to all modes not just to the NIRSpec IFU case. So yes this issue with cosmic rays is a known problem, which for example causes many snowballs to not get flagged in the jump step for the frame-averaged cases depending upon where within the group the snowball appears. However, the proposal to flag group N-1 if group N is over the linearity limit will have an adverse effect on the stars in the field and will change the bright limit down to fainter stars by a factor of 2. This is not a good thing when we need the stars to align to Gaia, for example. My own feeling is that for imaging mode generally it is much better to NOT make such a change; the cosmic ray events that pass through the jump step are removed in level 3 when then different dither positions are combined, so this issue does not generally have an adverse effect on the output science products. We see cosmic ray and snowball residuals in the rate images in NIS read-out all the time, but these are nearly always removed very well in the level 3 products. I assume that the same is true for NIRCam.
Another case where this definitely is not a good idea is the NIRISS SOSS mode where we wish to retain as many groups as possible below the linearity correction threshold (although that normally does not use frame averaging). Flagging another group as "saturated" is very bad for SOSS mode especially because they are often pushing the bright limit and short ramps are common.
For NIRISS there is also the question of how this change might interact with the charge migration correction, although the charge migration threshold is somewhat lower than the linearity limit for most of the pixels, and one could hope that the gap is large enough that such a change would not be an issue.
If this change is specific to the NIRSpec IFU case then all this is moot, but the ticket seems to suggest a general change.
Kevin Volk Following up on this to understand a little better. The issue really seems to be for very-bright sources in readout modes optimized for faint sources. Further, saturation in group 2 is irrelevant (since no slopes), and group 4+ doesn't matter as much (since the abnormally-bright slope can be detected as a jump), so we can focus most on cases where saturation is currently being detected in group 3. This can happen two ways; either from a cosmic ray, or from a very bright source being observed with a readout pattern optimized for faint sources.
For the 4-read groups in NIS, I think the readout would be 1234-5678-9abc for such three-group ramps. If the third group is flagged as saturated but the first is not, I'd assume that would mean that saturation actually occurred in either frame 6,7,8, or 9. If it occurred in frame a, I would think that the group average of 9abc would be 97.5% of saturation level. I.e., most of the time the group 2 average of frames 5678 would also be biased, regardless of whether saturation occurred due to a CR or the source itself.
For purposes of alignment stars, I assume a small bias in the slope is still a good enough result, and preferable to losing the pixel entirely? Is that also the case for SOSS though, where a likely-incorrect slope is preferable to no slope?
It may be possible to split by imaging/spectroscopy; are there any observations that have been obtained by SOSS using the NIS read pattern? I thought this mode required the NISRAPID pattern?
I think the difference between hard and soft saturation may be important here. For NIRCam, our saturation levels are defined as the levels where the signal becomes non-linear by some amount. So after a pixel crosses the saturation threshold, it's signal can continue to increase. So in the case above where the signal reaches the saturation threshold in frame a, depending on what happens to the signal in frames b and c, the group average could conceivably be above or below the saturation threshold.
Comment by Kevin Volk on JIRA:
FIrst off, the selection of NIS mode may be due to data volume constraints and not primarily because the science targets are faint. A lot of NIRISS observations are in parallel with NIRCam and use NIS even for relatively short ramps because of the data volume issue. Secondly, one still wants to see the brighter objects especially stars along with the faint objects. This is not only due to aligning images to Gaia, although that is important, but also because existing catalogues like 2MASS or WISE only give data for the brighter objects and one will typically want to make some comparison with existing catalogues for the photometry in the images. NIRISS almost always has stars that exceed the linearity limit in group 3 when NIS observations are made.
Your statement that if the "saturation" happened in frame 10 for a NIS read-out ramp that the previous group would have the data compromised is not correct. If the linearity limit is exceeded in frame 10 then there is no reason to say that there has to be an issue with the linearity correction in frames 5 through 8. Yes the linearity correction is less accurate with the frame averaging, but quantifying that is a bit difficult. Bryan Hilbert did some work on the effects of the frame averaging on the linearity correction some time ago. I have not tried to quantify this.
This is why I really do not like using the term "saturation" for this linearity limit. These are two different things. It is a really bad idea to call it saturation when it is something very different. For NIRISS, roughly half the pixels on the detector reach the A/D limit before they reach actual saturation, and in some cases the linearity correction remains valid almost right to the A/D limit. So we have not just the linearity limit value but also two additional possible limits to worry about.
For your example, if you have a 3-group ramp in NIS readout with frames denoted 1234-5678-9abc as you put it, for uniform illumination if group 3 is above the linearity limit this means that at least frames bc are above the limit. There is however no way to tell solely from the "saturation" flag whether the situation is that frames 89abc or frames 9abc or frames abc or just frames bc are above the limit. It depends on the shape of the non-linearity curve around the limit, which varies from one pixel to another, and the actual slope needed to cross the limit in less than 12 data frames but more than 7 data frames (I am assuming that if the limit is exceeded in both frames 7 and 8 the result for this case will be a value over the limit in group 2...one might have an edge case where this is not what happens if the frame 7 value is just slightly over the limit).
Finally for SOSS mode the exoplanet people have told me that they would rather have groups that consistently have a small systematic in it from the linearity correction, say, rather than having no data at that frame or group. One can use NIS readout for any of the NIRISS modes, although for AMI and SOSS observations this is rather unusual.
Kevin Volk Thanks for the explanation, that's helpful.
After discussing a little with Michael Regan, one possible approach would be to revise the algorithm to more carefully target the specific CR failure case in question.
1) We focus specifically on looking for saturation in group 3 for data with nframes/group > 1.
2) For any pixel that meets that criterion, look at the counts recorded in group 1. If the number of counts is more than 1/8 (value TBC) of the way to the saturation limit, then do nothing. If the number of counts is less than that, set the SATURATED bit in GROUPDQ for that pixel in group 2 as well.
That might allow us to tease apart the case where saturation is happening due to a CR vastly above the normal count rate (which we want to flag), vs due to a bright source (which we don't want to flag). For saturation due to CRs in anything larger than group 3 we don't make any changes, as the existing code has enough logic to catch these via jump detection. This approach could potentially fix the NIS residuals at the detector1 stage as well without compromising the cases that you described above.
Thoughts?
Comment by Kevin Volk on JIRA:
It may be that this proposed revision will work and not flag pixel core pixels stars in imaging observations. I am still very leery of making a change to the linearity limit flagging for NIRISS imaging to fix a problem that does not actually affect imaging observations, but at least we can test this change and see if it does what you want for the NIRSpec case without causing issues for NIRISS imaging or WFSS observations.
I would also like to verify that this is specific to the case of saturation in group 3 of a ramp with frame averaging; the description states that the idea is to flag group N-1 as "saturated" when group N is above the "saturation" limit value, not that this is specific to the 3rd group in such ramps.
Kevin Volk The suggestion above was originally to fix the general case of all such groups N, but given this discussion it seems better to refine to the specific case of saturation in just group N=3.
Comment by Howard Bushouse on JIRA:
David Law Can you please clarify your statement in the original description of the issue "Likewise, it's not an issue when saturation is first flagged in group 3, as this simply results in no slope at all." If saturation is first flagged in group 3, then groups 1 and 2 are classified as OK and can be used to compute a slope. Did you by any chance mean to say "first flagged in group {}2{}"?
Howard Bushouse Thanks; that was meant to say it's not an issue when saturation is first flagged in group 2. Fixed in the original text.
Comment by Eddie Schlafly on JIRA:
For Roman we have this same issue in the sense that a read can saturate within a group, such that the group does not appear to be saturated but it contains reads which are saturated. There we address this by adding an extra "read_pattern" argument to the saturation flagging
https://github.com/spacetelescope/stcal/blob/main/src/stcal/saturation/saturation.py#L87-L93
which adjusts the saturation threshold for each group according to how diluted a read at the end of the group would be by earlier unsaturated pixels. That dilution factor is naturally 1 for nframes = 1 (i.e., no difference there), so it behaves correctly for that case. It depends on the number of frames in the group, and so it becomes more aggressive at flagging saturation when there are a lot of reads—I think that's correct. I kind of think this approach is the right one to use for Webb as well. Sorry for being late to the party.
Eddie Schlafly I'm not sure I fully understand that method. Let's say nframes=5 per group, and ngroups=3 in the ramp. Say saturation (in the nframes=1 case) is 50,000 DN. The dilution factor saturation would then be 50,000/5=10,000 DN, right?
But what if you're looking at a bright source that accumulates at the rate of 2500 DN/frame? Then you get group1=7500 DN, group2=20,000 DN, group3=32,500 DN. Nominally both groups 2 and 3 are saturated according to the dilution factor, although if you were to actually look at the final frame in group 3 it would have just 37,500 DN and not actually be saturated?
Comment by Eddie Schlafly on JIRA:
In this case the dilution factors are: group 1: mean(1,2,3,4,5)/5 = 0.6 group 2: mean(6,7,8,9,10)/10 = 0.8 group 3: mean(11,12,13,14,15)/15 = 0.87 So the saturation cuts would be at 30k, 40k, and 43k and no group would be marked saturated. I'm on my phone so I should check my math tomorrow, but that looks right to me?
Ah, I see, I misunderstood how the dilution factor worked. Thanks for clarifying. In that case I agree for the bright source case but would have the opposite question, in the faint-source with cosmic limit. Suppose an empty part of the field with 1 DN/frame, and an instantly-saturating CR (value 60,000 DN say, with flagging at 50,000) in frame 10. Then group 1 mean is 3 DN, diluted CR threshold is 0.650,000 = 30,000 = All good group 2 mean is 12,006, diluted CR threshold is 0.850,000 = 40,000 = Not flagging a case that it should group 3 mean is 60,000, so obviously flags ok
In order to successfully flag the CR in group 2 you need to saturate in frame 6 or 7, but it doesn't catch frames 8/9/10?
Comment by Maria Pena-Guerrero on JIRA:
I still don't quite understand the dilution method. I was trying to run a test with jwst data but I got an error because the code is expecting this:
read_pattern : List[List[float or int]] or None
The times or indices of the frames composing each group
Is this how Roman has read_pattern in the datamodel or is there something special that you do to create that list?
Comment by Eddie Schlafly on JIRA:
Ugh, thanks, sorry for being dense. Yes, I agree that that doesn't get handled well by the dilution factor (though I think the dilution factor does make sense for Webb, independent of this issue).
How do you like the pseudocode proposal below? It tries to use read_pattern to generate appropriate cuts along the lines of your proposal, and adds an additional cut that the value in the second group must be high before being considered as potentially affected by saturation.
for group in range(groups):
if group == 2:
mask = data[0] / mean(read_pattern[0]) * read_pattern[2][-1] < sat_thresh
# identify groups which we wouldn't expect to saturate by the third group, on the basis of the first group
mask &= data[1] > sat_thresh / len(read_pattern[1])
# identify groups with suspiciously large values in the second group.
mask &= flag[2] & SATURATION != 0
# identify groups that are saturated in the third group
flag[1, mask] |= SATURATION
# flag the 2nd group for the pixels passing that gauntlet.
IMO you could generate the read_pattern from nframes by
[[x + 1 + groupstart * nframe for x in range(nframe)] for groupstart in range(ngroup)]
or derive the relevant terms directly; I'm obviously partial to read_pattern for Roman only because we want to be general there.
Eddie Schlafly I think that should work. Testing out a quick implementation of your pseudocode it does essentially what the algorithm proposed above was doing, but without the rather hacky and arbitrary factor of 1/10 that I'd suggested which should make it more general to different numbers of nframes per group.
Comment by Eddie Schlafly on JIRA:
Great, yes, I was clearly taking your proposal and trying to make it slightly more generic to be able to accommodate Roman's uneven MA tables! Slightly nice feature that nframe = 1 is no longer a special case since then data[1] > sat_thresh / len(read_pattern[1]) is just the normal saturation cut.
Comment by Maria Pena-Guerrero on JIRA:
Eddie Schlafly are you putting in a PR for the changes in this ticket with your changes to the code? should I close the current WIP PR?
Comment by Eddie Schlafly on JIRA:
Thanks Maria. I wasn't planning to put in a PR; I was mostly just looking for a way to implement the changes such that they would work for both Roman and Webb. I think my suggestion satisfies that need, so if you were able to update the PR along those lines then I would stop objecting on grounds that I don't know what things mean for Roman. Thank you!
Comment by Maria Pena-Guerrero on JIRA:
I implemented the changes suggested. Can you Eddie Schlafly and David Law test these? The branch in stcal is: spacetelescope/stcal#259
to install do:
pip install git+https://github.com/penaguerrero/stcal@jp3593
And the branch on jwst is : https://github.com/spacetelescope/jwst/pull/8499/
pip install git+[https://github.com/penaguerrero/jwst@jp3593]
Comment by Kevin Volk on JIRA:
I am still uneasy about this change and we will have to see how it affects NIRISS imaging and WFSS data. If we find that it is adversely affecting our observations this will need to be made mode-specific somehow.
Comment by Maria Pena-Guerrero on JIRA:
David Law is the new bit of code still only happening for nframes>1 (I did not include that condition, so I want to verify if I should)?
Comment by Howard Bushouse on JIRA:
Maria Pena-Guerrero As Eddie Schlafly explained to me when I asked the same question in spacetelescope/stcal#259 the code as implemented does not explicitly test for nframes when applying the algorithm, but the way it's implemented will only have any effect when nframes>1.
Maria Pena-Guerrero Hm, I haven't yet been able to run down why, but it seems the new code isn't currently flagging the things that I'd expect it to flag. I'm testing on jw01908004001_02101_00001_nrs1 and looking at pixel X,Y=1557,1681.
Weirdly, groupdq written out by the jump step shows this pixel as SATURATED in group 3, but if I pause your code in the middle, gp3mask is FALSE at the location of this pixel?
Will try to see if I can help troubleshoot later today.
Maria Pena-Guerrero Howard Bushouse Eddie Schlafly
Ok, I've spent a while digging into this over the last few days. At the moment there are two PRs:
spacetelescope/stcal#259 and #8499
The stcal PR implements what's discussed here for NIRCam and NIRISS (i.e., non-IRS2 data) while the jwst PR implements it for NIRSpec (i.e., IRS2) along with some hooks necessary to activate the stcal code for NIRCam/NIRISS.
For NIRSpec things work fine (modulo a small bug), and it fixes the original problem that led to filing this ticket.
We've got a couple problems for NIRCam and NIRISS though, chief among them the dilution_factor code that gets activated if jwst is changed to pass stcal a read_pattern argument. dilution_factor essentially decreases the saturation flagging threshold for grouped data, which is what you want when you saturate in some frame within a group and the group averaging smears that out. However, it also means that we'd now be flagging things that never really saturate at all. E.g., a cosmic ray (or just plain source flux) takes us above the diluted saturation threshold in group N, but we never cross the real saturation threshold in group N+1, or even N+2. As a result, I'm seeing way more pixels flagged than really need to be, even absent doing any of the third-group juggling.
At the same time:
These interact in some pretty complicated ways, and in the stcal code they'd affect both JWST and Roman. As a result, for 11.0 which closes in just a day or two, my feeling is that we should instead consider merging
This splits out just the IRS2 related changes, which should give us more time to consider the changes that affect other modes.
Comment by Eddie Schlafly on JIRA:
Can you help me with a case where the dilution factor code flags things that never saturate at all? It seems to me that the dilution factor is adjusting the saturation threshold by just the right amount so that when the last read of the group would saturate, you mark the group as saturated.
I have not thought hard about interaction with large bias levels. In Roman, as I imagine in Webb, the saturation step occurs before the bias subtraction. I agree that for some of this logic around "is the rate high in the first read" one wants to be able to estimate the rate using a bias-subtracted image.
Comment by Howard Bushouse on JIRA:
David Law I'm OK with just merging #8593 for inclusion in B11.0, since it was the NIRSpec case that triggered the issue to begin with. Y'all can then continue to debate over the right way to handle "generic" cases without me.
Eddie Schlafly As an example, see jw01617001001_02101_00001_nrcblong_uncal, integration 0, pixel (x,y)=(1227,953)
The group values going into the saturation routine are
[43798., 46987., 47006., 47016., 47022.]
Normally that should be fine, because
sat_thresh[953,1227] = 53852.4
However, dilution_factor = 0.5625 (DEEP8 data, so for group 0 there's a factor of two difference between the mean read_patt and the end-of-group read_patt)
so the effective saturation threshold is 30291.97, and everything from group 0 onwards gets flagged as saturated.
Perhaps this needs some kind of multiple-pass approach, where the dilution factor in group N is only applied if saturation in group N+1 is reached without dilution?
Just to make life more exciting, this also interacts with the snowball routine, such that excessive saturation flagging in a single pixel from the above sometimes causes much larger flagged regions due to the snowball expansion.
Comment by Eddie Schlafly on JIRA:
Ah, thanks! Is the large constant offset from the bias or from a half-saturating cosmic ray? I agree with you that an additional cut requiring that the next pixel be fully saturated would be consistent with the dilution factor concept.
Eddie Schlafly In this case I think it's from a half-saturating CR.
Comment by Howard Bushouse on JIRA:
NIRSpec IRS2 saturation check updated by #8593
Leaving this ticket open for now, until same checks are implemented for NIRCam and NIRISS, in the portion of the saturation step that lives in stcal.
Here's a quick illustration of how this now-merged fix helps NIRSpec IFU spectra (tagging Christian Hayes and James Muzerolle ). This is a comparison of the default pipeline dither-combined x1d spectrum from PID 1222 Obs 7 with and without the new code.
Improvements will also affect MOS mode, but be a little less dramatic since spectra don't sum over quite so many detector pixels.
!nrsifu_irs2_crfix.png|thumbnail!
Ok, I've tinkered with a few different ways that we might adapt the stcal implementation, and ended up adding in a second loop going backwards through groups that applies the dilution factor. See https://github.com/drlaw1558/stcal/tree/jp3593_take2
However, while this now seems to do minimal harm to NIRCam/NIRISS and not be flagging lots of unnecessary things, it's not clear that it's actually making any improvement either. As such it may be a solution in search of a problem now that NIRSpec has been fixed, although I don't understand why this issue was so much more pronounced for NIRSpec in the first place.
Eddie Schlafly would fixing the dilution factor implementation be useful for Roman, even if JWST isn't using it just yet?
Comment by Eddie Schlafly on JIRA:
Thanks David. I do think that it is important to fix this for Roman and I like your implementation. I agree with you that it isn't urgent if it's only affecting Roman right now, though.
Comment by Melanie Clarke on JIRA:
David Law - have you tested with non-IRS2 NIRSpec data (traditional readout)? It would be helpful to know if that kind of data is significantly impacted, like the IRS2 data is, and how your new changes work with it.
Melanie Clarke Good question, I'd forgotten that regular NRS-mode was grouped too. Just tested things out on jw02747006001_03101_00001_nrs1, which is a 'NRS' mode IFU observation with 4 frames/group.
Activating the existing readpatt/dilution implementation is noticeably worse than existing B11.0; more things flagged that don't need to be, and also more outliers in the data. Looks like the reason why is CRs that never really saturated but passed the diluted cut in group 3, leaving just two groups for a ramp, one of which was affected by a CR.
With the fix proposed above the NRS data improves somewhat, but not as much as I'd like. Still some 1000-valued CR blobs in an image that's otherwise in the 1e-1 range.
I've addressed this by also adding in the code that we put into the IRS2 routine, focusing specially on group 2. This successfully deals with the 1000-valued CRs while adding no visible further changes that I can see to the NIRCam/NIRISS data.
Comment by Melanie Clarke on JIRA:
Excellent - thanks for checking that out.
Comment by Maria Pena-Guerrero on JIRA:
David Law, just to confirm, we can close the original PR, correct?
I'm referring to this one: #8499
and the one in spacetelescope/stcal#259
Maria Pena-Guerrero Yes, I think we can close these.
Comment by Maria Pena-Guerrero on JIRA:
thank you, those branches have been closed
Comment by Kevin Volk on JIRA:
A general comment: for NIRISS NIS read-out data (4 frames/group) the proposed change does find and mask the "split" cosmic rays that put the frame above saturation if these happen in frames 6, 7, or 8 within group 2 of the ramp. It will erroneously throw out data for cosmic rays that put the frame above saturation when this happens in frame 9, because group 3 will show up as saturated and if the signals in groups 1 and 2 are low then group 2 will be flagged as saturated even though it is not affected by the cosmic ray. Hence while there is a net gain it is at the cost of throwing out good data. For NIRISS imaging and WFSS modes where there is dithering, the dithering normally takes care of the pixels with erroneously large ramp slopes from these split cosmic rays when the level3 products are produced. Because of this situation it would still be better to not use this option for NIRISS. It has been indicated that this cannot be done automatically, but there should be an option to turn off this flagging in the step call so people can re-reduce their data without this extra flagging of group 2 when group 3 is saturated and the group 1 signal is low.
Kevin Volk Ideally that wouldn't be the case, as there are multiple criteria that need to be satisfied.
That said, we'd want to have proof of this once there's a branch to test. At the moment I'm not happy with it though, and I think it needs some more work before we can assess it.
Issue JP-3593 was created on JIRA by David Law:
Edited based on discussion thread below.
Examination of a few extremely-large outliers in PID 1908 (NIRSpec IFU) indicates that these are being caused by cosmic rays that are not being properly flagged in the jump step.
In this case, ramps are 22 groups long. Since the data is grouped though (nframes/group=5) cosmic rays that cause saturation in group 3 are also significantly biasing group 2 half-way to saturation. Since all other groups are saturated the pipeline assumes that groups 1 and 2 are good, resulting in extremely bright ramps.
This isn't as much of a problem when saturation is first flagged in group 4 or greater, as there are multiple differences with which to find and reject the abnormally large slope. Likewise, it's not an issue when saturation is first flagged in group 2, as this simply results in no slope at all. A solution can thus concentrate specifically on the case of saturation in group 3.
At the same time, we want to be careful not to introduce excessive flagging when saturation occurs due to normal counts received from a source, as this would decrease the effective dynamic range of some observations.
The proposed solution is therefore to modify the saturation step. For any data obtained with nframes/group > 1, look for pixels for which SATURATION is set in group 3. For each of these pixels, look at the counts in group 1. If the count value in group 1 is less than 1/10 of the value that would be required to trigger SATURATION (i.e., it isn't obviously a very bright source), then also set the SATURATION flag for this pixel in group 2.
Once this has been coded it will be necessary for testing to occur on both NIRSpec IFU data to ensure that it fixes the original problem, and on NIRISS/NIRCAM data to ensure that it doesn't produce any unintended consequences.