Closed stscijgbot-jp closed 2 years ago
Comment by Anton Koekemoer on JIRA:
scheduled for discussion at JWST CalWG meeting 2021-04-13
Comment by Karl Gordon on JIRA:
Not sure this was discussed in the JCalWG context. A few comments from my point-of-view.
The background should not be subtracted by default.
While there are cases where this is fine (e.g., deep fields with lots of "sky" between fairly isolated compact sources), we will also have lots of cases where this is not fine. For example, where there is significant diffuse emission due to gas and/or dust. This will happen quite a lot at MIRI wavelengths and in narrow band NIRCam filters. The backgrounds values should be matched between images, that is all. Subtracting the background to make all the images have zero background on average will remove real and important astrophysical signal an producing images with significant, systematic negative values for a large portion of the mosaic.
Comment by Karl Gordon on JIRA:
To answer the question in the body of this issue. The mean of the input sky backgrounds should be added back in, if I understand the skymatch algorithm correctly.
Comment by Misty Cracraft on JIRA:
Also, if the skymatch step has the parameter 'subtract=False', which is the default, the assumption is that the background should not be subtracted. According to our tests, this subtraction happens even when the parameter is set to false. Is this a bug that should be filed in another ticket, or another small piece of this discussion?
Comment by Howard Bushouse on JIRA:
The skymatch subtract
argument only affects the processing within the skymatch step itself, such that it only computes overall sky values without subtracting them. The values get stored in keywords for use later during resampling. So we would need another argument added to resample to not subtract background (if there isn't one already).
Comment by Misty Cracraft on JIRA:
Ok, thanks for the clarification. I think Mattia is doing some testing to see what that looks like (resample not subtracting). I think the current discussion needs to continue and the best way forward needs to be decided before we figure out if another keyword is added to resample to control subtraction at that stage.
Comment by Mattia Libralato on JIRA:
In "resample_many_to_many" and "resample_many_to_one" in resample.py, there are these lines:
blevel = img.meta.background.level
if not img.meta.background.subtracted and blevel is not None:
data = img.data - blevel
else:
data = img.data
I think that the issue is the "not". Indeed, I printed img.meta.background.subtracted and blevel before the "if" statement, and the former is False, while the latter is a number. So the "not" makes the first argument true, and it subtracts the bkg.
That said, there is also another issue here. If the blevel is not applied (so "else" case), the bkg is not properly registered and the resulting image (with bkg) is not OK.
Comment by Howard Bushouse on JIRA:
Yes, img.meta.background.subtracted is a record of whether background already got subtracted in the skymatch step. If not, then resample subtracts it, by default. This will need to be updated to allow the subtraction to be switched on or off.
Comment by Karl Gordon on JIRA:
The background matching step is only supposed to match the backgrounds, not subtract the average to get zero. This is what was requested originally. So this is not new work and is a bug. No option is actually needed. Just change the code to not subtract the average background preserving the actual measured background. Note, the background is actually a combination foreground emission (e.g., telescope, zodi) and astronomical backgrounds (e.g, diffuse emission from dust).
https://outerspace.stsci.edu/display/JWSTCC/Vanilla+Image+Background+Matching
Comment by Howard Bushouse on JIRA:
The background matching step is set, by default, to NOT subtract the background. It only computes and stores the information. So that part is OK, I think. It's the resample step that is then using that background info to automatically subtract background when it's creating an output image. So that's the step that needs fixing (or at least the addition of a step param to toggle the background subtraction on/off).
Comment by Anton Koekemoer on JIRA:
Added as watchers the various others from the Cal WG 2021-04-13 meeting who expressed interest in this (Kevin Volk Matteo Correnti Michael Regan Greg Sloan Cheryl Pavlovsky Swara Ravindranath).
First, I agree with Karl Gordon and Howard Bushouse that the actual background matching step (SkyMatch) would be expected to just measure and record the backgrounds and their differences, and not actually subtract it.
But when we get to Resample (as per the title of this ticket "Should resample automatically subtract background?" – it does need to adjust the backgrounds in some way, because drizzling images together without removing their potentially different time-variable backgrounds can lead to photometric issues for sources. There are at least a couple of different philosophies for this:
1) subtract the full background from each exposure, producing the usual astrodrizzle-style behaviour of a drizzled product with a background level centered around 0. This is fine for datasets with mostly empty sky, but can cause oversubtraction of real astrophysical signal with large extended sources.
2) subtract only the differences, or equivalently insert a DC offset into all the exposures, corresponding to some common value – this could be the average background as mentioned, or median, or perhaps the minimum background (measured from among all the exposures), or some other estimator. The average or median background may still be subject to oversubtraction; the minimum background could be more conservative in terms of avoiding oversubtraction of real astrophysical signal though it can be biased, and another estimator may be better.
But the key point is that resample should at least subtract something, to ensure that time-variable backgrounds are removed from the exposures before drizzling to combine them.
Comment by Mattia Libralato on JIRA:
I looked into skymatch and it seems that if the step is run with "skymethod=MATCH", the code does what we want, i.e., it adjusts the background to that of one image (the image with the lowest background if "match_down=TRUE", otherwise it uses the image with the highest background). The final image obtained by running the code with "skymethod=MATCH" and "subtract=TRUE" is identical to that obtained by using "subtract=FALSE". This happens because:
1) with "skymethod=MATCH" and "subtract=TRUE", skymatch calls "apply_sky", which not only stores the value, but also subtracts it:
old_sky = img.sky
if do_skysub:
img.image -= sky
img.sky += sky
new_sky = img.sky
Then, in resample, on these lines:
blevel = img.meta.background.level
if not img.meta.background.subtracted and blevel is not None:
data = img.data - blevel
else:
data = img.data
The code follows the "else" path. So, the drizzle code works with "img.data", which is already adjusted for bkg differences.
2) with "skymethod=MATCH" and "subtract=FALSE", skymatch calls "apply_sky" and only stores the value. Then, in resample the code follows the first path in the "if" above and subtracts "blevel", which is NOT the total background, but just the delta_bkg.
If you use "skymethod=GLOBAL+MATCH" (the default method), things get more complicated. First, skymatch works as "skymethod=MATCH" and "subtract=FALSE" (so delta_bkg are stored but not subtracted to the images). Then, it computes the minimum background among all images, and adds it to the delta_bkg measured in the "skymethod=MATCH" part. Then, in the resample step, the code reads the blevel (delta_bkg+min_bkg) and it subtract it from the images. For this reason, even if "subtract=False" in skymatch, the background is subtracted in the resample step. As in the previous example, it doesn't matter if you select "subtract=TRUE" or "subtract=FALSE".
Bottom line: I believe that we just need to use the correct method. If we use "skymethod=MATCH", we register the background and obtain a final drizzled image with a non-zero background. If we use "skymethod=GLOBAL+MATCH", then everything is subtracted because of how the skymatch routine is designed, i.e., it sums the delta_bkgs and the min_bkg.
Comment by Howard Bushouse on JIRA:
So if we simply change the default value of the "skymethod" param to "MATCH" in the skymatch step, we should get the desired result (the delta bkgs are removed, but the global background remains)? That's easy enough.
But it also seems like there should be a user-settable switch in the resample step to turn off bkg subtraction, when desired, even if we have bkg subtraction turned on by default.
Comment by Michael Regan on JIRA:
Can't this just be done by changing the parameter reference file? Is that an easier change?
Comment by Mattia Libralato on JIRA:
A simple solution would be: if "skymethod=MATCH" is selected, we could automatically switch (i.e., override the input values) to "skymethod=GLOBAL+MATCH" when "subtract=TRUE". This would subtract the background with the current implementation.
For "skymethod=GLOBAL+MATCH" and "subtract=FALSE", I think the easiest fix would be to just switch to "skymethod=MATCH" and "match_down=TRUE". This way (I think) the delta_bkg would be computed wrt the lowest background, and the output image should have this background.
I will test these changes and see if they work.
When there is no overlap between images, should we allow options that rescale the background to the lowest/highest one and subtract it? The net result is, e.g., one are has background~0 and one has background >0 or <0 depending on the background difference. If there is no overlap, would it be better to leave the background as it is if "subtract=FALSE", or remove it using "skymethod=LOCAL" if "subtract=TRUE"?
Note that all these suggestions would work with "GLOBAL+MATCH" or "MATCH". I haven't tested (1) the "LOCAL" option; and (2) the case with many sets of images that overlap in groups (like a custom mosaic with a specific dither pattern that makes images to overlap, but each mosaic position is far enough from the other that the various groups do not overlap). Also, I see that there are different paths when a skygroup (I still have to understand what it is) is provided, which should be tested as well.
Comment by Karl Gordon on JIRA:
In the case of no overlap between images, I would advocate for doing nothing. We are in a pipeline environment and so we shouldn't do something that will work ok for some cases (isolated sources with mostly background), but not others (images with lots of extended emission).
Comment by Howard Bushouse on JIRA:
Michael Regan Yes, this could be accomplished via either pars-image3pipeline or pars-skymatchstep ref file additions to CRDS, which would set "skymatch.skymethod='MATCH'", but changing the default in the skymatch step code is also a trivial operation. In fact, if folks can come to agreement to make that change to the default "skymethod" value in the next ~24 hours, we can include it yet in B7.9.
Comment by Howard Bushouse on JIRA:
Not sure how much testing Mattia Libralato has done, but it sounds to me like if we simply change the default value of skymatch.skymethod from "GLOBAL+MATCH" to just "MATCH", it would compute and store values that represent just deltas between each (overlapping) image, at which point we may not need any changes at all to the resample step, because if we let it continue to automatically subtract the "sky" values computed by skymatch, it will be subtracting only the deltas - which you want, in order to get a flat mosaic - but not any absolute/global sky signal. Is that correct?
We could potentially also add a new parameter to the resample step to allow the sky subtraction to be toggled on/off, but if the above scenario works as desired, it sounds like the default should be subtract sky (as it hardwired to do now).
Comment by Karl Gordon on JIRA:
We are planning on discussing this topic based on Mattia Libralato's and Misty Cracraft's testing at the STScI MiRI imaging subgroup meeting this afternoon. I'll post an update afterwards.
Comment by Karl Gordon on JIRA:
We discussed this topic at length at our meeting today and request the default parameters to be:
skymethod="match", match_down="True", subtract="False"
Just to be explicit as possible. I think it is the same as you state above Howard Bushouse, but am not 100% sure.
Comment by Mihai Cara on JIRA:
I did my best to understand the issue here (like what is broken?) but I am confused. I did not participate (at least not in any major way) in the development of current implementation of CR-rejection and resample steps and so I will speak from my drizzlepac experience and what I believe should be happening (or how things should be working).
NOTE: skymatch has an option to "subtract" background from input images instead of just recording computed value. This is just an option but I do not recommend using it. This option was introduced to match capabilities of the drizzlepac which can accept user-subtracted images. This does not contradict first point above that says images must be matched "for real".
CR-rejection algorithm should have access to the real pixel values in order to estimate variance so that it can decide whether pixel value departure from the median image is within an acceptable range or not (CR). We could subtract background from the images but then CR-rejection step should add it back to estimate these errors (but it should be using background-subtracted images when building the median image). I do not know specifics of how this was implemented in the CR step and so "subtract=True" may not even be a viable option if CR-step cannot handle this situation correctly.
Presence of "global+" should not have any effect if CR-rejection is implemented correctly i.e., it is added back for variance estimation when subtract=False or it is subtracted from the images for median image creation or resample steps if subtract=False. I discussed this with James Davies [X] but I do not know what was decided in the end.
In the end, I agree with proposed by Karl Gordon default parameters:
??skymethod="match", match_down="True", subtract="False"??
except I would have chosen match_down=False to be more conservative when estimating errors from pixel values (i.e., err ~= sqrt(I)).
In the end, I do not understand what this ticket is about: is it about preferences about having sky subtracted from the final resampled image or is there a bug/something wrong in the current code? If it is the later, then switching to "match" (no "global") may mask a deeper problem in the steps but is is the wrong kind of fix.
Comment by Tyler Pauly on JIRA:
Not to distract from Mihai's comments, but the requested changes result in differences to final products in the regression tests, e.g. test_nircam_image_stage3:
AssertionError:
fitsdiff: 5.0
a: /data1/jenkins/workspace/RT/JWST-Developers-Pull-Requests/clone/test_outputs/popen-gw14/test_nircam_image_run_detector1pipeline0/jw42424-o002_t001_nircam_clear-f444w_i2d.fits
b: /data1/jenkins/workspace/RT/JWST-Developers-Pull-Requests/clone/test_outputs/popen-gw14/test_nircam_image_run_detector1pipeline0/truth/jw42424-o002_t001_nircam_clear-f444w_i2d.fits
Primary HDU:
Headers contain differences:
Keyword BKGLEVEL has different values:
a> 0.0
b> 0.1508732994059528
Extension HDU 1 (SCI, 1):
Data contains differences:
Data differs at [1565, 4]:
a> 0.15663332
b> 0.005760026
Data differs at [1566, 4]:
a> 0.12281047
b> -0.028062837
Data differs at [1567, 4]:
[...]
Extension HDU 8 (HDRTAB, 1):
Data contains differences:
Column BKGLEVEL data differs in row 0:
a> 0.0
b> 0.15087329940595284
Column BKGLEVEL data differs in row 1:
a> 0.0
b> 0.15087329940595284
Column BKGLEVEL data differs in row 2:
a> 0.0009531162879104502
b> 0.1518264156938633
Column BKGLEVEL data differs in row 3:
a> 0.0009531162879104502
b> 0.1518264156938633
Column BKGLEVEL data differs in row 4:
a> 0.0012056025542624728
b> 0.1520789019602153
Column BKGLEVEL data differs in row 5:
a> 0.0012056025542624728
b> 0.1520789019602153
Is this the intended result, that the saved background values are now different?
Comment by Howard Bushouse on JIRA:
Regarding the changes in the regtest results for calwebb_image3, yes we expect differences in the BKGLEVEL keyword values for all images used in the test, as well as differences in the overall signal level in the final combined i2d product. Using the new skymatch parameter value, it will only compute differences in the sky levels between each image. Not sure which image it uses as the reference in that case (first one in the ASN list?), but you would then expect one of the images to have BKGLEVEL=0.0, while the others have some small non-zero values, both positive and negative. Given the fact that the HDRTAB extension in the i2d product gives the same BKGLEVEL for both images (are there just 2 input images used in the test?), suggests that those images are identical copies of one another and hence the delta between them is exactly zero. If that's the case (identical input images), it's not a very robust test.
Comment by Tyler Pauly on JIRA:
I edited the snippet above - in that test, there are six input images. It appears to be three integrations each with an nrca5 and an nrcb5 entry.
Comment by Howard Bushouse on JIRA:
Yes, those entries for the new BKGLEVEL values across all 6 images look reasonable. They are 3 pairs of NRCALONG and NRCBLONG exposures, so skymatch groups together the 2 images from modules A and B and derives a common sky value for each pair. That's exactly what we're seeing in the HDRTAB entries: the first pair are now set to zero, and each of the next 2 pairs have common non-zero values.
Comment by Howard Bushouse on JIRA:
Fixed by #6580
Comment by Misty Cracraft on JIRA:
Just adding in a link to the notebook the team used to decide on the method and parameters that Karl stated.
Comment by Anton Koekemoer on JIRA:
thanks Misty Cracraft for confirming that these parameters (skymethod="match", match_down="True", subtract="False") produce the behaviour needed to resolve this ticket.
Issue JP-2024 was created on JIRA by James Davies [X]:
I believe the answer should be "no". But currently
resample
does subtract off the sky background incalwebb_image3
when it has background measurements available. A scalar sky value gets computed earlier on in the stage3 pipeline viaskymatch
.The single-resampled images produced at the end of the stage2 pipeline do not have the background subtracted, as they don't run through
skymatch
.The current implementation checks if
skymatch
has computed a background level for each exposure. If it has, it subtracts it before resampling. So the output resampled images currently have a mean sky background of zero.I suspect what it should do is subtract off those computed sky backgrounds for each exposure, drizzle into the common output image, and the add back the sky background after drizzling is completed, so that the output drizzled image has the same background as the input images.
But which sky background from which input image should be added back in? The mean of the input sky backgrounds? The minimum? Some weighted mean?
I'm primarily concerned about what should be the default for SPD processing of the associations that we will be producing. I suspect all exposures in these associations will be taken close to the same time and should thus have pretty similar zodiacal light characteristics so pretty similar backgrounds.