Add support to calwf3 for Full Well Saturation Maps (SATUFILE)

mackjenn commented 3 years ago

Similar to changes recently implemented for calacs (acsccd), the WFC3 branch would like to update calwf3 (wf3ccd) to flag Full-well saturation (currently part of DQICORR) using an image rather than a single scalar value.

For reference see the hstcal issue and associated pull request: https://github.com/spacetelescope/hstcal/issues/453 https://github.com/spacetelescope/hstcal/pull/539

The CCDTAB uses a single full-well saturation value for data quality flagging for ALL amplifiers, whereas studies have shown that the full-well depth varies spatially across each amplifier region. The new reference file SATUFILE is a 2D full-well saturation image and will be used to detect and flag saturated pixels, replacing the simple scalar value defined in the CCDTAB.
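To illustrate the difference between the two approaches, here is a minimal sketch, not the calwf3 implementation. The function names, the toy arrays, and the use of a strict `>` comparison are all assumptions made for illustration; only the flag value 256 comes from this thread.

```python
import numpy as np

FULL_WELL_FLAG = 256  # DQ value used in this thread for full-well saturation


def flag_scalar(sci, dq, threshold):
    """Flag pixels above one scalar full-well value (CCDTAB-style)."""
    out = dq.copy()
    out[sci > threshold] |= FULL_WELL_FLAG
    return out


def flag_with_map(sci, dq, sat_map):
    """Flag pixels above a per-pixel full-well image (SATUFILE-style)."""
    if sci.shape != sat_map.shape:
        raise ValueError("science and saturation images must have the same shape")
    out = dq.copy()
    out[sci > sat_map] |= FULL_WELL_FLAG
    return out


# Toy 2x3 image; every number below is made up for illustration.
sci = np.array([[60000.0, 70000.0, 80000.0],
                [60000.0, 70000.0, 80000.0]])
sat_map = np.array([[65000.0, 65000.0, 65000.0],
                    [75000.0, 75000.0, 75000.0]])
dq = np.zeros(sci.shape, dtype=np.int16)

dq_scalar = flag_scalar(sci, dq, 72500.0)  # one threshold for every pixel
dq_map = flag_with_map(sci, dq, sat_map)   # threshold varies pixel by pixel
```

In this toy case the scalar threshold flags only the brightest column, while the map also flags the 70000 pixel in the row whose local full well is lower.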

This check would happen at the end of BIASCORR, where the SATUFILE reference file will be in units of electrons and will have calibrated image dimensions after overscan trimming has occurred. The WFC3 Branch will work with CRDS/ReDCaT on implementing the new reference file.

The goal is to implement initial functionality by the end of FY2021 Q2 (March 2021) if possible, but the timeline is flexible depending on the availability of @mdlpstsci .

mdlpstsci commented 2 years ago

Sent note to Jennifer and Sylvia regarding a timeframe for their creation of the new saturation image.

mackjenn commented 2 years ago

A test version of the UVIS satufile for calwf3 development may be found here: /user/mack/uvis_satufile/wfc3_uvis_sat.fits

mdlpstsci commented 2 years ago

@mackjenn Thanks. I got the test version of the satufile. Are there binned WFC3 images?

mdlpstsci commented 2 years ago

@mackjenn Ooops. I finally read your email regarding eventually making files for binned images. I have my answer!

mackjenn commented 2 years ago

@mdlpstsci Yes, UVIS has binned modes, and we'll need to make extra saturation reffiles and add some keywords to the file we just gave you. For reference, here's an example of a 1x1, 2x2, and 3x3 binned FLSHFILE in /grp/hst/cdbs/iref/ and corresponding keywords for the two CCDs. Note the binning is done on-orbit before readout.

Hopefully you can use the same calwf3 logic for the binned SATUFILE as for the binned FLSHFILE

FILENAME       CCDCHIP  BINAXIS1  BINAXIS2  NAXIS1  NAXIS2  SIZAXIS1  SIZAXIS2  CENTERA1  CENTERA2  CRPIX1  CRPIX2  LTV1   LTV2   LTM1_1  LTM2_2
w7j17058i_fls  2        1         1         4206    2070    4206      2070      2104      1036      0.0     0.0     25.0   0.0    1.0     1.0
w7j17050i_fls  2        2         2         2102    1035    4206      2070      2104      1036      0.0     0.0     12.75  0.25   0.5     0.5
w7j17053i_fls  2        3         3         1402    690     4206      2070      2104      1036      0.0     0.0     8.666  0.333  0.333   0.333
w7j17058i_fls  1        1         1         4206    2070    4206      2070      2104      1036      0.0     0.0     25.0   19.0   1.0     1.0
w7j17050i_fls  1        2         2         2102    1035    4206      2070      2104      1036      0.0     0.0     12.75  9.75   0.5     0.5
w7j17053i_fls  1        3         3         1402    690     4206      2070      2104      1036      0.0     0.0     8.666  6.666  0.333   0.333

See IHB Section 5.4: The maximum full well depth of the pixels on the WFC3/UVIS chips is ~72500 e– (see IHB Figure 5.6). For the default gain ~1.5 e–/DN, this corresponds to ~48000 DN, well below the ADC limit of 65,535 DN. If the pixels are binned 2×2 or 3×3, the binned pixels could reach flux levels of 4 or 9 times 48,000 DN, respectively, which would be truncated to 65,535 DN (*1.5= ~98,300 e–) during readout.
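The arithmetic in the IHB excerpt can be checked with a few lines. This is only a worked example using the approximate numbers quoted above; the variable names are illustrative.

```python
FULL_WELL_E = 72500.0   # approximate maximum full well, electrons
GAIN = 1.5              # default gain, e-/DN
ADC_LIMIT_DN = 65535    # 16-bit ADC ceiling, DN

full_well_dn = FULL_WELL_E / GAIN   # roughly 48,000 DN

for nbin in (1, 2, 3):
    # A binned pixel sums nbin*nbin physical pixels before readout.
    max_dn = nbin * nbin * full_well_dn
    # Anything above the ADC ceiling is truncated during readout.
    recorded_dn = min(max_dn, ADC_LIMIT_DN)
    print(f"{nbin}x{nbin}: up to {max_dn:.0f} DN on the detector, "
          f"at most {recorded_dn:.0f} DN recorded "
          f"({recorded_dn * GAIN:.0f} e-)")
```

For unbinned data the full well sits below the ADC ceiling, so full-well saturation is the limiting case; for 2x2 and 3x3 binning the ADC ceiling is reached first.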

mdlpstsci commented 2 years ago

@mackjenn I have tested the modified algorithm on full-frame data as well as subarrays (Amps A, B, C, D, and A straddle; I could not find an Amp C straddle dataset). I modified some of the data to force high signal values for testing. All datasets used for comparison were manually reprocessed by me. I do see differences when comparing data processed with and without the new code. Some differences are expected: the full-well saturation flags are now applied at a different place in the calibration code, and pixels that formerly had DQ values of both 256 and 2048 will now have only the DQ value of 2048. However, INS will have to judge whether any differences seen in SCI and ERR are appropriate.
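Since DQ values are bit flags, the expected DQ change described above can be shown with simple bit arithmetic. This is an illustrative sketch only; the helper name `describe` is hypothetical, and the two flag values are the ones quoted in this thread.

```python
FULL_WELL = 256   # full-well saturation flag, as used in this thread
ATOD_SAT = 2048   # A-to-D saturation flag, as used in this thread


def describe(dq_value):
    """List which of the two saturation flags are set in a DQ value."""
    names = []
    if dq_value & FULL_WELL:
        names.append("full-well")
    if dq_value & ATOD_SAT:
        names.append("A-to-D")
    return names


old_dq = FULL_WELL | ATOD_SAT    # pixel carrying both flags: 2304
new_dq = ATOD_SAT                # same pixel with only the A-to-D flag
changed_bits = old_dq ^ new_dq   # bits that differ between old and new

print(old_dq, describe(old_dq))
print(new_dq, describe(new_dq))
print(changed_bits)
```

A DQ comparison between old and new products would therefore show a difference of exactly the 256 bit for such pixels.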

I am contacting you at this time to find out if you have a 2x2 and 3x3 calibration file for testing.

mdlpstsci commented 2 years ago

The following datasets (1x1 binning) have been used thus far to test the new algorithm:

Filename   FF/Sub  Amp   BiaslevelA  BiaslevelB  BiaslevelC  BiaslevelD  CCDGain
ib2j03q7q  F       ABCD  2556.063    2543.1011   2503.5403   2605.5229   1.5
ibc301s2q  F       ABCD  2554.7791   2541.5054   2500.9248   2602.5251   1.5
ic5p02ejq  S       B     0.0         2543.8      0.0         0.0         1.5
ic5p02e6q  S       A     2556.3999   0.0         0.0         0.0         1.5
ibwn02vcq  S       C     0.0         0.0         2503.3      0.0         1.5
ic5p02e0q  S       A     2556.3999   0.0         0.0         0.0         1.5
ics482anq  S       C     0.0         0.0         2503.3      0.0         1.5
iaaua1leq  S       C     0.0         0.0         2501.4404   0.0         1.5
ibbr01zmq  S       D     0.0         0.0         0.0         2602.5273   1.5
ibxb90dfq  S       C     0.0         0.0         2503.3      0.0         1.5

When comparing newly processed data using the current software and the new software, there were the expected differences in the DQ arrays. In addition, some files had differences in some keyword values:

Primary: cal_ver

SCI and ERR: goodmax, goodmean, goodmin, ngoodpix, snrmax, snrmean, snrmin

Only dataset iaaua1leq had differences in SCI and ERR in addition to the DQ differences.

mackjenn commented 2 years ago

@mdlpstsci We've created 3 new reffiles here, which correspond to the different UVIS binning options:

/grp/hst/wfc3k/mack/satufile_uvis/

wfc3_uvis_sat_1x1_verbose.fits
wfc3_uvis_sat_2x2_verbose.fits
wfc3_uvis_sat_3x3_verbose.fits

These have a suffix 'verbose' since their headers were copied from FLSHFILEs but not stripped of extraneous header keywords. I believe they will work okay with the code, but please let me know if the 1x1 binned file gives any differences with respect to your prior test using 'wfc3_uvis_sat.fits'. (I think it won't).

From your comments above, it looks like you were able to find amp C data. We are happy to provide you dataset names for various test cases, if needed. (There are fewer binned images, so we may not have all amps for those.)

mdlpstsci commented 2 years ago

@mackjenn I do not have permission to access below /grp/hst.

mackjenn commented 2 years ago

@mdlpstsci Oops! I copied them here: /user/mack/uvis_satufile/

mdlpstsci commented 2 years ago

@mackjenn While I coded the saturation image into the CALWF3 pipeline a while ago, I finally had some uninterrupted time to test and evaluate the status of the code, and thereby, the status of the saturation image files. I have some comments and questions for you.

1) The new wfc3_uvis_sat_1x1_verbose.fits file has an incorrect LTM1_1 value of "0" when it should be "1". This must have been a typo as the original wfc3_uvis_sat.fits file was correct.

2) Your current "verbose" saturation data [SCI,1] or [SCI,2] are in units of DN. This is fine, but I will note the ACS saturation images are in electrons. ~Recall the saturation image is being applied after BLEVCORR/BIASCORR where the science data has already be converted to electrons.~

~I can multiply by the gain, but you may want to consider having your reference file have the gain built in. If it helps, I will note for the DOBIAS step, CALWF3 assumes the bias image is already scaled by the gain.~

3) The structure of the various binned saturation FITS files is inconsistent, and they should be made consistent.

4) The actual ACS saturation data is embedded in an image of full-frame dimensions. The portions of the image which do not have saturation data contain zero. This is also what you have done for your data. This is fine for the overscan data along the outer boundary of the true illuminated data (top, bottom, left-side, and right-side) of each chip as these are ignored during processing and then trimmed on output.

However, WFC3 data differs from ACS data in that WFC3 includes the serial virtual overscan columns in the data between Amps A and B or C and D. Since there is no saturation information for this region, the values are currently all set to zero. I can: (a) skip over this region when flagging for full-well saturation, or (b) include this region, but the values could no longer be zero as this would flag all pixels in this region with 256, or (c) include this region, but the values would be set to a very high floating point value or the same value as the remainder of the saturation image.

The issue is the current version of CALWF3 does NOT skip this region when flagging for A-to-D (2048) or Full-well (256) saturation during the DODQI step. As such, WFC3 FLT data has flags of 256 in this region. None of the proposed schemes (a - c) will flag the corresponding pixels and only the corresponding pixels in a comparable way when applying the saturation map. What would you like to do? Perhaps this is not a problem from your perspective, and so I should just skip processing these columns???
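For reference, option (a) could look roughly like the sketch below. It is not the calwf3 code: the function name, the half-open column range, and the toy image are assumptions, and the column indices used here do not represent the real WFC3 overscan geometry.

```python
import numpy as np

FULL_WELL_FLAG = 256  # full-well saturation DQ value from this thread


def flag_skipping_columns(sci, dq, sat_map, skip_start, skip_stop):
    """Flag sci > sat_map, ignoring 0-based columns [skip_start, skip_stop)."""
    exceeds = sci > sat_map
    # Never flag inside the skipped column range, whatever the map holds there.
    exceeds[:, skip_start:skip_stop] = False
    out = dq.copy()
    out[exceeds] |= FULL_WELL_FLAG
    return out


# Toy 2x6 image with a made-up "overscan" block in columns 2-3,
# where the saturation map carries no information (zeros).
sci = np.array([[100.0] * 6,
                [10.0] * 6])
sat_map = np.array([[50.0, 50.0, 0.0, 0.0, 50.0, 50.0],
                    [50.0, 50.0, 0.0, 0.0, 50.0, 50.0]])
dq = np.zeros(sci.shape, dtype=np.int16)

flagged = flag_skipping_columns(sci, dq, sat_map, 2, 4)
```

Without the skip, every pixel in columns 2-3 would be flagged, because any positive signal exceeds a map value of zero.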

These are all the comments/questions I have at this time. Thanks, Michele

mackjenn commented 2 years ago

Michele, Thanks for these detailed notes. This is super helpful and we will work to address the issues with the FITS files. Regarding the gain, I'll need to think about this and get back to you. -Jennifer

mdlpstsci commented 2 years ago

@mackjenn I have crossed out an incorrect comment that I made in the text above for Item 2, after looking up information for our conversation. The WFC3 *blv/blctmp.fits files are still in counts! wf3ccd processes everything in counts. If a calibration reference file is in units of electrons when used during wf3ccd processing, the calibration data are divided by the gain before use. The conversion to electrons happens in the wf32d component.

Given the above information, the application of a saturation image in DN does not have to happen after BIASCORR and BLEVCORR. This correction could happen in doDQI or in a function right after doDQI.

mackjenn commented 2 years ago

@mdlpstsci Thanks for the helpful discussion today. To summarize, we will:

Fix the LTM1_1 keyword value; Strip out any extraneous header keywords for clarity
Change the reference file units to electrons for consistency with other reference files applied at this stage of calwf3 (e.g. SINKFILE, FLSHFILE). This also gives consistency with ACS SATUFILE and with the full well maps in ISR 2010-10. Since the observation will still be in units of DN, calwf3 will correct the SATUFILE to units of DN before populating DQ flags. I checked some trailer files and it looks like the pipeline applies a gain value of 1.560000 to all amps, so we will apply this. (CAN YOU VERIFY THIS VALUE?) I'm not sure if we will have to account for the fact that the saturation flags are now applied after the BIAS subtraction, which is ~2550 DN but is slightly different for each amp. We will test this.

Remake the 1x1, 2x2, and 3x3 binned files with consistent file structure, using the format of the current bias reference files (below). For your reference, these have overscan regions as defined in the attached document. WFC3-TIR-2018-03.pdf

Filename: /grp/hst/cdbs/iref/5641723bi_bia.fits
No.    Name      Ver    Type      Cards   Dimensions   Format
0  PRIMARY       1 PrimaryHDU     262   ()      
1  SCI           1 ImageHDU        35   (4206, 2070)   float32   
2  ERR           1 ImageHDU        35   (4206, 2070)   float32   
3  DQ            1 ImageHDU        35   (4206, 2070)   int16   
4  SCI           2 ImageHDU        35   (4206, 2070)   float32   
5  ERR           2 ImageHDU        35   (4206, 2070)   float32   
6  DQ            2 ImageHDU        35   (4206, 2070)   int16   
Filename: /grp/hst/cdbs/iref/56417101i_bia.fits
No.    Name      Ver    Type      Cards   Dimensions   Format
0  PRIMARY       1 PrimaryHDU     285   ()      
1  SCI           1 ImageHDU        21   (2102, 1035)   float32   
2  ERR           1 ImageHDU        20   (2102, 1035)   float32   
3  DQ            1 ImageHDU        20   (2102, 1035)   int64   
4  SCI           2 ImageHDU        21   (2102, 1035)   float32   
5  ERR           2 ImageHDU        20   (2102, 1035)   float32   
6  DQ            2 ImageHDU        20   (2102, 1035)   int64   
Filename: /grp/hst/cdbs/iref/56417095i_bia.fits
No.    Name      Ver    Type      Cards   Dimensions   Format
0  PRIMARY       1 PrimaryHDU     285   ()      
1  SCI           1 ImageHDU        35   (1402, 690)   float32   
2  ERR           1 ImageHDU        35   (1402, 690)   float32   
3  DQ            1 ImageHDU        35   (1402, 690)   int64   
4  SCI           2 ImageHDU        35   (1402, 690)   float32   
5  ERR           2 ImageHDU        35   (1402, 690)   float32   
6  DQ            2 ImageHDU        35   (1402, 690)   int64

For the serial overscan, please have calwf3 skip over these regions. We want to keep this reference file consistent with the other 'untrimmed' reference files at this stage (e.g. BIASFILE, FLSHFILE), which have these overscan regions filled with zeros.
The SATUFILE will be applied after BIASCORR and before the SINKFILE flagging. If possible it would make sense to have this be part of DQICORR (along with the sink pixels), rather than making a separate CAL switch (like ACS did). Let us know if this sounds ok.

As we discussed, you will update wfc3tools.readthedocs to better describe the actual behavior of calwf3 (e.g. the gain is not applied in wf3ccd as stated in several places). Also, the SINK pixel flagging happens after BIASCORR when DQICORR is invoked for a second time. Finally, certain reference files applied in wf3ccd are in units of electrons, so calwf3 applies the gain to these. It would be good to mention this in the documentation. We'll be happy to look this over once you have an initial draft.

Thanks again!!

mdlpstsci commented 2 years ago

@mackjenn To avoid scope creep, the update to the wfc3tools.readthedocs will be done as a separate ticket. Git Issue #67 of the spacetelescope/wfc3tools repo.

To clarify, the saturation map was applied in the calacs pipeline after BIASCORR and BLEVCORR, as the ACS team specified. The application of the saturation map was not triggered by a new keyword or switch. Rather, if and only if BIASCORR and BLEVCORR are performed, the saturation map is applied. From what you have documented in the above comment, you effectively want the same methodology of "triggering" the saturation map in calwf3. More specifically, if and only if DQICORR is set to PERFORM, then both the SINK pixel flagging and the full-well saturation flagging will happen.

Please note I have also explicitly confirmed that WFC3 UVIS data is converted to electrons during the FLATCORR step.

mackjenn commented 2 years ago

@mdlpstsci The new files for testing in units of electrons are here:

/grp/hst/wfc3k/mack/satufile_uvis/

wfc3_uvis_sat_1x1_v2.fits
wfc3_uvis_sat_2x2_v2.fits
wfc3_uvis_sat_3x3_v2.fits

Can you verify that the precise gain value applied by calwf3 is 1.56 e/DN, as written in the trailer files?

Thanks for opening a separate ticket regarding the documentation. I have acquired a copy of the calwf3 flowchart diagram and will modify accordingly and upload a new version to that ticket once approved by the branch.

mdlpstsci commented 2 years ago

@mackjenn

Yes, the gain value in actual use is 1.56.
I do not have permission to access the new saturation files. It looks like I can only get as far as /grp/hst.
As for the flowchart - cool.

mackjenn commented 2 years ago

Oops! I keep sending the wrong directory. They are here: /user/mack/uvis_satufile/

mdlpstsci commented 2 years ago

@mackjenn I have one more item to clarify. When WFC3 data is finally converted to electrons in the FLATCORR stage of CALWF3, the code uses the mean gain for all four amps for every pixel. This is in contrast to using the gain for the specific amp corresponding to the pixels. For flagging the pixels, I was going to use the applicable gain for the specific amp, but then I thought I should check with you.

mackjenn commented 2 years ago

Hi @mdlpstsci ,

I believe the amp-dependent gain is incorporated as part of the flat field science array. As a validation, when I look at the saturation map (in electrons) derived from calibrated FLT data, there are no quadrant boundaries. Thus, I think you can just use the mean gain value.

However, you may be right that we need to account for this. In that case, we can adjust our SATUFILE map by the amp-dependent gain and validate through testing.

A question for you... Does calwf3 process data with non-default gain values? I'm not sure if there is archival data with Gain=1.0, 2.0, or 4.0 that needs corresponding SATUFILEs.

mdlpstsci commented 2 years ago

@mackjenn The ccdtab file contains the correspondence between the commanded and actual values. The commanded amp, gain, bias, chip, offset, and bin sizes, as read from the science image header, are used as keys into the ccdtab information. Once the proper row is selected, the actual gain of amps 1-4 is extracted from the table for use in processing the specific dataset. There are entries in ccdtab to support gains of 1.0, 1.5, 2.0, and 4.0. CALWF3 is agnostic, by design, to the gain value. I am trying to search the on-line cache now to see if there are datasets with gains of 1.0, 2.0, or 4.0.
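The lookup described above amounts to selecting one table row by its commanded values. The following is a hypothetical, simplified sketch: the key names, the row contents, and the gain numbers are made up for illustration and are not the real CCDTAB columns or values.

```python
# Simplified stand-in for CCDTAB rows (illustrative keys and values only).
CCDTAB_ROWS = [
    {"amp": "ABCD", "gain": 1.0, "chip": 1, "binx": 1, "biny": 1,
     "amp_gains": (1.01, 1.02, 1.02, 1.03)},
    {"amp": "ABCD", "gain": 1.5, "chip": 1, "binx": 1, "biny": 1,
     "amp_gains": (1.55, 1.56, 1.56, 1.57)},
    {"amp": "ABCD", "gain": 1.5, "chip": 1, "binx": 2, "biny": 2,
     "amp_gains": (1.55, 1.56, 1.56, 1.57)},
]


def select_row(rows, **commanded):
    """Return the single row whose entries match all commanded values."""
    matches = [row for row in rows
               if all(row[key] == value for key, value in commanded.items())]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one match, found {len(matches)}")
    return matches[0]


# Commanded values as they might be read from a science image header.
row = select_row(CCDTAB_ROWS, amp="ABCD", gain=1.5, chip=1, binx=1, biny=1)
actual_gains = row["amp_gains"]  # "actual" per-amp gains for this dataset
```

Because the gain is just one of the lookup keys, the same code path serves any commanded gain that has a row in the table, which is the sense in which the processing is agnostic to the gain value.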

mdlpstsci commented 2 years ago

Searching the on-line cache was not working out. I asked Matt who has access to a database to find out for me about WFC3 gains. I forgot to tell him only for UVIS as the gain=2.5 looks like IR. It looks like IR can have commanded gains of 2.0, 2.5, 3.0, and 4.0.

+-------------+-------+
| w3r_ccdgain | count |
+-------------+-------+
| 1           | 2     |
| 1.5         | 91002 |
| 2           | 148   |
| 2.5         | 92035 |
| 4           | 2     |
+-------------+-------+

mackjenn commented 2 years ago

@mdlpstsci Are you sure the amp-dependent gain is applied after FLATCORR? Can you do a quick check with values of 1.0 in each amp and see what happens?

While the CCDTAB lists gain values for each amp, both Sylvia and I were under the impression that only the average gain value is used and that the flats account for the amp-dependent ratios.

For now, let's just stick with a single gain when converting the SATUFILE from electrons to DN. We can test this later with photometry. Also for now, let's ignore non-default gain; we can add it later if needed.

mdlpstsci commented 2 years ago

@mackjenn

(1) The gain is applied during the FLATCORR step. As of 2009/2010, the mean gain of all amps is used when applying the gain correction during FLATCORR. If you and Sylvia think something different happens, please let me know so that I can verify.

(2) I have processed data with the new code using a mean gain value of 1.0. As expected there are differences in the output files from the nominal gain in use of 1.56. Is there something specific you want me to check?

(3) I have implemented the use of the mean_gain for converting the SATUFILE back into DN for its application to the data.

(4) As for images with non-default gain values... I am not sure in detail how this works, but I imagine this process is dependent upon how you set up the rules with CRDS for selection of the new SATUFILE to populate the headers of the RAW files. The SATUFILE file chosen will be based upon whatever criteria has been set up for selection. As long as the gain is in the CCDTAB file, then the mean gain will be used to convert the SATUFILE values into DN for application.
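Items (3) and (4) can be summarized with a small sketch of the unit conversion. This is not the calwf3 source: the function names and the per-amp gain values are hypothetical, and 71760 e- is used only as a sample saturation-map value.

```python
def mean_gain(amp_gains):
    """Average the per-amp gains (e-/DN) into a single value."""
    return sum(amp_gains) / len(amp_gains)


def electrons_to_dn(value_e, gain):
    """Convert a reference value from electrons to DN."""
    return value_e / gain


# Hypothetical per-amp gains for amps A-D; real values come from the CCDTAB.
amp_gains = [1.54, 1.55, 1.57, 1.58]
gain = mean_gain(amp_gains)              # about 1.56 e-/DN

sat_e = 71760.0                          # sample saturation value, electrons
sat_dn = electrons_to_dn(sat_e, gain)    # about 46000 DN

print(f"mean gain = {gain:.2f} e-/DN, saturation = {sat_dn:.0f} DN")
```

The science data stay in DN at this stage, so it is the saturation map that is scaled, and the comparison is then made in DN.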

mdlpstsci commented 2 years ago

@mackjenn Part of my testing is to understand the structure of your new SATUFILE calibration images. I see that you have made the corrections in terms of the header keywords and file structure which is great. However, I do have some additional observations regarding these files which I would like to verify.

Background: It is understood the actual saturation map dimensions are a subset of the image extension dimensions in the FITS file. The actual saturation map portion will be set to the full-well saturation values derived by the team, and the remaining portion of the image is set to 0.0. This means the various overscan regions are set to 0.0 and will be ignored during the process to flag pixels which exceed the full-well saturation value.

1) For the full-frame datasets (all bin sizes), the saturation map has fewer rows than that of the corresponding science image. For example, the 1x1 saturation map for Chip 2 has only rows 1 - 2048, for columns 1 - 25, populated with the value of 71760, leaving rows 2049 - 2070 set to 0.0. Please note row and column numbers are 1-based in this discussion. However, the science data actually populates rows 1 - 2051 which means rows 2049, 2050, and 2051 will be flagged with 256 since the valid science data exceeds the value of 0.0.

Note that I am using this portion of the saturation map and the science image for illustration. To put it more plainly, the saturation map area with pixel values of 71760 should have the same dimensions as the trimmed science image for both chips. For all binning cases, the saturation map has fewer rows than the corresponding science images.

Just in case I misunderstand, is this what you want?

2) The saturation map has "holes" which are set to 0.0. This means that science data will be flagged where these holes are present. I had not expected holes.

Is this what you want?
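Both observations (missing rows and holes) are the kind of thing a quick array check can surface. Below is a minimal sketch under stated assumptions: the helper names are hypothetical, the toy array is made up, and a real map would also contain legitimately zero overscan regions that this simple bounding-box test would count as holes.

```python
import numpy as np


def populated_rows(sat_map):
    """Return (first, last) 1-based rows containing any nonzero value."""
    rows = np.flatnonzero((sat_map != 0).any(axis=1))
    if rows.size == 0:
        return None
    return int(rows[0]) + 1, int(rows[-1]) + 1


def count_holes(sat_map):
    """Count zero pixels inside the bounding box of the nonzero region."""
    nonzero = sat_map != 0
    rows = np.flatnonzero(nonzero.any(axis=1))
    cols = np.flatnonzero(nonzero.any(axis=0))
    if rows.size == 0:
        return 0
    box = sat_map[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
    return int((box == 0).sum())


# Toy 5x4 map: three populated rows, two empty rows, and one hole.
toy = np.zeros((5, 4))
toy[0:3, :] = 71760.0
toy[1, 2] = 0.0

row_range = populated_rows(toy)   # populated rows, 1-based
holes = count_holes(toy)          # zero pixels inside the populated area
```

Comparing the populated row range of the map against that of a science image would show directly whether the two active areas agree.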

mackjenn commented 2 years ago

Oops, didn't mean to close this.

mdlpstsci commented 2 years ago

@mackjenn For responses to your latest questions, please see my comments above from two days ago. The first comment contains my responses and a question. The second comment has to do with the saturation file construction. I wish the comments were dated rather than relative (e.g., n days ago).

mdlpstsci commented 2 years ago

@mackjenn From my comment of two days ago, Item 1... Looking at a BIAS reference file and comparing it to the SAT file, I see that there are 22 rows of zero (rows=2049-2070) at the top of Chip 2, as well as at the bottom (rows=1-22) of Chip 1. Indeed, the "active image area" for these reference files has fewer rows than the SCI images. This is different from the FLS file, which only has 19 rows of zero (rows=2052-2070) at the top of Chip 2, as well as at the bottom (rows=1-19) of Chip 1.

I am just checking to make sure this is what you want.

mackjenn commented 2 years ago

Thanks for your careful checks and for uncovering these errors. Below are comments on several topics:

Empty rows: Thanks for pointing out the new issue with the rows near the chip gap. It turns out bias files are empty from rows 2049-2070, whereas the flashfile is only empty from 2052-2070. (I've asked the team why this might be... ). In the meantime we will fix this in the SATUFILE.

Holes: We didn't realize they were here and will fix this too. Thanks.

Gain: I may have misunderstood what you said about the CCDTAB reading and applying amp-dependent values. To avoid further confusion, we'll just agree that calwf3 will apply a constant gain value for all amps when converting the SATUFILE to DN. If needed, we will incorporate gain offsets in the SATUFILE to achieve the optimal DQ flagging in flat-fielded stellar observations.

Binning: Can you point us to a binned science image so we can match the dimensions of the 2x2 and 3x3 science arrays? Again, we copied the BIASFILEs which were missing 3 rows of data for each chip, so it makes sense that we still have issues with the overscan.

Thanks for your patience as we work through these items. :)

mdlpstsci commented 2 years ago

@mackjenn First, I am happy to help. I always assume I have done something not quite right before I bother the team! (Even then I may not catch everything.)

Empty rows: There are three empty rows for both Chip 1 and Chip 2. I only discussed Chip 2.

Holes: I did not think you wanted these.

Gain: Yes, I am just using the mean_gain for all the amps.

Binning: Here are some binned datasets.
3x3: iaao11odq_flt.fits (bias and problem image) and iacs02trq_flt.fits (flat)
2x2: ibtq25q9q_flt.fits and icu504e0q_flt.fits

mdlpstsci commented 2 years ago

@mackjenn I am going to give your team access to the executable (and code too) of the updates made to calwf3 to accommodate support for a saturation image so you can test once you have the updated reference files prepared. As such, I do have one item to confirm with you before I set up the software for your use.

Since there is no "SATCORR" calibration step switch, the saturation image is currently applied if and only if both BIASCORR and BLEVCORR are set to PERFORM. This is how the ACS folks wanted the logic set up. However, WFC3 does not have the ACS restrictions. Do you want the new saturation flagging to be done regardless of any other calibration step switches?
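The triggering rule described here can be written as a small predicate. This is a sketch of the logic only, not the calwf3 source: the function name, the dictionary of switches, and the default of treating a missing switch as OMIT are assumptions.

```python
def should_apply_satmap(switches, required=("BIASCORR", "BLEVCORR")):
    """True only if every required calibration switch is set to PERFORM."""
    return all(
        switches.get(name, "OMIT").strip().upper() == "PERFORM"
        for name in required
    )


# Calibration switches as they might be read from a primary header.
both_on = {"BIASCORR": "PERFORM", "BLEVCORR": "PERFORM"}
bias_off = {"BIASCORR": "OMIT", "BLEVCORR": "PERFORM"}

print(should_apply_satmap(both_on))    # both steps requested
print(should_apply_satmap(bias_off))   # BIASCORR not requested
```

Passing a different `required` tuple, for example `("DQICORR",)`, would express the alternative of tying the flagging to DQICORR instead.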

I should add that I am going on vacation next week, so I do not want to be the bottleneck in your being able to test.

mackjenn commented 2 years ago

@mdlpstsci The ACS logic for the masking sounds good.

Thanks for offering to share the calwf3 executable. We'll make those fixes to the reference files and then run some tests while you're away to verify that we've got it working. Much appreciated!

mdlpstsci commented 2 years ago

@mackjenn I am sending instructions on how to access the CALWF3 executable via email.

mdlpstsci commented 1 year ago

Resolved by PR #563 #585 #586

spacetelescope / hstcal

Add support to calwf3 for Full Well Saturation Maps (SATUFILE) #558