spacetelescope / drizzlepac

AstroDrizzle for HST images.
https://drizzlepac.readthedocs.io
BSD 3-Clause "New" or "Revised" License

SVM: sky subtraction not consistent #899

Closed stscijgbot-hstdp closed 2 years ago

stscijgbot-hstdp commented 3 years ago

Issue HLA-469 was created on JIRA by Warren Hack:

The sky subtraction code using the 'match' option does not return consistent MDRIZSKY values from one run to another, as discovered during full pipeline workflow regression testing. We need to determine what options exist for making the processing return more consistent values. This will require discussion with M. Cara (developer of the stsci.skypac code being used).

stscijgbot-hstdp commented 3 years ago

Comment by Robert Swaters on JIRA:

In addition to the differences in MDRIZSKY, there can also be differences in pixel values in generated output.

To help identify the cause, I’ve made examples available for the previous two processing runs. These are available in ███████████████████████████████████████████ The FITS files as well as the poller files are available in v2/ and v3/. The processing logs are available in v2/owl_logs/ and v3/owl_logs/. The FITS differences are in fitsdiff/.

Example differences reported:
hst_12443_1j_acs_wfc_f850lp_jboe1jdm_flc.fits_diff (in the DQ mask)
Extension HDU 3:

   Data contains differences:
     Data differs at [355, 3]:
        a> 4112
        b> 16
     ...
     4026 different pixels found (0.05% different).

hst_12443_1j_acs_wfc_f850lp_jboe1j_drc.fits_diff
Extension HDU 1:

   Headers contain differences:
     Keyword MDRIZSKY has different values:
        a> 1.460717522666056
        b> 1.459561894540377

   Data contains differences:
     Data differs at [10, 1]:
        a> 0.013063604
         ?          ^^
        b> 0.013067366
         ?        +  ^
     ...
     15521760 different pixels found (86.02% different).

Example processing logs:
████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████
████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████

Though these differences appear to be well within the noise and are not scientifically relevant, they make comparing the data against baseline products difficult. We rely on these comparisons to verify, for example, that a release candidate and the final version provide the same results. From a testing perspective, it is therefore better to get identical results when the same data is processed twice with the same software and configuration.
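To put the reported differences in context, here is a minimal sketch that checks the values quoted above against the relative tolerance of 0.0001 (absolute tolerance 0.0) listed in the fitsdiff reports later in this thread. It assumes the comparison follows the NumPy `allclose` convention, `|a - b| > atol + rtol * |b|`. The helper names `exceeds_tolerance` and `relative_difference` are illustrative and not part of fitsdiff.

```python
def relative_difference(a, b):
    """Relative difference of a with respect to the reference value b."""
    return abs(a - b) / abs(b)


def exceeds_tolerance(a, b, rtol=1e-4, atol=0.0):
    """True if a and b differ by more than atol + rtol * |b|.

    Assumption: this mirrors the numpy.allclose convention; it is a
    stand-in for the comparison, not fitsdiff's own code.
    """
    return abs(a - b) > atol + rtol * abs(b)


# Values copied from the fitsdiff excerpts in this comment.
comparisons = [
    ("MDRIZSKY", 1.460717522666056, 1.459561894540377),
    ("pixel [10, 1]", 0.013063604, 0.013067366),
]

for name, a, b in comparisons:
    rel = relative_difference(a, b)
    print(f"{name}: relative difference {rel:.2e}, "
          f"reported at rtol=1e-4: {exceeds_tolerance(a, b)}")
```

Both relative differences are larger than 1e-4, which is consistent with fitsdiff reporting them even though they are small compared with the noise.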

stscijgbot-hstdp commented 3 years ago

Comment by Warren Hack on JIRA:

A comparison of results from processing the same data as provided above but on different systems with (nearly) identical software builds gives the following results:

============================================
hst_12443_1j_acs_wfc_f850lp_jboe1j_drc.fits
============================================

From Windows run
-----------------
[INFO ] Python Version 3.6.10 |Anaconda, Inc.| (default, May 7 2020, 19:46:08) [MSC v.1916 64 bit (AMD64)]
[INFO ] numpy Version -> 1.19.1
[INFO ] astropy Version -> 4.0.1.post1
[INFO ] stwcs Version -> 1.6.0

`MDRIZSKY= 1.460715161319207 / Sky value computed by AstroDrizzle `

From Regression testing
------------------------
a: /ifs/archive/test/hst/pool_four/info/nigel_svm_off_v3/SVM/20201205/hst_12443_1j_acs_wfc_f850lp_jboe1j_drc.fits
   drizzlepac version 3.2.0rc5
   tweakwcs version 0.6.5
   stwcs version 1.6.0
   numpy version 1.19.3

b: /ifs/archive/test/hst/pool_four/info/nigel_svm_off_v2/SVM/20201203/hst_12443_1j_acs_wfc_f850lp_jboe1j_drc.fits
   drizzlepac version 3.2.0rc4
   tweakwcs version 0.6.5
   stwcs version 1.6.0
   numpy version 1.19.3

hst_12443_1j_acs_wfc_f850lp_jboe1j_drc.fits_diff
Extension HDU 1:

   Headers contain differences:
     Keyword MDRIZSKY has different values:
        a> 1.460717522666056
        b> 1.459561894540377

From workstation
-----------------
[INFO ] 3.6.7 | packaged by conda-forge | (default, Feb 20 2019, 02:51:38)
[INFO ] numpy Version -> 1.19.4
[INFO ] astropy Version -> 4.1
[INFO ] stwcs Version -> 1.5.4a1.dev82+g5c5131a

`MDRIZSKY= 1.460715161319202 / Sky value computed by AstroDrizzle `

The difference in MDRIZSKY propagates into differences in pixel values. It is most visible in the DQ arrays, where pixels that sit right at the cosmic-ray (CR) detection threshold can be flagged as CRs in one run and not in the other.
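The DQ difference reported earlier in this thread (4112 in one run, 16 in the other) can be decomposed into individual flag bits to see which flag changed. The sketch below does only the bit arithmetic. The interpretation in the comments (4096 as a cosmic ray flagged during drizzle combination, 16 as a hot pixel) is an assumption based on the usual ACS DQ flag convention and is not stated in this thread. The helper name `set_bits` is illustrative.

```python
def set_bits(value):
    """Return the individual power-of-two flags set in an integer DQ value."""
    return [1 << i for i in range(value.bit_length()) if value & (1 << i)]


dq_a = 4112  # value at [355, 3] in run a
dq_b = 16    # value at [355, 3] in run b

print(set_bits(dq_a))  # [16, 4096]
print(set_bits(dq_b))  # [16]

# The XOR isolates the flags that differ between the two runs.
# Assumption: 4096 is the drizzle-stage cosmic-ray flag for ACS.
print(dq_a ^ dq_b)     # 4096
```

Only the 4096 bit differs between the two values, which would match a pixel that crosses the CR threshold in one run and not in the other.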

stscijgbot-hstdp commented 3 years ago

Comment by Robert Swaters on JIRA:

The comparison I reported on above was between DrizzlePac 3.2.0rc4 and DrizzlePac 3.2.0rc5. The difference between these two is in the naming of the static mask.

I have done another comparison between DrizzlePac 3.2.0 (final) and DrizzlePac 3.2.0rc5. That comparison shows no unexpected differences. The comparisons for the same two files reported above are:

nigel@dmstest4 /ifs/archive/test/hst/pool_four/info/nigel_svm_final/SVM/20201209/fitsdiff 14:23$ more hst_12443_1j_acs_wfc_f850lp_jboe1jdm_flc.fits_diff

 fitsdiff: 3.2.3
 a: /ifs/archive/test/hst/pool_four/info/nigel_svm_final/SVM/20201209/hst_12443_1j_acs_wfc_f850lp_jboe1jdm_flc.fits
 b: /ifs/archive/test/hst/pool_four/info/nigel_svm_off_v3/SVM/20201205/hst_12443_1j_acs_wfc_f850lp_jboe1jdm_flc.fits
 Keyword(s) not to be compared:
  DATE PROCTIME
 Table column(s) not to be compared:
  ADDTIME DATE
 Maximum number of different data values to be reported: 1
 Relative tolerance: 0.0001, Absolute tolerance: 0.0

Primary HDU:

   Headers contain differences:
     Keyword CSYS_VER has different values:
        a> caldp_20201208
        b> caldp_initialSVM

Extension HDU 15:

   Data contains differences:
     Data differs at byte 500:
        a> 57
        b> 52
     ...
     6 different bytes found (0.01% different).

Extension HDU 24:

   Data contains differences:
     Data differs at byte 660:
        a> 57
        b> 52
     ...
     6 different bytes found (0.01% different).

nigel@dmstest4 /ifs/archive/test/hst/pool_four/info/nigel_svm_final/SVM/20201209/fitsdiff 14:25$ more hst_12443_1j_acs_wfc_f850lp_jboe1j_drc.fits_diff

 fitsdiff: 3.2.3
 a: /ifs/archive/test/hst/pool_four/info/nigel_svm_final/SVM/20201209/hst_12443_1j_acs_wfc_f850lp_jboe1j_drc.fits
 b: /ifs/archive/test/hst/pool_four/info/nigel_svm_off_v3/SVM/20201205/hst_12443_1j_acs_wfc_f850lp_jboe1j_drc.fits
 Keyword(s) not to be compared:
  DATE PROCTIME
 Table column(s) not to be compared:
  ADDTIME DATE
 Maximum number of different data values to be reported: 1
 Relative tolerance: 0.0001, Absolute tolerance: 0.0

Primary HDU:

   Headers contain differences:
     Keyword CSYS_VER has different values:
        a> caldp_20201208
        b> caldp_initialSVM
     Keyword HISTORY [535] has different values:
        a>     AstroDrizzle Version 3.2.0
        b>     AstroDrizzle Version 3.2.0rc5
         ?                               +++
     Keyword PROD_VER has different values:
        a> DrizzlePac 3.2.0
        b> DrizzlePac 3.2.0rc5
         ?                 +++

Though this run was not complete (I cut it short because it was for final verification, and there wasn't enough time to let the run complete before the planned Thursday install), there are no reports of MDRIZSKY differences, and no reports of pixel differences. I don't understand the change between the two sets of comparisons.
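A check like "no reports of MDRIZSKY differences, and no reports of pixel differences" can be automated by scanning the text of a fitsdiff report. The sketch below is a hypothetical helper (`summarize_report` is not part of fitsdiff or DrizzlePac) that relies only on the line patterns visible in the reports quoted in this thread. It counts "different pixels" lines only, so byte-level differences in table HDUs are not included.

```python
import re


def summarize_report(text):
    """Collect differing header keywords and the total differing-pixel count."""
    keywords = re.findall(
        r"Keyword (\S+)(?: \[\d+\])? has different values", text
    )
    pixels = sum(int(n) for n in re.findall(r"(\d+) different pixels found", text))
    return {"keywords": keywords, "different_pixels": pixels}


# Excerpt of the rc4 vs rc5 report quoted earlier in the thread.
earlier_report = """
   Headers contain differences:
     Keyword MDRIZSKY has different values:
        a> 1.460717522666056
        b> 1.459561894540377
     15521760 different pixels found (86.02% different).
"""

# Excerpt of the 3.2.0 vs rc5 report for the same drc file.
later_report = """
     Keyword CSYS_VER has different values:
     Keyword HISTORY [535] has different values:
     Keyword PROD_VER has different values:
"""

for label, report in [("rc4 vs rc5", earlier_report), ("final vs rc5", later_report)]:
    summary = summarize_report(report)
    print(label, summary, "MDRIZSKY differs:", "MDRIZSKY" in summary["keywords"])
```

A regression harness could fail a run whenever `MDRIZSKY` appears in the keyword list or the pixel count is nonzero, while allowing expected keywords such as `CSYS_VER` and `PROD_VER`.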

stscijgbot-hstdp commented 3 years ago

Comment by Robert Swaters on JIRA:

The issue has not recurred in recent tests.