kadrlica opened 3 years ago
We can certainly integrate this. Another aspect of this issue is updating the Redmine documentation to reflect the slightly adjusted format.
@kadrlica Quick question before I start implementing this. Could you provide the lines of code that a user would use to access the IMG_SCALE and IMG_MIN information if they wanted to rescale the pixel values of all the images?
Currently, they would do:

import numpy as np
from astropy.io import fits

hdu_list = fits.open(filename)  # `filename` is a placeholder for a stack file path
img_arr = hdu_list[2].data      # (N, M, height, width) pixel values in 0-65535
img_min = hdu_list[4].data      # (N, M) per-cutout, per-band minima
img_scale = hdu_list[5].data    # (N, M) per-cutout, per-band scale values
recovered_arr = img_arr / 65535 * img_scale[:,:,np.newaxis,np.newaxis] + img_min[:,:,np.newaxis,np.newaxis]
How would this process work with the new (N,M) format you're proposing?
Should be the same. My pyfits is a bit rusty, but I think it is this:
f = fits.open(filename)                  # `filename` is a placeholder for the new-format file
img_arr = f['IMAGE'].data
img_min = f['INFO'].data['IMG_MIN']      # (N, M) column of the INFO table
img_scale = f['INFO'].data['IMG_SCALE']  # (N, M) column of the INFO table
recovered_arr = img_arr / 65535 * img_scale[:,:,np.newaxis,np.newaxis] + img_min[:,:,np.newaxis,np.newaxis]
I'll send you a link to a script on the DES cluster.
Another suggestion that I mentioned on Slack was using the conventional BZERO, BSCALE instead of the IMG_MIN and IMG_SCALE definitions, since the conversion would be more intuitive to the long-time FITS user.
Here is the conversion function that is currently in stack2image.py:
def make_bvalues(amin, ascale):
    """Convert stack scale and min values to FITS BSCALE and BZERO.

    Parameters
    ----------
    amin : array minimum
    ascale : array scale value

    Returns
    -------
    bscale, bzero : FITS header keyword values
    """
    bscale0 = 1.0
    bzero0 = 32768
    amax0 = 65535
    bscale = ascale / amax0 * bscale0
    bzero = bzero0 * (ascale / amax0) + amin
    return bscale, bzero
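As a quick sanity check (a sketch with made-up values, assuming the 0-65535 uint16 convention above), the BSCALE/BZERO pair returned by make_bvalues recovers the same physical values as the IMG_MIN/IMG_SCALE formula:

import numpy as np

amin, ascale = -0.5, 12.3                        # hypothetical IMG_MIN / IMG_SCALE values
bscale, bzero = make_bvalues(amin, ascale)

u = np.array([0, 1000, 65535], dtype=np.uint16)  # stored 0-65535 pixel values
s = u.astype(np.int32) - 32768                   # the signed values FITS would store

via_minmax = u / 65535 * ascale + amin           # current IMG_MIN / IMG_SCALE recovery
via_fits = bscale * s + bzero                    # conventional FITS BSCALE / BZERO recovery
assert np.allclose(via_minmax, via_fits)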
@kadrlica Regarding the new TableHDU format (I haven't made it to the BZERO BSCALE stuff yet), how is this looking to you?
(des20a) [rmorgan@des81 full_tile_test]$ python
Python 3.7.7 (default, Mar 26 2020, 15:48:22)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from astropy.io import fits
>>> hdu = fits.open('/data/des81.b/data/stronglens/Y6_CUTOUT_IMAGES/DES2228+0001.fits')
>>> hdu.info()
Filename: /data/des81.b/data/stronglens/Y6_CUTOUT_IMAGES/DES2228+0001.fits
No. Name Ver Type Cards Dimensions Format
0 PRIMARY 1 PrimaryHDU 4 ()
1 IMAGE 1 ImageHDU 12 (45, 45, 4, 24845) int16 (rescales to uint16)
2 PSF 1 ImageHDU 16 (25, 25, 4, 24845) int16 (rescales to uint16)
3 INFO 1 BinTableHDU 19 24845R x 5C [14A, 4D, 4D, 4D, 4D]
>>> hdu['INFO'].columns
ColDefs(
name = 'ID'; format = '14A'
name = 'IMG_MIN'; format = '4D'
name = 'IMG_SCALE'; format = '4D'
name = 'PSF_MIN'; format = '4D'
name = 'PSF_SCALE'; format = '4D'
)
>>>
Actually I think I answered my own question by looking in the directory you sent me and comparing to the files you produced. Looks like a match!
Regarding switching to signed integers for the pixel values in the IMG and PSF HDUs and using BSCALE and BZERO for the recovery information (now that I've had a chance to look into it), my personal preference is to keep things as they are. We certainly could make the change, but I am leaning towards the 0-65535 pixel values rather than negative pixel values, since the 0-65535 values align with what deep learning analyses expect. I'm happy to discuss the benefits / drawbacks in more depth though!
I think the only advantage is that the unwashed masses will "intuitively" understand what BSCALE and BZERO mean. Since these stacks are designed for ML, I'm fine keeping it that way.
That was my understanding too. I think using distinct keywords like IMG_SCALE and IMG_MIN will make people aware that the scaling is done differently. I could write up an example for the Redmine page showing how to use the stacks from a BZERO / BSCALE approach (I'll be updating it anyway, so it's no trouble).
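Something like the following, perhaps (a sketch assuming the make_bvalues function above and the INFO table layout from the hdu.info() output; the file name is just the example from earlier):

import numpy as np
from astropy.io import fits

with fits.open('DES2228+0001.fits') as f:
    img = f['IMAGE'].data.astype(np.float64)  # stored 0-65535 pixel values
    amin = f['INFO'].data['IMG_MIN']          # (N, M)
    ascale = f['INFO'].data['IMG_SCALE']      # (N, M)

bscale, bzero = make_bvalues(amin, ascale)
signed = img - 32768                          # reinterpret pixels as FITS signed ints
recovered = (bscale[:, :, np.newaxis, np.newaxis] * signed
             + bzero[:, :, np.newaxis, np.newaxis])

This gives the same recovered array as the IMG_MIN / IMG_SCALE formula above.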
Summarizing our conversations on Slack in preparation for an eventual PR...
- Currently we store metadata in several (N, M) arrays stored as ImageHDUs.
- In addition, we have one TableHDU.
- There was a proposal to combine these into one TableHDU, called INFO or something, that contained (N, M) columns for IMG_MIN, IMG_SCALE, PSF_MIN, and PSF_SCALE. This would be similar in format to how MOF/SOF band-by-band information is stored.

I argue that it would be more natural for the user to find this metadata in a single TableHDU rather than in separate ImageHDUs. I've written a converter script (a sketch is below), but it would be better to integrate this natively.