spacetelescope / jwst

Python library for science observations from the James Webb Space Telescope
https://jwst-pipeline.readthedocs.io/en/latest/

HDRTAB in a DataModel is only a numpy array #5001

Open stscijgbot-jp opened 4 years ago

stscijgbot-jp commented 4 years ago

Issue JP-1479 was created on JIRA by Jonathan Eisenhamer:

When reading in a file that has the HDRTAB extension, the resulting item in the model is a numpy.ndarray. The expectation would be an actual astropy.table.Table.

jdavies-st commented 4 years ago

Related to

https://github.com/spacetelescope/jwst/issues/3092

stscijgbot-jp commented 8 months ago

Comment by David Law on JIRA:

Setting to Trivial priority, up to Tyler Pauly whether or not to close.

stscijgbot-jp commented 8 months ago

Comment by Tyler Pauly on JIRA:

I guess I don't know the use case where a HDRTAB is present. Jonathan Eisenhamer, do you know/remember when this would be present? I think this could be work for stdatamodels, but I don't know where this pops up.

stscijgbot-jp commented 8 months ago

Comment by Howard Bushouse on JIRA:

A HDRTAB is a FITS table extension that gets added to all combined products (e.g. i2d, s2d, s3d) coming out of level-3 pipelines. It contains lists of header keyword values from all the input exposures that went into making a combined product. The table is created via the fitsblender (or is it now called modelblender?) utility.

jdavies-st commented 8 months ago

HDRTAB is the result of blending together header keywords when one has a resampled level 3 image (an _i2d, for example) that was built from many inputs (_cal files). The HDRTAB is a FITS binary table extension with a row for each input and a column for each FITS header keyword, holding the value that keyword had in each input file.
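The row-per-input, column-per-keyword layout described above can be sketched with a NumPy structured array, which is the kind of object the issue reports getting back from the model. This is a minimal illustration only: the file names and values are made up, and the three keywords are a small subset picked from the real column list.

```python
import numpy as np

# Hypothetical keyword values for three input exposures (illustrative only).
rows = [
    ("input_a_cal.fits", "NRCALONG", 100.0),
    ("input_b_cal.fits", "NRCBLONG", 120.0),
    ("input_c_cal.fits", "NRCALONG", 110.0),
]

# One field (column) per header keyword, one record (row) per input file.
dtype = [("FILENAME", "U32"), ("DETECTOR", "U8"), ("EFFEXPTM", "f8")]
hdrtab = np.array(rows, dtype=dtype)

print(hdrtab.dtype.names)     # the keyword (column) names
print(hdrtab["DETECTOR"])     # one keyword across all inputs
print(hdrtab[0]["FILENAME"])  # one keyword for a single input
```

Even without converting to a Table, the named fields give column-style access by keyword.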

It's not clear to me we should even be producing these, because lineage of input data is specified by association tables, and the input _cal files are listed. The blending of headers is also problematic - DETECTOR is meaningless in a NIRCam _i2d file since there are always at least 2 and sometimes 8 detectors represented in a single _i2d file.

I never understood the need for complicated blending rules, except for a single rule where, if an attribute is not the same in all the input files, it gets removed from the output file (set to None). And a HDRTAB need not be produced, as it mostly has redundant data in it that is already in the _i2d header for the relevant keywords.
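As a rough sketch of that single rule, assuming headers are represented as plain dicts (the function name blend_unanimous and the sample values are hypothetical, not part of jwst):

```python
def blend_unanimous(headers):
    """Keep a keyword only if every input has the same value for it.

    Keywords that differ between inputs, or are missing from some
    inputs, are set to None in the blended output.
    """
    # Collect every keyword seen in any input, preserving first-seen order.
    keys = []
    for header in headers:
        for key in header:
            if key not in keys:
                keys.append(key)

    blended = {}
    for key in keys:
        values = [header.get(key) for header in headers]
        first = values[0]
        blended[key] = first if all(v == first for v in values) else None
    return blended


# Illustrative inputs: two exposures that differ only in DETECTOR.
headers = [
    {"INSTRUME": "NIRCAM", "DETECTOR": "NRCALONG", "FILTER": "F444W"},
    {"INSTRUME": "NIRCAM", "DETECTOR": "NRCBLONG", "FILTER": "F444W"},
]
blended = blend_unanimous(headers)
print(blended)
```

Here DETECTOR ends up as None because the inputs disagree, while INSTRUME and FILTER are carried through.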

FWIW, the code that produces these is called from resample and it lives here

https://github.com/spacetelescope/jwst/tree/master/jwst/model_blender

And given the test coverage

---------- coverage: platform darwin, python 3.11.7-final-0 ----------
Name                               Stmts   Miss  Cover
------------------------------------------------------
jwst/model_blender/__init__.py         0      0   100%
jwst/model_blender/blender.py         88     19    78%
jwst/model_blender/blendmeta.py      121     82    32%
jwst/model_blender/blendrules.py     232     86    63%
jwst/model_blender/textutil.py        15      0   100%
------------------------------------------------------
TOTAL                                456    187    59%

is it worth keeping?

All that said, you can read the tables like so:

In [1]: from jwst import datamodels

In [2]: from astropy.table import Table

In [3]: i2d = datamodels.open("jw01345-o004_t024_nircam_clear-f444w_i2d.fits")

In [5]: t = Table(i2d.hdrtab)

In [6]: t
Out[6]: 
<Table length=6>
          DATE          ORIGIN TIMESYS ...       RA_REF           DEC_REF           ROLL_REF    
         str23           str5    str3  ...      float64           float64           float64     
----------------------- ------ ------- ... ----------------- ----------------- -----------------
2022-06-22T14:43:27.026  STSCI     UTC ... 214.8289084977943 52.81412783319971 130.5877689552636
2022-06-22T14:43:28.348  STSCI     UTC ... 214.8824998985255 52.85064125318538 130.6303041303303
2022-06-22T14:43:29.761  STSCI     UTC ... 214.8817388805081 52.85081805946688 130.6295933280304
2022-06-22T14:43:31.304  STSCI     UTC ... 214.8821191880491  52.8507294091914  130.630054748654
2022-06-22T14:43:32.761  STSCI     UTC ... 214.8292890563504 52.81403981685559 130.5880175248571
2022-06-22T14:43:34.268  STSCI     UTC ... 214.8285285696465 52.81421622465611 130.5873077867482

In [9]: t['FILENAME']
Out[9]: 
<Column name='FILENAME' dtype='str48' length=6>
jw01345004001_08201_00001_nrcalong_o004_crf.fits
jw01345004001_08201_00002_nrcblong_o004_crf.fits
jw01345004001_08201_00003_nrcblong_o004_crf.fits
jw01345004001_08201_00001_nrcblong_o004_crf.fits
jw01345004001_08201_00002_nrcalong_o004_crf.fits
jw01345004001_08201_00003_nrcalong_o004_crf.fits

In [7]: t.columns
Out[7]: <TableColumns names=('DATE','ORIGIN','TIMESYS','FILENAME','SDP_VER','PRD_VER','OSS_VER','CAL_VER','CAL_VCS','DATAMODL','TELESCOP','HGA_MOVE','HGA_STRT','HGA_STOP','PWFSEET','NWFSEST','ASNPOOL','ASNTABLE','TITLE','PI_NAME','CATEGORY','SUBCAT','SCICAT','CONT_ID','DATE-OBS','TIME-OBS','DATE-BEG','DATE-END','OBS_ID','VISIT_ID','PROGRAM','OBSERVTN','VISIT','VISITGRP','SEQ_ID','ACT_ID','EXPOSURE','BKGDTARG','TEMPLATE','OBSLABEL','ENG_QUAL','ENGQLPTG','VISITYPE','VSTSTART','VISITSTA','NEXPOSUR','INTARGET','TARGOOPP','TSOVISIT','EXP_ONLY','TARGPROP','TARGNAME','TARGTYPE','TARG_RA','TARG_DEC','TARGURA','TARGUDEC','MU_RA','MU_DEC','MU_EPOCH','PROP_RA','PROP_DEC','SRCTYAPT','INSTRUME','DETECTOR','MODULE','CHANNEL','FILTER','PUPIL','PILIN','GRATING','BAND','FXD_SLIT','FOCUSPOS','PREIMAGE','CCCSTATE','CORONMSK','MSASTATE','MSAMETFL','MSAMETID','MSACONID','LAMP','OPMODE','GWA_XTIL','GWA_YTIL','GWA_XP_V','GWA_YP_V','GWA_PXAV','GWA_PYAV','GWA_TILT','FWCPOS','PWCPOS','EXPCOUNT','EXPRIPAR','EXP_TYPE','EXPSTART','MJD-BEG','EXPMID','MJD-AVG','EXPEND','MJD-END','TDB-BEG','TDB-MID','TDB-END','READPATT','EXSEGNUM','EXSEGTOT','NOUTPUTS','NINTS','INTSTART','INTEND','NGROUPS','NFRAMES','MIRNGRPS','MIRNFRMS','FRMDIVSR','GROUPGAP','DRPFRMS1','DRPFRMS3','NSAMPLES','TSAMPLE','TFRAME','TGROUP','EFFINTTM','EFFEXPTM','XPOSURE','DURATION','TELAPSE','NRSTSTRT','NRESETS','ZEROFRAM','DATAPROB','SCA_NUM','DATAMODE','NRS_NORM','NRS_REF','SCTARATE','GAINFACT','IS_IMPRT','IS_PSF','SELFREF','SUBARRAY','SUBSTRT1','SUBSTRT2','SUBSIZE1','SUBSIZE2','FASTAXIS','SLOWAXIS','DETXCOR','DETYCOR','DETXSIZ','DETYSIZ','PATTTYPE','PRIDTYPE','PRIDTPTS','PATT_NUM','PATTSTRT','NUMDTHPT','PATTNPTS','NRIMDTPT','NOD_TYPE','PATTSIZE','SMGRDPAT','SUBPXPTS','SUBPXPAT','SPEC_NUM','SPECNSTP','SPECSTEP','SPCOFFST','SPAT_NUM','SPATNSTP','SPATSTEP','SPTOFFST','XOFFSET','YOFFSET','REFFRAME','EPH_TYPE','EPH_TIME','JWST_X','JWST_Y','JWST_Z','OBSGEO-X','OBSGEO-Y','OBSGEO-Z','JWST_DX','JWST_DY','JWST_DZ','OBSGEODX','OBSGEODY','OBSGEODZ','APERNAME','PPS_APER','PA_APER','VA_RA','VA_DEC','VA_SCALE','BARTDELT','BSTRTIME','BENDTIME','BMIDTIME','HELIDELT','HSTRTIME','HENDTIME','HMIDTIME','FAM_LA1','FASTEP1','FAUNIT1','FAPHASE1','FA1VALUE','FAM_LA2','FASTEP2','FAUNIT2','FAPHASE2','FA2VALUE','FAM_LA3','FASTEP3','FAUNIT3','FAPHASE3','FA3VALUE','RMA_POS','FCSRLPOS','GS_ORDER','GSSTRTTM','GSENDTIM','GDSTARID','GS_RA','GS_DEC','GS_URA','GS_UDEC','GS_MAG','GS_UMAG','GS_V3_PA','PCS_MODE','VISITEND','GSACSTAT','GSCENTX','GSCENTY','GS_EPOCH','GS_MURA','GS_MUDEC','GS_PARA','CRDS_VER','CRDS_CTX','R_AREA','R_MSAOPE','R_BARSHA','R_CAMERA','R_COLLIM','R_CUBPAR','R_DARK','R_DISPER','R_DISTOR','R_DRZPAR','R_EXTR1D','R_FILOFF','R_FLAT','R_DFLAT','R_FFLAT','R_SFLAT','R_FORE','R_FPA','R_FRINGE','R_GAIN','R_IFUFOR','R_IFUPOS','R_IFUSLI','R_IPC','R_LASTFR','R_LINEAR','R_MASK','R_MSA','R_OTE','R_PTHLOS','R_PERSAT','R_PHOTOM','R_PSFMAS','R_READNO','R_REFPIX','R_REGION','R_RESAMP','R_RESOL','R_RESET','R_RSCD','R_SATURA','R_SPKERN','R_SPPROF','R_SPTRAC','R_SPCWCS','R_STRAY','R_SUPERB','R_THRPUT','R_TRPDEN','R_TRPPAR','R_TSPHOT','R_V2V3','R_WAVCOR','R_WAVRAN','R_WAVMAP','S_PSFALI','S_AMIANA','S_AMIAVG','S_AMINOR','S_WCS','S_MTWCS','S_BKDSUB','S_BARSHA','S_COMB1D','S_IFUCUB','S_DARK','S_DQINIT','S_TELEMI','S_ERRINI','S_EXTR1D','S_EXTR2D','S_FRSTFR','S_FLAT','S_FRINGE','S_GANSCL','S_GRPSCL','S_GUICDS','S_IMPRNT','S_IPC','S_JUMP','S_KLIP','S_LASTFR','S_LINEAR','S_MSBSUB','S_MRSMAT','S_MSAFLG','S_OUTLIR','S_PTHLOS','S_PERSIS','S_PHOTOM','S_RAMP','S_REFPIX','S_RESAMP','S_RESET','S_RSCD','S_SATURA','S_SKYMAT','S_SRCCAT','S_SRCTYP','S_PSFSTK','S_STRAY','S_SUPERB','S_TSPHOT','S_TWKREG','S_WAVCOR','S_WFSCOM','S_WHTLIT','BKGMETH','BKGLEVEL','BKGSUB','BUNIT','PHOTMJSR','PHOTUJA2','PIXAR_SR','PIXAR_A2','RADESYS','RA_V1','DEC_V1','PA_V3','WCSAXES','CRPIX1','CRPIX2','CRPIX3','CRVAL1','CRVAL2','CRVAL3','CTYPE1','CTYPE2','CTYPE3','CUNIT1','CUNIT2','CUNIT3','CDELT1','CDELT2','CDELT3','PC1_1','PC1_2','PC1_3','PC2_1','PC2_2','PC2_3','PC3_1','PC3_2','PC3_3','CD1_1','CD1_2','CD2_1','CD2_2','S_REGION','WAVSTART','WAVEND','DISPAXIS','SPORDER','V2_REF','V3_REF','VPARITY','V3I_YANG','RA_REF','DEC_REF','ROLL_REF')>

jdavies-st commented 8 months ago

HDRTAB is a feature without a use case for JWST. For HST, there were not (and still are not) well-defined associations, so a drizzled mosaic could have anything as input, hence the need for tracking it in HDRTAB.

This is not the case in JWST, where every data product produced from an association has the association name in it, and the association is archived with the data, so traceability is guaranteed.

stscijgbot-jp commented 8 months ago

Comment by Howard Bushouse on JIRA:

The "blending" process involves 2 major phases:

1) Creating the HDRTAB extension, in which all of the original metadata for each input exposure is listed. Hence in the case of NIRCam multi-detector images, there will be separate entries showing all of the detectors that contributed (NRCA1, NRCA2, NRCA3, etc.).

2) Deciding what to do with the keywords in the primary header of the combined image. This is where blending rules are applied. For keywords that are known to be constant across all the inputs, the value from the first input is used. For keywords that have numerical values, like start/end times or total exposure time, rules that compute the min, max, mean, sum and so on are used to populate the keyword in the combined image. For keywords like DETECTOR, there are rules available to set the output primary header keyword to states like "MULTIPLE" (which is what's done now).
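The second phase can be sketched roughly as a lookup from keyword to rule. This is only an illustration of the idea, not the actual model_blender implementation: the rule names, the KEYWORD_RULES mapping, the blend function and the sample values are all assumptions made for the example.

```python
from statistics import mean


def multi(values):
    """Return the common value, or "MULTIPLE" when the inputs disagree."""
    return values[0] if len(set(values)) == 1 else "MULTIPLE"


# Hypothetical rule names mapped to the functions that implement them.
RULES = {
    "first": lambda values: values[0],
    "min": min,
    "max": max,
    "mean": mean,
    "sum": sum,
    "multi": multi,
}

# Hypothetical assignment of rules to keywords (not the real jwst rule set).
KEYWORD_RULES = {
    "INSTRUME": "first",  # assumed constant across inputs
    "EXPSTART": "min",    # earliest start time
    "EXPEND": "max",      # latest end time
    "EFFEXPTM": "sum",    # total exposure time
    "DETECTOR": "multi",  # "MULTIPLE" if the inputs differ
}


def blend(headers, keyword_rules=KEYWORD_RULES):
    """Apply one rule per keyword to the list of input header dicts."""
    blended = {}
    for key, rule_name in keyword_rules.items():
        values = [header[key] for header in headers]
        blended[key] = RULES[rule_name](values)
    return blended


# Illustrative inputs with made-up values.
headers = [
    {"INSTRUME": "NIRCAM", "EXPSTART": 59752.1, "EXPEND": 59752.15,
     "EFFEXPTM": 100.0, "DETECTOR": "NRCALONG"},
    {"INSTRUME": "NIRCAM", "EXPSTART": 59752.2, "EXPEND": 59752.25,
     "EFFEXPTM": 150.0, "DETECTOR": "NRCBLONG"},
]
result = blend(headers)
print(result)
```

With these inputs DETECTOR becomes "MULTIPLE", EFFEXPTM is the sum of the two exposures, and EXPSTART/EXPEND span the full set of inputs.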