Open braingram opened 9 months ago
@jemorrison
I opened a test PR with a modified `SpecModel` schema to add dynamic units to the `spec_table`.
https://github.com/spacetelescope/stdatamodels/pull/243
This works by:

- defining a `spec_table_units` attribute (note that this must be defined AFTER the table so astropy does not clobber the `TUNIT` keywords written when saving the attribute)
- mapping the `spec_table_units` attribute contents to the `spec_table` column `TUNIT` keywords

A test was added to:

- verify that the `spec_table_units` attribute allows modification of the `TUNIT` keywords by an external program while retaining the state on loading the file as a datamodel (although the test does not show this, if an external program adds a `TUNIT` that did not previously exist, it will be loaded into `spec_table_units` when the datamodel is opened). On read, stdatamodels will prefer `TUNIT` over the contents of `spec_table_units` in the tree.
- show that units must be changed through the `spec_table_units` attribute (instead of `spec_table.columns['WAVELENGTH'].unit`, see the failing test). This is required to keep the tree in sync with the FITS headers. On write, stdatamodels will prefer `spec_table_units` over any unit in `spec_table.columns`.

This strategy is only necessary for dynamic units (or units where we expect the user might change the unit outside the pipeline). For static units, defining them in the schema is much simpler.
Aside from the changes in #243, the test PR has only test and schema changes. No code changes appear to be necessary to make this strategy work; however, it might be nice to investigate how to avoid needing to define the unit attribute after the table, which would allow the schemas to be a bit more flexible.
FITS BinTable extensions support using `TUNIT` keywords to define units for columns within the table. The use of these keywords can be abstracted using interfaces like `FITS_rec`, which provide access to these units via the `columns` attribute (a `ColDefs` instance).

Units are used in jwst code and what follows are a few examples (and by no means an exhaustive list):

1) Extract1D uses the `FITS_rec` interface to assign units to columns in a table.
2) the miri pathloss schema assigns units to the `TUNIT` headers directly
3) the miri pathloss reference file contains `TUNIT` headers for the `PATHLOSS` BinTable
4) the mastargacq.schema also contains 'unit' entries in the datatype which appear to do nothing
5) the niswfss_apcorr schema defines the column unit separate from the table dtype (and doesn't use `TUNIT`)

There are some considerations when examining how the pipeline uses units:
compatibility with fits and asdf formats
As datamodels should be saveable in both FITS and ASDF formats, the use of `TUNIT` for saving a unit has some issues:

- the `TUNIT` `fits_keyword` definition in the schema is ignored when writing an ASDF file
- `FITS_rec` instances will be converted to structured arrays prior to writing to an ASDF file (losing any units defined in the columns)

The last option above (the niswfss_apcorr example) should work for both ASDF and FITS files (in the context of the jwst pipeline). However, opening the table directly in `astropy` (or some other FITS supporting program) will fail to associate units with the table columns as they are not using the standard `TUNIT` keyword(s) (in this case using `SIZEUNIT`).

attribute interface and model state
Depending on the state of the attribute that contains the table, `stdatamodels` doesn't appear to provide a consistent interface to the table or units. Some initial testing shows:

- a new `DataModel` will initialize the table using the datatype defined in the schema. This returns a `np.ndarray` instance with a structured datatype
- a `FITS_rec` goes through `_cast` and ends up as a `FITS_rec` after the cast (although the process currently strips the units due to needing to convert the data endianness to native/little)
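The difference between the two table states, and how units can be lost along the way, can be reproduced outside stdatamodels. The sketch below is an illustration under assumptions rather than a copy of what `_cast` does: the column names and values are made up, and copying the values into a native-endian structured array is only a stand-in for any conversion that rebuilds the table from its raw data.

```python
import numpy as np
from astropy.io import fits

# State 1: a table initialised from a dtype alone is a plain structured
# array, which has no `columns` attribute and so nowhere to keep units.
dtype = np.dtype([("WAVELENGTH", "f8"), ("FLUX", "f8")])
default_table = np.zeros(2, dtype=dtype)
print(type(default_table).__name__, hasattr(default_table, "columns"))

# State 2: a FITS_rec built from Column definitions carries units in its
# ColDefs, reachable through the `columns` attribute.
columns = [
    fits.Column(name="WAVELENGTH", format="D", unit="um", array=np.array([1.0, 2.0])),
    fits.Column(name="FLUX", format="D", unit="Jy", array=np.array([3.0, 4.0])),
]
rec = fits.FITS_rec.from_columns(columns)
rec_unit = rec.columns["WAVELENGTH"].unit
print("unit before conversion:", rec_unit)

# Copy the values into a native-endian structured array. The numbers are
# preserved, but the array has no link back to the column definitions.
native = np.zeros(len(rec), dtype=dtype)
for name in dtype.names:
    native[name] = rec[name]

# Rebuilding a FITS_rec from that array yields columns without units.
rebuilt = fits.FITS_rec.from_columns(fits.ColDefs(native))
rebuilt_unit = rebuilt.columns["WAVELENGTH"].unit
print("unit after conversion:", rebuilt_unit)
```

Any conversion of this kind would need to copy the units across explicitly, or keep them somewhere other than the column definitions, for them to survive.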