spacetelescope / jwst

Python library for science observations from the James Webb Space Telescope
https://jwst-pipeline.readthedocs.io/en/latest/
Other

exp_to_source not transferring the slit meta appropriately #7792

Closed stscijgbot-jp closed 8 months ago

stscijgbot-jp commented 1 year ago

Issue JP-3088 was created on JIRA by Jonathan Eisenhamer:

Issue

exp_to_source is copying wrong and/or incomplete meta information.

Reproduce

Take any MultiSlitModel example:

    m = datamodels.open('multislitmodel.fits')
    m.meta.bunit_data
    'DN/s'
    m.slits[0].meta.bunit_data
    'MJy/sr'
    exps = jwst.exp_to_source.exp_to_source([m])
    exps['1'].exposures[0].meta.bunit_data
    'DN/s'

This should have been 'MJy/sr'

stscijgbot-jp commented 10 months ago

Comment by Jane Morrison on JIRA:

Jonathan Eisenhamer Howard Bushouse Nadia Dencheva (since she has looked at exp_to_source recently) 

I have some multislit data, so I did exactly as Jonathan suggested:

from stdatamodels.jwst import datamodels

file = 'jw02123001003_19101_00001_nrs2_cal.fits'

model = datamodels.open(file)

print(type(model))

<class 'stdatamodels.jwst.datamodels.multislit.MultiSlitModel'>

model.meta.bunit_data

Out[6]: 'DN/s'

model.slits[0].meta.bunit_data

Out[7]: 'MJy/sr'

I was thinking the real issue is that, for a MultiSlitModel, model.meta.bunit_data is DN/s. Right? This is before exp_to_source. It seems that is what should be fixed, and that happens in calspec2.

The exp_to_source routine sets the bunit_data for the slits according to what is in model.meta.bunit_data in the MultiSlitModel. As a little test, I set that value to the same as model.slits[0].meta.bunit_data and re-ran exp_to_source. When I did that, the slits in the source-based output had bunit_data values of MJy/sr.
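The mismatch described above can be checked with a small helper. This is only a sketch: `bunit_mismatches` is a hypothetical name, and the objects below are plain stand-ins that mimic the `meta.bunit_data` and `slits` attribute layout shown in the session, not real datamodels.

```python
from types import SimpleNamespace


def bunit_mismatches(model):
    """Return (slit_index, slit_unit) pairs where a slit disagrees with the parent model."""
    parent_unit = model.meta.bunit_data
    return [
        (index, slit.meta.bunit_data)
        for index, slit in enumerate(model.slits)
        if slit.meta.bunit_data != parent_unit
    ]


# Stand-in for a MultiSlitModel (assumption: only the attributes used above matter).
fake_model = SimpleNamespace(
    meta=SimpleNamespace(bunit_data="DN/s"),
    slits=[
        SimpleNamespace(meta=SimpleNamespace(bunit_data="MJy/sr")),
        SimpleNamespace(meta=SimpleNamespace(bunit_data="MJy/sr")),
    ],
)

mismatches = bunit_mismatches(fake_model)
print(mismatches)
```

The same helper should accept an opened MultiSlitModel in place of `fake_model`, since it relies only on attribute access.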

stscijgbot-jp commented 9 months ago

Comment by Howard Bushouse on JIRA:

Fixed by #8189

stscijgbot-jp commented 9 months ago

Comment by Jane Morrison on JIRA:

Howard Bushouse Jonathan Eisenhamer On the issue of the s2d slits not having the correct bunit_data value: I THINK I have tracked it down to

exp_to_source.exp_to_source

I have added one line (see the code below) and now the correct bunit_data values are in the s2d files. I added this line just as a test; we probably don't want to keep it.

I think the fact that the values are not originally in result_slit is related to the line merge_tree(result_slit.exposures[-1].meta.instance, exposure.meta.instance). I looked at that code in stdatamodels, but I am not sure what it is doing that would cause bunit_data not to be populated. Suggestions? Jonathan Eisenhamer, comments?

 

For an immediate workaround, I can set the bunit_data in the top-level model to MJy/sr in photom. This code seems to assume that will be the case, and when it is, the correct units are in the s2d files.

 

Longer term goal is to better understand what merge_tree is doing.

    for exposure in inputs:
        log.info(f'Reorganizing data from exposure {exposure.meta.filename}')

        for slit in exposure.slits:
            log.debug(f'Copying source {slit.source_id}')
            result_slit = result[str(slit.source_id)]
            result_slit.exposures.append(slit)
            merge_tree(result_slit.exposures[-1].meta.instance, exposure.meta.instance)
            result_slit.meta.bunit_data = slit.meta.bunit_data  # ADDED this line

            if result_slit.meta.instrument.name is None:
                result_slit.update(exposure)

            result_slit.meta.filename = None  # Resulting merged data doesn't come from one file
        exposure.close()
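One way such a merge could lose the slit value is sketched here with plain dictionaries. This assumes merge_tree behaves like a recursive dictionary update in which the second argument wins on conflicts; that is an assumption about stdatamodels, and `merge_tree_sketch` is a hypothetical stand-in, not the library function. The metadata values are made up for illustration.

```python
import copy


def merge_tree_sketch(a, b):
    """Recursively copy every node of dict `b` into dict `a`; `b` wins on conflicts."""
    for key, value in b.items():
        if isinstance(value, dict) and isinstance(a.get(key), dict):
            merge_tree_sketch(a[key], value)
        else:
            a[key] = copy.deepcopy(value)
    return a


# Illustrative metadata only; real meta trees hold many more nodes.
slit_meta = {"bunit_data": "MJy/sr", "photometry": {"conversion_megajanskys": 1.0}}
exposure_meta = {"bunit_data": "DN/s", "cal_step": {"photom": "COMPLETE"}}

merged = merge_tree_sketch(copy.deepcopy(slit_meta), exposure_meta)
print(merged["bunit_data"])  # the exposure-level value replaces the slit-level one
```

Under that assumption, any node present in both trees ends up with the exposure-level value, while nodes present only in the slit (here `photometry`) survive.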

stscijgbot-jp commented 9 months ago

Comment by Jane Morrison on JIRA:

The bunit value is still not written to the s2d slit extensions. Testing by Jane Morrison showed that if the photom output results are written to disk, then the bunit is written to the s2d file. If they are not saved, the bunit is missing.

 

Testing by Howard Bushouse showed: When running the whole pipeline end-to-end, such that the output of photom is just passed in memory to resample_spec, the BUNIT keywords did NOT appear in the s2d files. But if I ran resample_spec standalone, using a "cal" file as input, then the BUNIT keywords get propagated over to the s2d file. So something is happening when the datamodel gets serialized to disk.

Jonathan Eisenhamer, do you have any ideas about which part of the pipeline I should look at to fix this? I tried several things and nothing seems to work. It certainly seems to be related to the exp_to_source.py routine. Converting the results to DefaultOrderedDict(MultiExposureModel) seems to cause the bunit to vanish from the slit datamodel.

   

stscijgbot-jp commented 9 months ago

Comment by Jane Morrison on JIRA:

I ran calspec2 using a rate file and the s2d products did not contain the bunit.

I ran calspec3 on 1 cal file (produced by the calspec2 run) and the s2d products did contain the bunit.

There seems to be a difference in the way the multi-slit container is set up in calspec2 and calspec3, but after numerous tries I could not figure out how to update the resample_spec step to set up the multi-slit container the way calspec3 does.

 

stscijgbot-jp commented 9 months ago

Comment by Tyler Pauly on JIRA:

Not to further confuse the issue, but I added a small test example in ███████████████████████████████████████████ You can run the script in there (python missing_bunit_script.py) to generate the products with/without photom save_results = True, and show the difference in BUNIT values. Brett Graham

stscijgbot-jp commented 9 months ago

Comment by Tyler Pauly on JIRA:

Looking into exp_to_source's use of the stdatamodels merge_tree method, I examined the differences between the metadata trees of a MultiSlitModel and the SlitModels within it. exp_to_source attempts to update each SlitModel with metadata from the MultiSlitModel before they are separated, and the SlitModel is then housed under a MultiExposureModel. That completes the purpose of exp_to_source: sorting SlitModel instances by source rather than by exposure.

The differences between the two trees are shown in the attached files: [^multi_slit_model_meta_difference.txt] [^slit_model_meta_difference.txt]

Going through each node of the meta tree, I would judge that changes to 'asn', 'cal_step', and 'ref_file' are intended and useful. The difference in 'photometry' is not important, as the MultiSlitModel does not have an entry that would clobber the SlitModel entry - this is as intended.

The remaining issues are the clobbering of the SlitModel meta tree for bunit_data and bunit_err (the initial ticket issue), as well as model_type and wcsinfo.

I don't see a clear path toward merging the metadata trees without cherry-picking certain nodes to be prioritized Slit>MultiSlit or vice-versa; this could involve copying out the relevant metadata (e.g. bunit_data/err, model_type and wcsinfo) and re-applying it post-merge. Alternatively, we could apply updates to the SlitModel metadata only for asn, cal_step and ref_file nodes. Neither looks very clean, but I might prefer the latter if forced to choose.
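The two options above can be sketched with plain dictionaries. This is a minimal illustration, not the fix that was merged: `merge`, `merge_then_restore`, and `merge_selected_only` are hypothetical names, `merge` is an assumed stand-in for merge_tree in which the second argument wins, and the metadata values are invented.

```python
import copy

# Node names come from the discussion above; the grouping constants are hypothetical.
SLIT_PRIORITY = ("bunit_data", "bunit_err", "model_type", "wcsinfo")
EXPOSURE_PRIORITY = ("asn", "cal_step", "ref_file")


def merge(a, b):
    """Assumed stand-in for merge_tree: copy `b` into `a`, with `b` winning."""
    for key, value in b.items():
        if isinstance(value, dict) and isinstance(a.get(key), dict):
            merge(a[key], value)
        else:
            a[key] = copy.deepcopy(value)
    return a


def merge_then_restore(slit_meta, exposure_meta):
    """Option 1: merge everything, then re-apply the slit's own values."""
    saved = {k: copy.deepcopy(slit_meta[k]) for k in SLIT_PRIORITY if k in slit_meta}
    merge(slit_meta, exposure_meta)
    slit_meta.update(saved)
    return slit_meta


def merge_selected_only(slit_meta, exposure_meta):
    """Option 2: merge only the nodes the exposure should provide."""
    subset = {k: exposure_meta[k] for k in EXPOSURE_PRIORITY if k in exposure_meta}
    return merge(slit_meta, subset)


slit = {"bunit_data": "MJy/sr", "model_type": "SlitModel", "wcsinfo": {"s_region": "slit"}}
exposure = {
    "bunit_data": "DN/s",
    "model_type": "MultiSlitModel",
    "wcsinfo": {"s_region": "exposure"},
    "cal_step": {"photom": "COMPLETE"},
    "instrument": {"name": "NIRSPEC"},
}

restored = merge_then_restore(copy.deepcopy(slit), exposure)
selected = merge_selected_only(copy.deepcopy(slit), exposure)
print(restored["bunit_data"], selected["bunit_data"])
```

Both variants keep the slit-level bunit_data. They differ in what else crosses over: the first also copies every other exposure node (here `instrument`), while the second leaves anything outside asn, cal_step, and ref_file untouched.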

stscijgbot-jp commented 8 months ago

Comment by Howard Bushouse on JIRA:

Fixed by #8294