spacetelescope / jwst

Python library for science observations from the James Webb Space Telescope
https://jwst-pipeline.readthedocs.io/en/latest/
Other
569 stars 167 forks source link

Initial versions of updated spectroscopic photom datamodels #4059

Closed stscijgbot closed 4 years ago

stscijgbot commented 5 years ago

Issue JP-1018 was created by Alicia Canipe:

INS is requesting initial versions of mode-separated spectroscopic photom datamodels to use to generate new spectroscopic flux calibration reference files. This will require updates to existing datamodels and some new datamodels (previously imaging and spectroscopic files were combined).

The photom reference files have multiplicative flux calibration factors.

Changes needed:

The tables below summarize the formatting updates by instrument ({color:#de350b}red text){color}:

{color:#ff8b00} THIS IS NOT FINAL YET {color} !image-2019-09-24-10-11-13-999.png!

stscijgbot commented 5 years ago

Comment by Alicia Canipe: CalWG: [~koekemoe]    SCSB: [~hodge]/[~bushouse] NIRSpec: [~cherylp]/[~epuga]/[~muzerol] MIRI: [~cracraft]/[~skendrew]/[~dlaw] NIRCam: [~hilbert] NIRISS: [~kvolk]/[~swara]

Hi all, can you please look at the information above and confirm that you agree with the flux calibration (photom) updates as we discussed in the CalWG meeting today? Is everyone in agreement with the photom column name for NIRSpec/NIRISS SOSS being called "photcorr"? I'm not sure what will be the best way to include the units in the headers given this updated file format for NIRSpec and NIRISS.

stscijgbot commented 5 years ago

Comment by Misty Cracraft: Quick question. The nelem column was added to our imager file because we combined it with the LRS data. We removed it from the Imager file when we split out the modes. Is it not needed here anymore, even if we have multi-element arrays in the wavelength and other columns?

 

stscijgbot commented 5 years ago

Comment by Alicia Canipe: I might have heard [~hodge] incorrectly – I was thinking he said we didn't need it, but that might be my mistake. Can you confirm whether or not it's needed Phil? I'll fix the table if we do need it after all.

stscijgbot commented 5 years ago

Comment by Philip Hodge: I just talked with [~bushouse] about this, and he pointed out that the array length can differ from one row to another within a table, e.g. for NIRSpec fixed-slit data, I'm sure the wavelength arrays (and therefore relresponse, etc.) will differ depending on the slit.  All the arrays in a given column have to be the same length, but that length no longer has to be specified in the schema (because Mihai fixed that problem), and the length doesn't have to be the same for all tables that use the same schema.  So we will still need an nelem column, but the arrays in the table columns only need to be as long as the longest array in a particular FITS file.  The array lengths can differ from the arrays in another file of the same type.

stscijgbot commented 5 years ago

Comment by Alicia Canipe: Thanks [~hodge]. Fixed the table to keep the nelem column.

stscijgbot commented 5 years ago

Comment by Bryan Hilbert: So it sounds like all the '500' values for wavelength, relresponse, and reluncertainty dimensions should change to say that any length can be used (but that the length has to be the same in all rows).

stscijgbot commented 5 years ago

Comment by Philip Hodge: [~hilbert], yes, but I don't know what text would fit in a small space.  Maybe your description could be given in a footnote to the table.

stscijgbot commented 5 years ago

Comment by Alicia Canipe: Thanks [~hodge] and [~hilbert]. The table is updated with a footnote indicating that arrays can be any length as long as the length is the same in all rows.

stscijgbot commented 5 years ago

Comment by Howard Bushouse: What's the meaning of the "FITS TABLE" comment in the mode column for MIRI MRS? The MIRI MRS photom ref file contains all 2-D image extensions (except for the DQ_DEF extension, which is a table).

stscijgbot commented 5 years ago

Comment by Howard Bushouse: Some of the photom models include an uncertainty/error column for the "relresponse" values, but there's inconsistency in the column names. MIRI LRS uses "relresperror", while NIRSpec IFU/MOS and FS use "reluncertainty". Now would be a good time to clean up those differences. Personally, I tend not to like the term "error", because to me it implies a mistake of some kind. What we're really talking about for these data is uncertainty in their values.

Other than that, I think the specs listed above look good, from the SCSB perspective.

stscijgbot commented 5 years ago

Comment by Howard Bushouse: For the sake of completeness, the current (and I assume future) MIRI MRS photom ref files use the BAND keyword as a CRDS selector, so you might want to add that one (in black, since it's not new) to the Selection Criteria column. That will serve as a reminder to SCSB to include the BAND keyword in the new data model schema.

stscijgbot commented 5 years ago

Comment by Alicia Canipe: I added BAND to the selection criteria and removed my confusing note to remind myself that the MIRI MRS files use the IMAGE extension and not a BINTABLE ([~dlaw] or anyone who might know – what is the reason the MRS reference files have a different format from the other modes/instruments? I don't know the history). [~skendrew]/[~cracraft], would your team have any objections to changing the "relresperror" column name in the LRS files to "reluncertainty" with this update, so that we have consistency across the instruments?

stscijgbot commented 5 years ago

Comment by Sarah Kendrew: I think that should be fine from our side yes. I'm having trouble staying on top this because we have the CPR2/3 rehearsal this week and I'm working weird hours - apologies if I am a bit slow to respond.

stscijgbot commented 5 years ago

Comment by Sarah Kendrew: we will likely also change the length of the wavelength array according to the comments from [~hodge] & [~hilbert] above. 

stscijgbot commented 5 years ago

Comment by Howard Bushouse: Regarding the "reluncertainty" column, for consistency we should add to all of the new spectroscopic data models (right now it's not listed for the NIRCam and NIRISS models). If the teams aren't able to derive values for it, they can be zero-filled.

stscijgbot commented 5 years ago

Comment by Alicia Canipe: [~bushouse] I updated the table so the "reluncertainty" column is included for NIRCam ([~hilbert]), MIRI ([~cracraft]/[~skendrew]), and NIRISS ([~kvolk]).

[~kvolk] can you confirm that you agree with the updated spectroscopic photom datamodels above?

[~pena], [~muzerol], or [~cherylp] can you confirm that the format of the spectroscopic photom (flux cal) datamodel above works for NIRSpec? This is based off the discussion we recently had with the CalWG.

stscijgbot commented 5 years ago

Comment by Kevin Volk: I had not proposed to have an uncertainty value for the spectroscopic calibration function because this is, in my view, very difficult to quantify in a useful way.  I would need to know exactly how this value is to be used in the pipeline before I can say "yes" or "no" to the addition.  If it is purely for information purposes and not used for quantitative calculations then it is fine.  If the pipeline proposes to propagate the uncertainties here into the output spectra then I need to know exactly how this would be done and thus how the values need to be defined.  Right now the input values are simulations and have no uncertainties.  On-sky we can measure different standard stars and get an uncertainty in the photometric scaling, but it will not happen quickly.

stscijgbot commented 5 years ago

Comment by Howard Bushouse: [~kvolk] As of right now the "uncertainty" and "reluncertainty" entries are purely information only. They are not used (yet) for any error propagation.

stscijgbot commented 5 years ago

Comment by Kevin Volk: Given the above the data model proposal for NIRISS seems fine. 

stscijgbot commented 5 years ago

Comment by James Muzerolle: NIRSpec only needs one "photcorr" entry: the point source conversion factor in MJy/DN/s.  The extended source calibration to surface brightness units should be calculated in the photom step using that factor and dividing by the pixel area corresponding to the aperture, which should be taken from the appropriate pixel area reference file.  In that case, src_type should not be needed as a selection criterion, only a decision point in the step code.

Also, there needs to be separate files for MOS and IFU, as those modes will have their own dedicated calibrations.

 

stscijgbot commented 5 years ago

Comment by Alicia Canipe: Thanks, [~muzerol]. I updated the specifications table according to your comment. Please verify that it matches your description.

Since this will also affect NIRISS SOSS mode, [~kvolk] can you make sure you're okay with the implementation described above by James and the updates to the datamodel table for SOSS?

[~bushouse]/[~hodge] this is a slight modification to what we originally discussed. Do you have all the information you would need from us?

stscijgbot commented 5 years ago

Comment by James Muzerolle: Looks good from my perspective.

stscijgbot commented 5 years ago

Comment by Howard Bushouse: This should work OK to have the photom step take care of doing any conversion when necessary. The photom step would've needed to know the srctype value anyway in order to select the proper table row to use. Now it'll use that information to decide whether it needs to convert the photcorr/photmjsr value before applying it to the data. It will also need to know the pixel area, but that should be available in either the photom ref file or the pixel area ref file.

I assume there's still a preference to name the column "photcorr" instead of "photmjsr", because the latter implies incorrect units? Of course now we're up to 4 of the 7 models using the name "photcorr", at which point I'm tempted to suggest that ALL of them just use "photcorr" (for consistency), but to be completely consistent we'd want to go back and change the name for the imaging mode models as well, and that's probably more work that folks would rather not do.

stscijgbot commented 5 years ago

Comment by Howard Bushouse: [~muzerol] It's easy to have separate photom ref files for the MOS and IFU modes in CRDS, but because their structure and format is the same, we only need one data model defined for them. Folks will just need to remember that "NrsMosPhotomModel" can be used for MOS and IFU photom ref files.

stscijgbot commented 5 years ago

Comment by James Muzerolle: Agreed, it should be fine to have the same data model for MOS and IFU.  I wasn't sure if the table entries were grouped by data model only, or also by reference file, so just wanted to clarify in prevent any confusion.

stscijgbot commented 5 years ago

Comment by Alicia Canipe: Okay, so I changed the table back to MOS and IFU being together – is NrsMosPhotomModel an acceptable name for NIRSpec and SCSB?

Just need verification from [~kvolk] that the above change (the photom step making the decision about source type instead of having a separate column and units) is acceptable for NIRISS SOSS mode. After that, I think we can say the format is ~finalized.

stscijgbot commented 5 years ago

Comment by Kevin Volk: Since the SOSS mode only observes point sources, there is no "decision" to be made about the source type.  So the photom step can do it, no problem, but it is a null decision in this case.

stscijgbot commented 5 years ago

Comment by Alicia Canipe: Thanks, [~kvolk]. I just wanted to make sure having only one photcorr value in units of MJy/(DN/sec) was okay for NIRISS SOSS.

[~bushouse] I will pose the question about switching all the column names to "photcorr" for all instruments so that we can standardize it in the DMSWG meeting this afternoon. Then I think we will be good to move forward with this.

stscijgbot commented 5 years ago

Comment by David Law: Adding some MRS notes: The reason MRS has a different format is because it has a fundamentally different data structure, with data from completely different filters/wavelength ranges sharing the same detector at the same time.  One set of calibrations is valid for one part of the array, and another set for the other part.  Since the operation that needs to be done in the pipeline is to divide (or multiply) pixel-wise values the simplest approach is to have a pixelwise array of correction values that can trivially incorporate different corrections for different parts of the detector.  To be sure I'm following: is the idea that other instruments/modes will have a column named 'photcorr' and a header keyword named 'photmjsr'?  Are we doing away with that header keyword and assuming that the data correction is directly to whatever the final calibrated units are?

stscijgbot commented 5 years ago

Comment by Alicia Canipe: Okay, [~bushouse], I think this is the ~final version of the format.

[~dlaw] just to follow-up after the meeting - the flux cal factors stored in the photmjsr or photmj columns contain the multiplicative factors in either MJy/sr/(DN/sec) or MJy/(DN/sec), respectively, depending on the instrument/mode. These are the values that will be applied to the data. For consistency, the MRS files should contain multiplicative factors that are in MJy/sr/(DN/sec), as well.

stscijgbot commented 5 years ago

Comment by Howard Bushouse: Updated data models merged into jwst repo in [https://github.com/spacetelescope/jwst/pull/4096]

stscijgbot commented 5 years ago

Comment by Alicia Canipe: Copying from JP-1004: To test or use the new data models (both imaging and spectroscopic) you can install the latest master branch of the jwst repo from github using: {{conda create -n jwst_dev python}} {{conda activate jwst_dev}} {{pip install git+[https://github.com/spacetelescope/jwst]}} Of course you can call the new conda environment anything you want. I've just used "jwst_dev" for this example.

stscijgbot commented 5 years ago

Comment by Alicia Canipe: Copying the E-mail I sent with instructions here: {quote}{color:#505f79}Howard has provided instructions for installing and testing the updated imaging and spec photom datamodels. I will make sure the Jira tickets also have this information (https://jira.stsci.edu/browse/JP-1004,   https://jira.stsci.edu/browse/JP-1018). The goal is to get these files delivered by October 18^th^, but the sooner they are delivered the more time we have to troubleshoot issues.{color}

{color:#505f79} {color}

{color:#505f79}+To use the new datamodels+{color} {color:#505f79}To test or use the new data models (both imaging and spectroscopic) you can install the latest master branch of the jwst repo from github using:{color}

{color:#505f79} {color}

{color:#505f79}conda create -n jwst_dev python{color}

{color:#505f79}conda activate jwst_dev{color}

{color:#505f79}pip install git+[https://github.com/spacetelescope/jwst]{color}

{color:#505f79} {color}

{color:#505f79}You can call the new conda environment anything you want. "jwst_dev" is just an example. Please comment in the Jira tickets if you have any issues with the new datamodels.{color}

{color:#505f79} {color}

+{color:#505f79}To deliver new photom files{color}+

{color:#505f79}When you are happy with your version of the new files, please follow Misty’s instructions to deliver them:{color}

{color:#505f79} {color}

{color:#505f79}The EPIC ticket is https://jira.stsci.edu/browse/CRDS-302 --> Read this for instructions on JIRA ticket handling.{color}

{color:#505f79}When files are ready to deliver, open a new CRDS ticket for each instrument/file type.{color}

{color:#505f79}Link to the EPIC ticket.{color}

{color:#505f79}Make sure that ReDCaT is ready for the files in discussion in the ticket.{color}

{color:#505f79}Go to the test server: jwst-crds-test.stsci.edu.{color}

{color:#505f79}Using the Extended Batch submit tool, fill out the delivery form, making sure to list the JIRA ticket being used.{color}

{color:#505f79}Any problems should be discussed in the JIRA ticket.{color}

{color:#505f79}Once the files are delivered and confirmed fine by ReDCaT, the JIRA ticket will be closed.{color}{quote}

stscijgbot commented 5 years ago

Comment by Bryan Hilbert: I'm having some trouble putting together spectroscopic photom files with the new NrcWfssPhotomModel. My code that used to work no longer does. It fails when I zip all of the table columns together into a numpy array using the datamodel datatypes. I'm betting that I'm just doing something wrong, but I haven't been able to figure it out. 

Here are details on my input data types:

Variable, shape, datatype of each element:

filters (24,) <class 'numpy.str'> pupils (24,) <class 'numpy.str'> orders (24,) <class 'numpy.int16'> fluxes (24,) <class 'numpy.float32'> uncs (24,) <class 'numpy.float32'> nelems (24,) <class 'numpy.int16'> waves (24, 3000) <class 'numpy.float32'> resps (24, 3000) <class 'numpy.float32'> resps_unc (24, 3000) <class 'numpy.float32'>

And here are the expected datatypes for the NrcWfssPhotomModel:

[('filter', 'S12'), ('pupil', 'S15'), ('order', '<i2'), ('photmjsr', '<f4'), ('uncertainty', '<f4'), ('nelem', '<i2'), ('wavelength', '<f4'), ('relresponse', '<f4'), ('reluncertainty', '<f4')]

The failure comes when I try zipping all the columns together:  alldata = np.array(list(zip(filters, pupils, orders, fluxes, uncs, nelems, waves, resps, resps_unc)), dtype=NrcWfssPhotomModel().phot_table.dtype)

The exception makes it sound like I'm using the wrong data type or that one or more of my arrays has the wrong shape:

ValueError: setting an array element with a sequence.

I will note that if I make the waves, resps, and resps_unc columns into 1D numpy arrays rather than 2D, then I am able to zip everything together with no error. But in that case I get an error when trying to save the datamodel, because those arrays have the wrong number of dimensions, as expected.

stscijgbot commented 5 years ago

Comment by Howard Bushouse: The whole zipping operation was a kludge that we never should've been using (now that I've had a few years to discover better ways). What you really want to construct is a python structured array, consisting of a bunch of row tuples, where each row tuple consists of the contents of each column entry. I'm attaching a sample python script that I used to test the MIRI LRS model. Look for the operations done on the "data_list" variable.

stscijgbot commented 5 years ago

Comment by Howard Bushouse: Note that the one downside to the fact that we're no longer specifying the lengths of the wavelength, relresponse, and reluncertainy arrays in the data model schema is that you have to specify formats for those columns when you create the table (as you can see in the example). And I don't know of a way to specify the formats for only those columns, so for now it just specifies the formats for all columns,

stscijgbot commented 5 years ago

Comment by Sarah Kendrew: [~bushouse] thanks for providing the script - very helpful. One comment though is that we have 2 different values of the mean pixel area, as LRS fixed slit and slitless use different parts of the detector and the mean areas are slightly different because of the distortion across the field. Right now the data model has just the one value in the metadata, is it possible to have these pixel areas added to the table so we can provide mode-specific values?

cc [~cracraft]

stscijgbot commented 5 years ago

Comment by Todd Miller: During the NIRCAM delivery we noticed that ORDER should be added to the NIRCAM row selection criteria in addition to the current FILTER and PUPIL columns.

stscijgbot commented 5 years ago

Comment by Alicia Canipe: [~jmiller] and [~hilbert] I realize now I added ORDER to the rows but not to the selection criteria list for NIRCam, sorry about that. Updated specs posted now.

stscijgbot commented 5 years ago

Comment by Howard Bushouse: [~skendrew] Regarding different pixel areas for LRS FIXEDSLIT and SLITLESS, I would advise against modifying the layout and contents of the actual table data at this late stage of the game. Another workaround is to simply create separate photom ref files for the 2 different modes, which will work because EXP_TYPE is now a CRDS selection criterion.

stscijgbot commented 5 years ago

Comment by Sarah Kendrew: OK, let's try splitting into 2 reference files so we can maintain the two separate mean pixel area values. I've attached 2 files, one with SLIT and one with SLITLESS in the filename. Can someone check these?

 

stscijgbot commented 5 years ago

Comment by Todd Miller: Aside from the missing BAND value Misty handled yesterday,  CRDS should accept these.

As you noted yesterday BAND isn't yet supported by the CAL datamodels so it needs to be added manually as a simple FITS keyword.   Just set it to N/A.

 

stscijgbot commented 5 years ago

Comment by Howard Bushouse: You're right - the BAND keyword is defined in the new MirImgPhotomModel and MirMrsPhotomModel, but we missed it in the new MirLrsPhotomModel. We'll correct that ASAP.

stscijgbot commented 5 years ago

Comment by Howard Bushouse: Some folks have asked about adding HISTORY keywords to the reference file headers. History is a special part of all data models, which doesn't need to be defined in the schema itself. You can add history keywords to an existing data model using, e.g.

{{output.history.append("MIRI LRS photom data created 14-Oct-2019")}}

{{output.history.append("Created from data in files:")}}

{{output.history.append("  jw0001.fits")}}

{{output.history.append("  jw0002.fits")}}

stscijgbot commented 5 years ago

Comment by Sarah Kendrew: Thanks for the updates! Can someone confirm that our LRS files are now in good shape for delivery? Let me know what else is needed.

stscijgbot commented 5 years ago

Comment by Misty Cracraft: If the files still need BAND added, I can update the headers and deliver the files tomorrow.

stscijgbot commented 5 years ago

Comment by Alicia Canipe: [~muzerol] I think we are nearly done with delivering the updated photom reference files using the new datamodels for MIRI, NIRISS, FGS, and NIRCam. I'm not sure who the NIRSpec contact would be for your team, in this case – let me know if NIRSpec has any questions or issues using the new datamodels to create and submit the files.

stscijgbot commented 5 years ago

Comment by James Muzerolle: [~gkanarek] is working on this for NIRSpec, I just added him as a watcher on this ticket.

stscijgbot commented 5 years ago

Comment by Misty Cracraft: If the files still need BAND added, I can update the headers and deliver the files tomorrow.

Edit: I've edited the headers and am tracking the delivery in CRDS-308.

stscijgbot commented 4 years ago

Comment by Alicia Canipe: Work is complete