spacetelescope / jwst

Python library for science observations from the James Webb Space Telescope

https://jwst-pipeline.readthedocs.io/en/latest/

Other

559 stars 167 forks source link

Convert IFU extract1d and apcorr reference files from fits to asdf #5408

Closed stscijgbot-jp closed 3 years ago

stscijgbot-jp commented 3 years ago

Issue JP-1749 was created on JIRA by Jane Morrison:

A new extract1d reference file for IFU data has been created. This preliminary file has been supplied by the MIRI INS team (David Law). JP-1736 (closed) created a datamodel for this new reference file. The reference file is a fits file containing two extensions. The first extension contains basic data (single values in a table). The second extension contains a table with integer values and five columns with arrays (size set by integer value in the table). This type of data may not be easily represented by an asdf file and it may be appropriate to leave it as a fits file.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

Howard Bushouse James Davies Jonathan Eisenhamer

I am not sure who I should assign this to. For now I put myself.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

James Davies Jonathan Eisenhamer

I created a new simpler datamodel for the IFU extract1d and I used it to attempt to write out the fits to asdf converted file.

I wrote a little program that read in the extract1d MRS IFU fits files. I instantiated the new extract 1d data model. I read in the stuff in the fits file and shoved it into the data model.

Then I as like - what now ? So I set the filename to be 'test.asdf' and I did: model.save('test.asdf')

Comparing it to another asdf file it looks sort of ok except when I guess the data in the arrays is written out - it is not readable. It is like trying to read a postscript file (if you know what I mean).

For example see file below: (any pointers to what I need to add to this ?)

I am not sure where to put this py script that does this - for review - because I don't think we have scripts like this any where else.

(dev820) arizonajane2:Make_ASDF morrison$ more test.asdf

ASDF 1.0.0

ASDF_STANDARD 1.5.0

%YAML 1.1

%TAG ! tag:stsci.edu:asdf/

--- !core/asdf-1.1.0

asdf_library: !core/software-1.0.0 {author: Space Telescope Science Institute, homepage: 'http://github.com/spacetelescope/asdf',

name: asdf, version: 2.7.1}

history:

extensions:

- !core/extension_metadata-1.0.0

extension_class: asdf.extension.BuiltinExtension

software: !core/software-1.0.0 {name: asdf, version: 2.7.1}

axis_pa: !core/ndarray-1.0.0

source: 0

datatype: float32

byteorder: big

shape: [2346]

offset: 46922

axis_ratio: !core/ndarray-1.0.0

source: 0

datatype: float32

byteorder: big

shape: [2346]

offset: 37538

id: ANY

meta:

date: '2020-10-21T14:55:03.278'

description: Default MIRI MRS Extract1d parameters

filename: test.asdf

instrument: {band: N/A, channel: N/A, detector: N/A, name: MIRI}

model_type: MirMrsExtract1dModel

telescope: JWST

method: subpixel

nelem_wl: 2346

radius: !core/ndarray-1.0.0

source: 0

datatype: float32

byteorder: big

shape: [2346]

offset: 9386

region_name: target

subpixels: 10

subtract_background: true

wavelength: !core/ndarray-1.0.0

source: 0

datatype: float32

byteorder: big

shape: [2346]

...

BLK^@0^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@qk^\y^KZH@<9C>N@@<9C><9F>@<9C>@<9D>C@<9D><95>@<9D><99>@<9E>9<85>@<9E><8B>p@<9E>

\@<9F>/G@<9F><81>3@<9F>^^@% @v@@^Z@l@@^P<8F>@b{@f@^FR@X=@)@^T@N ^@@<9F>@@C@<95>@<99>@9<85>@<8B>p@

\@/G@<81>3@^^@% @v@@^Z@l@@^P<8F>@b{@f@^FR@X=@)@^T@N ^@@<9F>@

stscijgbot-jp commented 3 years ago

Comment by James Davies on JIRA:

The arrays get written out as binary blocks in the file. To investigate it, read it in in python:

import asdf

af = asdf.open("test.asdf")
af.info()

wavelength = af.tree["wavelength"]
radius = af.tree["radius"]

etc.

You might want to put those items 1 level lower. I.e. in a "data" section so when you read it in:

table = af.tree["data"]
radius = table["wavelength"]
...

Or if you have a datamodel that reads this in, then even easier.

stscijgbot-jp commented 3 years ago

Comment by James Davies on JIRA:

Also, for an ASDF file, you probably don't need some set number of elements in an array. As long as radius and wavelength are the same length, they can be any length. So nelem_wl might not be needed either. No need for padding.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

I got a bit confused. So I am trying to make an asdf file. I have an array of 7 different arrays. One of them is wavelength. So how do shove these arrays into an asdf table ? Should I import asdf or import astropy.table and use astropy table method ?

Or maybe I am not understanding what wavelength = af.tree['wavelength'] does is that reading in a wavelength from a table or writing it out ?

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

success. I finally have an asdf file of the miri_extract1d fits reference file. It is not done because I need to figure out how to add the units of data to file. But David Law take a look at ████████████████████████████████ I wrote the arrays so they are readable not binary. The default is binary for these size arrays. I like to be able to read it easily but we can also store it as binary.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

I have a asdf file ready to review. It can be found at █████████████████████████████████

The original miri extract1d fits reference file is also in that directory. The asdf file have the same root name but ends with asdf.

The python script to make this file is also located in that directory. The PR #⁠5383 contains the datamodel that was used to make this reference file and will be use to read in the the extrac1d reference file once we finalize the format of this new asdf file.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

David Law

I am not sure if we need 'id' in the first extension of the extract1d ref file. That variable is not used in the code. Should it be ?

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

David Law James Muzerolle The NIRSpec extract1d ref file has values for filter and grating. To be consistent should the MIRI one also have Channel and Band - similar to the miri apcor ref file ?

stscijgbot-jp commented 3 years ago

Comment by Rosa Diaz on JIRA:

Currently these files are JSON files. Are we saying that these will become ASDF files?

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

We will remove the JSON files for IFU data and replace with an ASDF file.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

James Muzerolle David Law I converted by the MIRI and NIRSPEC fits files to asdf. I realized we have not thought out what to do in case of NIRSpec data that covers a single grating but multiple bands. How do we select the radius to use ?

For calspec2 output we are fine that will alway have a single grating/single filter. But for calspec3 output I believe we could have a single grating but multiple filters - so how do we select the radius ?

Also cube_build will build a cube (off line) for anything you toss at it. So it could cover all the gratings and all the filters. Not that may folks would do that - but what should extract1d do in this case ?

James is it possible to just create a single set of array what vary with wavelength but not have it depend on the grating/filter or at a minimum have a set for each grating

stscijgbot-jp commented 3 years ago

Comment by James Muzerolle on JIRA:

Jane Morrison The only gratings that work with more than one filter are G140M/H, though the F070LP case will likely be rare for IFU mode because the blue detector cut-off truncates most of its unique coverage. However, I think what you're really talking about is combination of multiple gratings of the same resolution, which I didn't have in mind when creating the file. In that case, the only issue would be if the reference file specified a different radius in the wavelength overlap regions, which is probably unlikely. In principle, we could just have a single wavelength-dependent vector for all dispersers. There might be a compatibility problem with our aperture correction reference file, since that currently has disperser-dependent vectors.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

James Muzerolle So I am mainly worried about IFU cube that come out of spec3 pipeline. I think in that case the association is set up to only use data from grating of the same resolution. Is that correct ?

I actually have forgotten do any of the gratings have the same resolution so we could have calspec3 pipeline products created from more than 1 grating ? The simplest thing would be if we don't have to have an extraction radius that depends on the grating and filter.

But as you say then we might have problems with the aperture correction reference file. How much does the aperture correction depend on grating/filter ?

stscijgbot-jp commented 3 years ago

Comment by David Law on JIRA:

Jane Morrison Regarding the 'id' entry, I have no idea what this is for, I just copied it from the json file. If it's not used, and we don't even know what it is, I'd be inclined to remove it.

Regarding gratings/channel/band: I added channel/band to the apercorr reference file with the notion that even though we didn't need it right now it would be consistent with NIRSpec and allow us to do different things for different channels in future if we want. I'm not sure how likely it is that we'd need a different aperture correction at the same wavelength in the band overlap region, but I suppose it's conceivable. You're right though that if we have it for aperture correction it might make sense to also have it for the spectral extraction file.

Assuming that James Muzerolle finds that NIRSpec aperture correction depends enough on grating/filter that it can't be ignored, what I suggest is that we:

Both MIRI and NIRSpec add grating/filter or channel/band to both the Extract1d and AperCorr reference files. They should minimally have an entry for CHANNEL=BAND='ANY' (or GRATING=FILTER='ANY'). This is what MIRI has right now, which is a set of vectors that span the entire wavelength range of the instrument.
NIRSpec has entries for specific channel/band/grating/filter as necessary. MIRI could do this in future if necessary.
When extract1d runs, if it's operating on a single grating/filter/channel/band and there is a corresponding entry for it, it should use those values. Otherwise if running on multiband or there are no channel-specific entries, it should do what it does right now and pull out the wavelength ranges that it needs from the ANY entries.

Would that make sense?

stscijgbot-jp commented 3 years ago

Comment by James Muzerolle on JIRA:

Jane Morrison Among the "M" or "H" gratings, the average resolution is about the same. However, each has a dependence with wavelength such that the resolution of the longer-wavelength adjacent grating will be considerably lower in the region of overlap with the shorter-wavelength grating. E.g., see attached figures for G235M and G395M.

Only gratings of the same average resolution should be combined ("M" or "H"). PRISM should never be combined with anything else. I think these rules have been incorporated into calspec3, though it hasn't been tested.

In principle, the aperture corrections should primarily depend on wavelength. However, we don't have the data to definitively rule out specific grating/filter dependences that need to be taken into account, so we included each case separately in the reference file.

stscijgbot-jp commented 3 years ago

Comment by Howard Bushouse on JIRA:

The rules regarding filter/grating combinations in associations are in the ASN generator rules. Right now they're setup to allow for multiple filters with the same grating, but not multiple gratings. This is due to the current performance issues we have with building large NIRSpec IFU cubes, so we intentionally limit level-3 ASN's to only 1 grating. In the future than could be relaxed to allow the combinations you indicate above (e.g. an M with another M, or an H with another H).

stscijgbot-jp commented 3 years ago

Comment by James Muzerolle on JIRA:

Just now seeing David Law's comment. That sounds like a reasonable approach.

The more I think about it, the less I see the need for separate aperture radius vectors per disperser. If there is a non-negligible effect on the PSF, it can be dealt with using different aperture correction values. Let me run it by the team, and I can send you a new extract_1d file by COB tomorrow.

stscijgbot-jp commented 3 years ago

Comment by James Muzerolle on JIRA:

Howard Bushouse Thanks for the clarification! I couldn't quite remember where we ended up with that, I know at one point there was also an issue with outlier_detection and the wavelength scale... Regardless, I think David's proposed approach should provide enough flexibility for either a single-grating or combined cube.

stscijgbot-jp commented 3 years ago

Comment by David Law on JIRA:

On the topic of ASDF:

I think I see how the conversion is being done. Rather than need a conversion script I would probably crib the necessary code from Jane's script and write the reference file out in this form directly.
How would asdf support the structure that we were just sketching out above, in which there are potentially multiple 'container' rows with their own values of FILTER or GRATING and aperture correction arrays in the columns corresponding to each? I don't have a good feeling for how asdf handles multidimensional information nested in tabular container objects.
Do we know why reading in the asdf version of this file takes ~3x longer than the FITS version?
It looks like the information that used to be in the 1st extension of the FITS file (simple numbers) has been collapsed into the meta. information along with the information that used to be in the 0th extension header. As such there's now only the equivalent of one 'data' extension, but -does asdf have a similar method to FITS of breaking information into conceptually-related named 'extensions' of information or is everything stored/retrieved in a flat format? I realize this is perhaps more of a general question, but relating to the overall utility of storing reference files in asdf format. (Edit, answered my own question as YES)-

stscijgbot-jp commented 3 years ago

Comment by Howard Bushouse on JIRA:

Jane Morrison regarding the "id" value: that is used in the original json versions of the extract1d reference files as a selector for the case where the ref file contains extraction params for more than one mode or aperture/slit. For NIRSpec fixed-slit, for example, "id" contains the names of the 5 fixed slits and hence allows for having different extraction params for each slit (the extract_1d step finds the id value that matches the slit name being processed). Similarly for MIRI LRS where it's used to select different params for FIXEDSLIT vs SLITLESS data. So if the new reference file doesn't need to do this kind of selection, then you don't need the "id" value.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

James Muzerolle David Law

I like David's suggestion on the form of the reference files. It is similar to how cubebuild works. I will just need to put a little logic in to extract1d to know when the s3d cube is from a single grating/band or more than one combination. David when you can make a reference file (maybe after the miri rehearsal) I will play around with it in making as asdf file. I know the calibration developers like asdf because it is python, but this reference file is getting a bit complicated so lets just see where this all goes. I personally like being able to use fv to open a binary fits table and quickly see the structure and arrays.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

David Law James Muzerolle

In did update the py script to convert the fits to asdf for NIRSpec. I could not figure out how to use the datamodel to write out the nirspec file using the simple asdf file I created for MIRI. So instead I used a dictionary to write out the file. I basically defined

model['data'][filter][grating] - then arrays for each case. I updated the code and I put the NIRSPEC asdf file there.

████████████████████████████████

I stopped there. I was unsure how to handle the case of more grating/filter case in the actual code. If we go for an array of values for each grating/filter and then a set for ANY - I might need to make the datamodel more complex to handle all the types - but for now I will wait and see if that is how we go.

stscijgbot-jp commented 3 years ago

Comment by James Muzerolle on JIRA:

Jane Morrison I think we've decided to have the extract_1d file be just a single vector, but the aperture corrections will remain a function of disperser. I'll have a new fits version of extract_1d later today.

The current version of the aperture correction file is more complicated, containing separate arrays for different aperture sizes and "pixel phases". This is because it was originally thought that the same file/data model could cover both MOS and IFU. However, the additional content is only needed for MOS data. I'm thinking it would be clearer in the long run to have a separate file for IFU; we only really need a single aperture size, and the pixel phase is unimportant.

jemorrison commented 3 years ago

@jmuzerolle when you get put the next extract_1d file some place for me to grab. It seems straight forward and we could probably convert it to asdf and it would be clean. The one thing I don't like is how I ended up storing the units - in the meta data. That may not be correct. I will make another attempt at figuring out the units. I agree that it seems a separate apcorr file for IFU and MOS seems best.

stscijgbot-jp commented 3 years ago

Comment by James Muzerolle on JIRA:

Jane Morrison You can find the latest version of the extract_1d file in ███████████████████████████████████████████ I added the second extension with parameters, and I also put in the units for each column in extension 3.

stscijgbot-jp commented 3 years ago

Comment by David Law on JIRA:

Jane Morrison I've started tinkering around now with constructing the Extract1D reference file directly in an asdf format, see if https://stsci.app.box.com/file/735671899062 matches about what you'd expect. If so I'll work on the apercorr file as well. By the way, with BAND and CHANNEL, remind me whether those should be N/A or ANY? Code for now at https://github.com/STScI-MIRI/miri3d/blob/master/miri3d/x1d/make_x1d_asdf.py

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

James Muzerolle David Law

I updated the datamodel for the ifuextract1d. You can find the simple data model for both NIRSPEC and MIRI in PR #⁠5383

I used this data model to convert the MIRI and NIRSpec fits files to asdf. I put the code for the convert and the converted asdf files

on ████████████████████████ Take a look at these files. The one thing I am not fond of with asdf is it writes out the arrays in alphabetic order. So the data arrays are before the meta (which contains the first extension).

We can change the datamodel if you like - just let me know. I am going to put in a ticket for the folks who work on asdf library to have meta data written before data.

There are a few questions/suggestions

with the datamodel method we do not need NELEM_WL in the table so it can be removed

we do not need ID in the first extension any more - David that can be remove (James already removed it)

James there are a few keywords I think are standard for reference files that are missing. I just hard coded them in to conversion code for NIRSpec:

instrument should be NIRSPEC (all caps - the checker for datamodels gives an error if it is not)

missing 'use after', 'version' and typo in 'exp_type' (P_EXP_TY)

Also the 'history' is not copied into the asdf file (I forgot). I can add that or you can add that to your code - whatever you like.

So I look over the convert_ifu_extract1d.py code and look over the asdf files. DO you want any changes ? If not I highly suggest that you take the convert_ifu_extract1d.py code and merge it into the code you use to create the reference files. It is better to have 1 reference file creation code - rather than 2 steps.

I think we are pretty close. So I am going to change the extract1f code to use the new data model (instead of the JSON file or the FITS files). This will clean up the code quite a bit.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

David Law I just saw your comment. I was going to ask you. I don't think we need CHANNEL or BAND

I think we just need exp_type, since channel and band are not used in the selection process or in the code. BUt I am not sure.

jemorrison commented 3 years ago

@drlaw1558 Yeah I think that asdf file you created ilooks good except that I updated the datamodel today so the the units of the data arrays are in the data sections. That seemed more logical after I thought about it . What do you think ? If you can use the datamodel I have on the branch #5383 to make your file then it will be seemless to read it in in extract1d.

stscijgbot-jp commented 3 years ago

Comment by David Law on JIRA:

Jane Morrison One of the issues I've got with the datamodel approach is that it seems to put the process backwards and make it much more complicated to change the file format in future or to create a new file (if you can't make a new file without reading in a datamodel, and you can't create a datamodel without knowing what the new file needs to look like). I'd rather add a check against the datamodel at the end of the code that can be turned off when we know we're making an update that deliberately changes the format. Can you remind me how to do this check?

I'll take a look at your latest script to figure out how to move the unit keywords- I agree that's a more logical place for them.

stscijgbot-jp commented 3 years ago

Comment by David Law on JIRA:

Ok, I've updated the asdf file on Box again.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

David Law

I have spent the morning learning more about asdf. So if you use the datamodels to create the reference file there are many benefits. It will work seamlessly with the pipeline, we just use datamodels.open(ref_file) and the data is loaded. BUT - this is really cool - you can add arrays to the model in your code using the datamodels and it will just work. So it allows you to add things to the reference file and still use the data models. Now eventually we will have to update the datamodels to use the new information but we would have to update the code anyway to use a new reference file. I put my example in convert_ifu_extract1d.py on █████████████████████████████████

This is what I did

model = datamodels.IFUExtract1dModel()

Fill in like before

lets add some new test arrays

wavelength = model.data.wavelength * 500 # just make something up to test

model.data.aa = wavelength

When the asdf file is written out data.aa is there. See miri-extract1d-59121_withnewarray.asdf WOW.

Also I researched that if the reference file is created another way - say using a dictionary. You can write the file out and then read it in using asdf open and to check that it is compatible with datamodels do this

af = asdf.open(output_file, custom_schema="http://stsci.edu/schemas/jwst_datamodel/ifuextract1d.schema")

af.validate()

stscijgbot-jp commented 3 years ago

Comment by David Law on JIRA:

Thanks Jane Morrison , that's helpful to know. I'm trying to figure out why my validation isn't working properly; I'm on the latest version of your branch, and deliberately creating a deficient reference file missing the 'wavelength' array. Validation keeps passing it with no problems though!

E.g., I'm writing out my new file

ff.write_to(outfile,all_array_storage='inline')

Then reading it back in using the data model:

model=IFUExtract1dModel(outfile)

model.validate()

and receiving no errors from the validation. Similarly if I do:

_af = asdf.open(outfile, custom_schema="http://stsci.edu/schemas/jwst_datamodel/ifuextract1d.schema")_

af.validate()

Although the method above initially gave me errors about float32 vs float64, once I fixed those it doesn't complain about the completely-missing wavelength array. Also, how does it grab that schema when the web address in the line doesn't exist? Does it not actually pay attention to the and grab it from the local jwst product instead?

Likewise if I do it another way as

_with IFUExtract1dModel(outfile,strictvalidation=True) as im:

print("Looks ok")

it also passes ok.

Any thoughts? It seems worrying that validation isn't picking up on such a major deliberate omission.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

I will have to ask someone about validate. I put in an incorrect datamodel and it resulted in a much of errors. But I think the way to go is using the datamodels then we do not have to worry about validation and you can add things as you want and it will still work. Then we know that it will work in the pipeline.

I think validate checks if there is something incompatible not that there are missing arrays because it is trying to be flexible and allow users to only use part of the schema. I think.

stscijgbot-jp commented 3 years ago

Comment by David Law on JIRA:

Hm, indeed- if I just read in a blank IFUExtract1dModel, populate none of the necessary information, and then write it back out again I don't get any complaints from the validation routines. For the time being I'll just ignore validation, but it would be good to understand what exactly validation is meant to do if it doesn't test that the file contains the data elements actually required by the pipeline. Should I file a new ticket on this to separate the discussions?

jemorrison commented 3 years ago

You can also ask James Davies about asdf validation. That is who I go to. So are you planning on using the datamodels to create the asdf file ? I think is straight forward. When datamodels is used to create a ref file it writes that to the file. Then in the code I can use datmodels.open and the reference file is loaded. If you don't use datamodels I think I need to use asdf.open and then I have to figure out how to shove the asdf file into the datamodel. Let me know I am going to try and update the extract1d code to use the new ref file and new datamodel today

drlaw1558 commented 3 years ago

I'm planning to move over to using the datamodel for Extract1D (since it exists), but am doing some testing to understand things better at the same time. It looks, for instance, like even if I make the asdf file without the data model and construct the dictionary directly it still reads in ok using datamodels.open so long as I've populated the datamodl keyword in the metadata. So, I'd say go ahead with using datamodels.open and I'll just make sure that whatever I give you later today can open that way.

stscijgbot-jp commented 3 years ago

Comment by David Law on JIRA:

Nearly done with converting to using the data model to generate the file, but running in a problem with the history entry. I'm doing:

entry = util.create_history_entry("1D Extraction defaults") model.history.append(entry) entry = util.create_history_entry("DOCUMENT: TBD") model.history.append(entry) entry = util.create_history_entry("SOFTWARE: https://github.com/STScI-MIRI/miri3d/tree/master/miri3d/x1d/make_x1d.py") model.history.append(entry) entry = util.create_history_entry("DATA USED: CDP-7") model.history.append(entry)

but when I try to convert the model into an asdf file (ff=asdf.AsdfFile(model)) I get crashes saying:

AttributeError: 'HistoryList' object has no attribute 'entries'

Is there a different way that I should be adding history entries? I've been following https://jwst-pipeline.readthedocs.io/en/latest/jwst/datamodels/models.html#history-information

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

I was able to add HISTORY by using ASDF 'add_history_entry'

example:

ff = AsdfFile(model) # where model contains al the meta and data arrays

ff.add_history_entry('DATA USED: CDP-7')

ff.add_history_entry('SOFTWARE: ')

ff.add_history_entry('DOCUMENT: TBD')

ff.write_to(output_file)

stscijgbot-jp commented 3 years ago

Comment by David Law on JIRA:

Thanks! So I guess the history is a property of the asdf file, not of the model object within it.

With that change I've now created updated model-based asdf versions of both the extract1d and apercorr reference files. Since the apercorr data model is out of date I started with the extract1d model and rewrote the model name in the metadata; hopefully that works ok for now.

See the usual Box folder for new versions of both dated today (MJD 59152)

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

David Law

I grabbed both of those asdf files.

So for MIRI the apcor file is not going to depend on channel or band correct, but for NIRSpec it will. For MIRI we could make the default have a channel and band label of some type but they are 'ANY' right now.

James Muzerolle I believe the plan for the aper cor ref file is if the ifu contains data for more than 1 grating/band then we will use a default general set of arrays. If the data is from a single grating/band then we will use the grating or band for that specific case. It is still a bit trick what to do in the overlap cases because of the different resolutions but I think we are not worrying about that now. I just need to figure how to make the datamodel for NIRSPEC.

stscijgbot-jp commented 3 years ago

Comment by David Law on JIRA:

If it helps with making a single datamodel for the two IFUs I can tweak the MIRI case to match, just let me know. The latest code to make the MIRI files is at https://github.com/STScI-MIRI/miri3d/blob/master/miri3d/x1d/make_x1d.py by the way.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

David Law

Ok I had this working inside the python test code. But now I can not seem to read the asdf file you created.

from jwst import datamodels

model = datamodels.IFUExtract1dModel('miri-extract1d-59152.asdf') but when look at model.data.wavelength it has size 0 Can you see what I am doing wrong ?

stscijgbot-jp commented 3 years ago

Comment by David Law on JIRA:

Huh- for whatever reason the version of the code in which I start with

model = datamodels.IFUExtract1dModel()

add the necessary pieces, and then write it out isn't working. Or at least, it's writing out the file and I can see the radius/wavelength/etc arrays populating, but when reading it back in using the datamodel they're all empty.

On the other hand, if I create the asdf file from scratch ignoring the datamodel and constructing the dict manually it seems to read back in using the datamodel just fine:

aa=datamodels.open('miri-extract1d-59153.asdf')

aa.data.wavelength

I've no idea why, but for the time being I've regenerated both reference files using the manual approach so that they should work (now MJD 59153)

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

James Davies David Law David Law and I have both come to the same conclusion independently. When we make an asdf file using the datamodels we cannot read it in using the datamodels. But when we make an asdf array using a dictionary we can read it in using the datamodels. Makes no sense. For the file created using datamodels I can read in the meta data but not the data array data. Everything is read in fine if I just use the file created using a dictionary . I have put an example of the an asdf file created using a datamodel (and the datamodel) miri-extract1d-59121_datamodel.asdf and and file created using a dictionary - miri-extract1d-59121_dict.asdf. I also have the routine the create the files and how I tried to read in the files (read_ifu_extract1d.py). If you have time could you look at the two files. The files are at ████████████████████████████████ The only difference I can see is that one created using the dictionary has a bit of a different lay out for how the data arrays are stored: dictionary one: data: axis_pa: !core/ndarray-1.0.0 data: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, (edited) datamodel one: data.axis_pa: !core/ndarray-1.0.0 data: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, (edited)

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

David Law James Muzerolle

I updated the ifuextract1d and mirmrs_apcorr schemas. David I made 'data' 'apcorr_table' to be inline with how the other schemas work.

I took the fits MIRI files you sent me used a dictionary to convert them to an asdf file. I am still trying to track down why when we use the datamodel to create the reference file it does not read in correctly. David when you have time look at GitHub #pr 5383 it has the new mirmrs_apcorr schema you can use the make the reference file. I created one for testing. For reference I have put how I have converted the fits to asdf on ████████████████████████████████

I next have to update the nirspec apcorr schema for ifu data. James Muzerolle I have lost track did you make up a reference file for this type of data ? It is ok if you did not I think I know what we need and I can start creating the datamodel. I THINK (??) that the datamodel will be more complex than for miri - an apcorr table for each type of data. The extract1d step will read the s3d header and figure out which grating and filters were used. So I think it makes sense to put all the apcorr tables in a single asdf file - rather than splitting them into ones based on grating/filter. The extract1d code will then pull out the apcorr table that is needed. Make sense ?

stscijgbot-jp commented 3 years ago

Comment by James Muzerolle on JIRA:

Jane Morrison I haven't had the chance to create a new apcorr file, so go ahead with the asdf implementation if you're ready. It should be fine to have a single file with multiple tables, that's how the current combined MOS+IFU fits file is structured. You can use that file to get the nominal wavelength intervals for each disperser. I'm adding Graham Kanarek as a watcher since he created the original version and will likely be the one to submit revisions in the future. Also adding Beth Sargent so she's in the loop. Just to be clear, are we talking about having the asdf version submitted directly to CRDS? If so, will you be able to share your code for creating it? I'd like to avoid having to duplicate effort, since we don't have anything already in place.

jemorrison commented 3 years ago

@jmuzerolle
I will take a look at the existing NIRSpec Apcorr file. I have started creating a new schema & datamodel just for NIRSPEC IFU data. I will definitely share the code and then you will have to probably fill it in with real data for radius and apcor. For MIRI these are have 2 dimensions - 3 X 2346. Are you planning on having a similar type 2 D array or just a 1 d ? Creating the reference files in asdf has a bit of a learning curve. The routine I will send you uses dictionaries to create the asdf file. Eventually when we are happy with the files these will go into CRDS and replace existing files (in the case of the new extract1d ref file) or add to the existing ref files (in the case of the apcorr file for NIRSpec IFU data)

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

James Muzerolle Hum the current apcor file for MOS and IFU data has the appcor as 3 dimensions ? Wavelength 1 D size 2028 size 2 D (2048, 3)

apcorr 3D (3,2048,3) Is it 3 D because of something it is also used with MOS. Could you explain the 3D and if I only need to pull out part of the array for the IFU apcorr.