Open keflavich opened 10 months ago
Thanks for opening the issue. It appears you're using a JWST file. Do you have stdatamodels or the jwst package installed in your environment?
As you've already found, AsdfInFits is no longer supported. The deprecation of this feature in 2.15 included a warning and a description of how to migrate to stdatamodels. The deprecation (and the warning) was removed in 3.0, along with AsdfInFits itself. The docs still contain an explanation of the migration: https://asdf.readthedocs.io/en/latest/asdf/deprecations.html#asdf-in-fits-deprecation
with a link to the stdatamodels documentation for equivalent functions:
https://stdatamodels.readthedocs.io/en/latest/asdf_in_fits.html
Since this appears to be a jwst file, you might consider opening it with the jwst datamodels api: https://stdatamodels.readthedocs.io/en/latest/jwst/datamodels/models.html#opening-a-file
That's interesting; the deprecation warning did not appear in 2.15.1 - you can see the full output I received above. I've verified that there is no warning in a fresh session too:
$ /blue/adamginsburg/adamginsburg/miniconda3/envs/python39/bin/python -c "import asdf; print(asdf.__version__); asdf.open('F200W/pipeline/jw01182004001_04101_00001_nrca1_cal.fits')"
2.15.2
I'll look more at datamodels. It's unclear to me whether I can write datamodels back out, though. Maybe that's instead handled in https://stdatamodels.readthedocs.io/en/latest/asdf_in_fits.html.
Thanks for sharing the example. Would you try it with -X dev to trigger Python development mode (to show deprecation warnings)? In hindsight we should have made this a more obvious warning to catch uses like yours.
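How a deprecation warning can go unseen can be sketched with the standard library alone. By default Python hides a DeprecationWarning unless it is attributed to code in __main__, and development mode (-X dev) lifts that filter. In the sketch below, legacy_open is a hypothetical stand-in and not part of asdf; the "ignore" and "default" filter actions only approximate the two interpreter configurations.

```python
import warnings


def legacy_open(path):
    """Hypothetical stand-in for a deprecated API (not an asdf function)."""
    warnings.warn(
        "legacy_open is deprecated; use the replacement API instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return path


def collect_warnings(filter_action):
    """Call legacy_open under the given filter and return the messages seen."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(filter_action, DeprecationWarning)
        legacy_open("example.fits")
    return [str(w.message) for w in caught]


# "ignore" approximates the usual filtering of library deprecation warnings.
hidden = collect_warnings("ignore")
# "default" approximates what development mode enables for all warnings.
shown = collect_warnings("default")

print("with ignore filter:", hidden)
print("with default filter:", shown)
```

On the command line, passing -W default::DeprecationWarning is a narrower alternative to -X dev when only deprecation warnings are of interest.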
The datamodels api does include a save method. This is the api used throughout the jwst pipeline, but it is quite different from the simpler AsdfInFits. For your uses the asdf_in_fits api you linked might also work (and might be easier to use), and it is a much closer match to the previous AsdfInFits api.
Please let me know if I can help and sorry for any disruption this caused. We hoped to make the changes in a way that offered advance notice but now I think we should have used a more prominent warning.
Thanks again for opening this. I pinned this issue so that hopefully other folks that encounter this issue will see this discussion.
One jarring difference between asdf and asdf_in_fits is that asdf.write_to(..., overwrite=True) was required to overwrite an existing file, while asdf_in_fits does not accept the overwrite keyword.
Hmmm, I'm not sure I'm testing the same thing you are as I'm unable to replicate this locally.
Both AsdfInFits.write_to and stdatamodels.asdf_in_fits.write pass **kwargs to astropy.io.fits.HDUList.write_to.
If, with asdf 2.15.2 and stdatamodels 1.4.0, I create an AsdfInFits instance and save it, I am required to pass overwrite=True the second time:
import asdf.fits_embed, stdatamodels.asdf_in_fits
af = asdf.fits_embed.AsdfInFits()
af.write_to('foo.fits')
af.write_to('foo.fits') # OSError
af.write_to('foo.fits', overwrite=True) # no error
af = asdf.AsdfFile()
stdatamodels.asdf_in_fits.write('bar.fits', af.tree)
stdatamodels.asdf_in_fits.write('bar.fits', af.tree) # OSError
stdatamodels.asdf_in_fits.write('bar.fits', af.tree, overwrite=True) # no error
Would you share an example of the issue you ran into, showing which function wasn't accepting overwrite?
OK, I have a few competing problems that all stem from switching from asdf -> stdatamodels.
/blue/adamginsburg/adamginsburg/miniconda3/envs/python39/bin/python -c "from stdatamodels import asdf_in_fits as asdf; fa = asdf.open('F444W/pipeline/jw01182004001_04101_00007_nrcalong_destreak.fits'); fa.write_to('test.fits'); fa.write_to('test.fits')"
that works, no errors (it shouldn't; I overwrite test.fits)
$ /blue/adamginsburg/adamginsburg/miniconda3/envs/python39/bin/python -c "from stdatamodels import asdf_in_fits as asdf; fa = asdf.open('F444W/pipeline/jw01182004001_04101_00007_nrcalong_destreak.fits'); fa.write_to('test.fits', overwrite=True);"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/blue/adamginsburg/adamginsburg/miniconda3/envs/python39/lib/python3.9/site-packages/asdf/asdf.py", line 1427, in write_to
_handle_deprecated_kwargs(config, kwargs)
File "/blue/adamginsburg/adamginsburg/miniconda3/envs/python39/lib/python3.9/site-packages/asdf/asdf.py", line 1831, in _handle_deprecated_kwargs
raise TypeError(msg)
TypeError: Unexpected keyword argument 'overwrite'
but also, I'm finding write_to is writing invalid files. More to come, maybe.
To clarify that a bit, I am using 2.15.2 because I reverted asdf.
/blue/adamginsburg/adamginsburg/miniconda3/envs/python39/bin/python -c "import asdf; print(asdf.__version__); import stdatamodels; print(stdatamodels.__version__); from stdatamodels import asdf_in_fits; fa = asdf_in_fits.open('F444W/pipeline/jw01182004001_04101_00007_nrcalong_destreak.fits'); fa.write_to('test.fits', overwrite=True);"
2.15.2
1.8.3
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/blue/adamginsburg/adamginsburg/miniconda3/envs/python39/lib/python3.9/site-packages/asdf/asdf.py", line 1427, in write_to
_handle_deprecated_kwargs(config, kwargs)
File "/blue/adamginsburg/adamginsburg/miniconda3/envs/python39/lib/python3.9/site-packages/asdf/asdf.py", line 1831, in _handle_deprecated_kwargs
raise TypeError(msg)
TypeError: Unexpected keyword argument 'overwrite'
Thanks for sharing the examples. stdatamodels.asdf_in_fits is not a drop-in replacement for AsdfInFits or asdf, which explains the errors and file issues you're seeing.
According to the asdf_in_fits docs, open returns an asdf.AsdfFile instance (note this is not an asdf.fits_embed.AsdfInFits). For your example:
from stdatamodels import asdf_in_fits as asdf
fa = asdf.open('F444W/pipeline/jw01182004001_04101_00007_nrcalong_destreak.fits')
fa.write_to('test.fits')
fa.write_to('test.fits')
fa is an asdf.AsdfFile instance, so calling write_to('test.fits') writes an ASDF file to test.fits (note that even though this has a fits file extension, it is not a FITS file). Instead you likely want to use something like:
from stdatamodels import asdf_in_fits # let's leave this as asdf_in_fits to not confuse this with asdf
fa = asdf_in_fits.open('F444W/pipeline/jw01182004001_04101_00007_nrcalong_destreak.fits')
# here fa is an asdf.AsdfFile instance, if we want to write this to a fits file we need to use asdf_in_fits
asdf_in_fits.write('foo.fits', fa.tree)
The TypeError you shared is due to passing an overwrite argument to AsdfFile.write_to, which doesn't support (or need) that argument.
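Since the writers discussed here disagree on whether overwrite is accepted, one option is a small wrapper that applies the check itself and never forwards the flag. This is a minimal sketch under that assumption; write_guarded and demo_writer are hypothetical names, and demo_writer merely stands in for a writer that silently replaces existing files.

```python
import os
import tempfile


def write_guarded(writer, path, *args, overwrite=False, **kwargs):
    """Call writer(path, *args, **kwargs), refusing to clobber an existing path.

    The overwrite flag is consumed here and not forwarded, so the wrapper also
    works with writers that reject unknown keyword arguments.
    """
    if os.path.exists(path) and not overwrite:
        raise OSError(f"File {path!r} already exists; pass overwrite=True to replace it.")
    return writer(path, *args, **kwargs)


def demo_writer(path, payload):
    # Hypothetical stand-in for a writer that replaces files without asking.
    with open(path, "w") as handle:
        handle.write(payload)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "test.txt")
    write_guarded(demo_writer, target, "first")
    try:
        write_guarded(demo_writer, target, "second")
        refused = False
    except OSError:
        refused = True
    write_guarded(demo_writer, target, "third", overwrite=True)
    with open(target) as handle:
        final_payload = handle.read()

print(refused, final_payload)
```

With the objects from this thread, a call might look like write_guarded(fa.write_to, 'test.asdf', overwrite=True), assuming the writer takes the destination path as its first argument.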
Thanks, that makes some sense. But what I'm gathering is that there's not a drop-in replacement for AsdfInFits, so I have to completely rethink my code if I upgrade to asdf 3.0. I don't understand the asdf data model enough to parse these instructions:
https://stdatamodels.readthedocs.io/en/latest/asdf_in_fits.html
What I'm trying to do, in asdf 2.15.2 language, is this:
af = asdf.fits_embed.AsdfInFits.open('F444W/pipeline/jw01182004001_04101_00007_nrcalong_destreak.fits')
af.write_to('test.fits', overwrite=True)
I think I have to do something like:
from stdatamodels import asdf_in_fits # let's leave this as asdf_in_fits to not confuse this with asdf
from astropy.io import fits
filename = 'F444W/pipeline/jw01182004001_04101_00007_nrcalong_destreak.fits'
fa = asdf_in_fits.open(filename)
fh = fits.open(filename)
asdf_in_fits.write(filename='foo.fits', tree=fa.tree, hdulist=fh)
but this makes me uncomfortable on many levels:
The example you shared for updating your code to use asdf_in_fits looks ok but could be simpler (supplying an hdulist is optional). Did this example work for you?
from stdatamodels import asdf_in_fits
fa = asdf_in_fits.open('F444W/pipeline/jw01182004001_04101_00007_nrcalong_destreak.fits')
asdf_in_fits.write('foo.fits', fa.tree)
This is very similar to the asdf 2.15 example you shared. The biggest change is that write is now a function available from the asdf_in_fits module instead of write_to being a method on an AsdfInFits instance. This change was necessary as there is no longer any AsdfInFits class. I don't quite understand your comment that 'it's no longer writing out an object with known properties'. It doesn't matter if this is a function in a module or a method on a class; both produce equivalent files. As far as I'm aware, neither AsdfInFits nor asdf_in_fits makes any guarantee about the order of HDUs (outside of the ASDF hdu appearing last).
OK, thanks. It wasn't obvious to me that asdf_in_fits.open preserved all of the FITS HDUs - I had (incorrectly) assumed that it was extracting the ASDF HDU from the file and therefore that the FITS HDUs had to be manually added back in.
asdf_in_fits.open will convert the fits HDUs referenced in the asdf tree to arrays when constructing the AsdfFile instance. I expect this is not what you're hoping for given your last comment. For example, taking an example jwst file, if I inspect the structure (using HDUList.info) I get the following:
>> ff = astropy.io.fits.open('jw01024001001_04101_00001_mirifulong_rate.fits')
>> ff.info()
Filename: jw01024001001_04101_00001_mirifulong_rate.fits
No.  Name          Ver  Type          Cards  Dimensions    Format
  0  PRIMARY         1  PrimaryHDU      234  ()
  1  SCI             1  ImageHDU         60  (1032, 1024)  float32
  2  ERR             1  ImageHDU         10  (1032, 1024)  float32
  3  DQ              1  ImageHDU         11  (1032, 1024)  int32 (rescales to uint32)
  4  VAR_POISSON     1  ImageHDU          9  (1032, 1024)  float32
  5  VAR_RNOISE      1  ImageHDU          9  (1032, 1024)  float32
  6  ASDF            1  BinTableHDU      11  1R x 1C       [7076B]
If I open this file with asdf_in_fits.open and then write it with asdf_in_fits.write (without mapping the data arrays to HDUs), asdf_in_fits will not automatically create HDUs for the data arrays, as it does not know where any given array should be written.
>> af = stdatamodels.asdf_in_fits.open('jw01024001001_04101_00001_mirifulong_rate.fits')
>> stdatamodels.asdf_in_fits.write('foo.fits', af.tree)
>> ff = astropy.io.fits.open('foo.fits')
>> ff.info()
Filename: foo.fits
No.  Name          Ver  Type          Cards  Dimensions    Format
  0  PRIMARY         1  PrimaryHDU        4  ()
  1  ASDF            1  BinTableHDU      11  1R x 1C       [21142735B]
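A quick way to catch this kind of silent change is to compare extension names before and after a round trip. The sketch below is duck-typed and uses stand-in objects so that it runs without any FITS files; hdu_names and missing_hdus are hypothetical helpers, and the only assumption is that each HDU-like object exposes a name attribute.

```python
from types import SimpleNamespace


def hdu_names(hdus):
    """Return the extension names of any iterable of HDU-like objects."""
    return [hdu.name for hdu in hdus]


def missing_hdus(before, after):
    """Names present in `before` but absent from `after`, in original order."""
    after_names = set(hdu_names(after))
    return [name for name in hdu_names(before) if name not in after_names]


# Stand-ins mirroring the two listings above (names only, no data).
original = [
    SimpleNamespace(name=n)
    for n in ("PRIMARY", "SCI", "ERR", "DQ", "VAR_POISSON", "VAR_RNOISE", "ASDF")
]
rewritten = [SimpleNamespace(name=n) for n in ("PRIMARY", "ASDF")]

dropped = missing_hdus(original, rewritten)
print(dropped)
```

With real files, passing the objects returned by astropy.io.fits.open for the input and output should work the same way, since an HDUList iterates over HDUs that carry a name.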
As this is a jwst file, there are many benefits to using the stdatamodels.jwst.datamodels interface (including mapping data to HDUs and tree metadata to fits keywords). Using the same example file above:
>> import stdatamodels.jwst.datamodels
>> m = stdatamodels.jwst.datamodels.open('jw01024001001_04101_00001_mirifulong_rate.fits')
>> m.save('foo.fits')
>> ff = astropy.io.fits.open('foo.fits')
>> ff.info()
Filename: foo.fits
No.  Name          Ver  Type          Cards  Dimensions    Format
  0  PRIMARY         1  PrimaryHDU      234  ()
  1  SCI             1  ImageHDU         60  (1032, 1024)  float32
  2  ERR             1  ImageHDU         10  (1032, 1024)  float32
  3  DQ              1  ImageHDU         11  (1032, 1024)  int32 (rescales to uint32)
  4  VAR_POISSON     1  ImageHDU          9  (1032, 1024)  float32
  5  VAR_RNOISE      1  ImageHDU          9  (1032, 1024)  float32
  6  ASDF            1  BinTableHDU      11  1R x 1C       [7038B]
Where is stdatamodels.jwst documented? I get a null result here:
https://stdatamodels.readthedocs.io/en/latest/search.html?q=jwst&check_keywords=yes&area=default
I guess we have to go to the JWST models?
https://jwst-pipeline.readthedocs.io/en/latest/jwst/user_documentation/datamodels.html
I'll investigate, but I need access to the ASDF and to the FITS objects, which seems to take quite some digging to find in the datamodels.
Most of the documentation can be found here: https://stdatamodels.readthedocs.io/en/latest/jwst/datamodels/index.html
I can't make much sense of the JWST data models. They're documented: https://stdatamodels.readthedocs.io/en/latest/jwst/datamodels/index.html but the fundamental object I'm actually working with is the GWCS, which I can't find in the data model but can easily find in the ASDF.
af = asdf.fits_embed.AsdfInFits.open('F444W/pipeline/jw01182004001_04101_00007_nrcalong_destreak.fits')
af.tree['meta']['wcs']
vs
af = stdatamodels.jwst.datamodels.open('F444W/pipeline/jw01182004001_04101_00007_nrcalong_destreak.fits')
af.info() # I can see some features here, like meta:
meta = af['meta']
dir(meta) # shows me that there is a wcsinfo attribute
meta.wcsinfo
But that meta.wcsinfo isn't a GWCS instance, as it is for the ASDF. I don't see WCS referenced anywhere in the JWST datamodel (except spectral wcs, which is not relevant here).
Would you open an issue over at stdatamodels? https://github.com/spacetelescope/stdatamodels/issues
I'd like to leave this issue open in case other folks run into similar errors.
Huh, my approach for searching for keywords failed. There is a meta.wcs attribute, and that's the thing to modify, following:
https://github.com/spacetelescope/jwst/blob/9dfaa37241e86e8feaa264de083ad68c85bcac08/jwst/scripts/adjust_wcs.py#L250
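When attribute listings miss an entry like this, walking the nested tree for matching key names is one fallback. The sketch below does that for plain nested dicts and lists; find_keys and demo_tree are hypothetical, and applying it to a real ASDF tree assumes the tree's nodes behave like ordinary mappings and sequences.

```python
from collections.abc import Mapping


def find_keys(node, needle, path=""):
    """Yield dotted paths of every mapping key containing `needle` (case-insensitive)."""
    if isinstance(node, Mapping):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            if needle.lower() in str(key).lower():
                yield child
            # Keep descending even after a match, so nested hits are reported too.
            yield from find_keys(value, needle, child)
    elif isinstance(node, (list, tuple)):
        for index, value in enumerate(node):
            yield from find_keys(value, needle, f"{path}[{index}]")


# Hypothetical tree loosely shaped like the metadata discussed above.
demo_tree = {
    "meta": {
        "wcs": "<gwcs object>",
        "wcsinfo": {"crpix1": 1.0},
        "instrument": {"name": "NIRCAM"},
    },
    "data": [1, 2, 3],
}

matches = list(find_keys(demo_tree, "wcs"))
print(matches)
```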
@keflavich it would be helpful to us to know whether you are using JWST data or not. The datamodels solution is likely not very useful for non-JWST data, and it is on our todo list to add a more generic way of dealing with ASDF in FITS. One question about that: do you want array data within the ASDF stored automatically in FITS extensions or not?
I am working with JWST data, and I have no strong opinions: this is my first time working with GWCS and ASDF, and I'm only touching the ASDF parts because I need to modify them at a point along the JWST pipeline. My use case is therefore quite limited, and I was seeking the most expedient way to make those adjustments. I suspect my problems above all stem from copying code from someone/somewhere else that should have used datamodels under the hood but instead used low-level asdf access, which forced me to change things after an update.
That said, I'd really like there to be symmetry between asdf io and fits io when possible. E.g., I should be able to trust that the HDU order stays the same if I read in/write out HDUs, the write methods should have the same generic conventions (that are now common across astropy?), etc.
I upgraded asdf and can no longer read files. Example:
This worked on earlier versions:
I see that this is intentional: https://github.com/asdf-format/asdf/pull/1288 but it broke my workflows and was a very surprising error. I'd like to request a clearer error message and guide to migration.