spacetelescope / jwst

Python library for science observations from the James Webb Space Telescope
https://jwst-pipeline.readthedocs.io/en/latest/

Modify NIRSpec IFU WCS to improve runtime and usability #8338

Closed stscijgbot-jp closed 4 weeks ago

stscijgbot-jp commented 7 months ago

Issue JP-3564 was created on JIRA by David Law:

At present, it takes much longer to build NIRSpec IFU cubes than it does to build MIRI cubes. For a typical single-exposure cube, this is about 9 seconds for an MRS cube and 45 seconds for a NIRSpec cube. While NIRSpec has more pixels, the runtime is driven by the 36 seconds spent in the subroutine map_nirspec_pixel_to_sky, compared to 1 second spent in the corresponding map_miri_pixel_to_sky routine.

At the same time, it is not possible to interact with the NIRSpec IFU WCS in the same manner as for other JWST instruments.  Typically,

model.meta.wcs.transform('detector', 'world', x, y)

would give the world coordinates associated with any pixel. For NIRSpec, though, it's necessary (both in cube building and anywhere else working with a cal.fits file) to call nirspec.nrs_wcs_set_input for each of the 30 slices in turn on all of the input xx,yy pixels, and save the meaningful results. This is what cube building does in the map_nirspec_pixel_to_sky routine, querying the bounding box for each slice to figure out the affected pixels.
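The per-slice pattern described above can be sketched with a pure-Python toy. Everything here (the slice count aside) is an invented stand-in, not the real jwst implementation; in the pipeline, build_slice_wcs corresponds to nirspec.nrs_wcs_set_input and transform to the slice's gwcs transform:

```python
# Toy illustration of the per-slice loop; all names and shapes are invented.

def build_slice_wcs(slice_id):
    # Stand-in for nirspec.nrs_wcs_set_input(model, slice_id); in the real
    # pipeline this construction step is the expensive part (~75% of runtime).
    x0 = slice_id * 10
    bounding_box = (x0, x0 + 10)  # pixel columns covered by this slice

    def transform(x, y):
        # Stand-in for the slice's detector->world transform.
        return (0.01 * x, 0.01 * y, 1.0 + 0.001 * slice_id)

    return bounding_box, transform

def map_pixels_to_sky(n_slices=30):
    world = {}
    for slice_id in range(n_slices):
        (x_lo, x_hi), transform = build_slice_wcs(slice_id)
        # Evaluate only the pixels inside this slice's bounding box,
        # keeping just the meaningful (in-slice) results.
        for x in range(x_lo, x_hi):
            world[x] = transform(x, 0)
    return world

world = map_pixels_to_sky()
print(len(world))  # 300 pixels mapped across 30 slices
```

The point of the toy is the structure: the costly setup call sits inside the loop over slices, so it runs 30 times per exposure even though the per-pixel evaluation itself is cheap.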

This loop is what's slow; 75% or more of the map_nirspec_pixel_to_sky runtime seems to be spent just opening the relevant distortion model with nirspec.nrs_wcs_set_input for each of the 30 slices, rather than actually evaluating the model at any given location.

So, can we cut out the loop somehow, perhaps by changing the assign_wcs routine itself?  MIRI uses a regions reference file to assign each detector pixel to a given slice; this is read from a static reference file during assign_wcs, attached to the data, and used to map each pixel to the appropriate slice distortion model.  NIRSpec has no such static file, because the mapping from pixels to slices can change slightly from exposure to exposure, but it should be possible to generate one on the fly during the assign_wcs stage and save it for future use.
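A MIRI-style regions map generated on the fly would turn the pixel-to-slice question into a direct array lookup. A minimal toy of the idea follows; the layout and names are illustrative only and do not reflect the actual regions reference-file format:

```python
# Sketch of a per-pixel slice map: compute it once during assign_wcs,
# attach it to the data, then index into it instead of looping over slices.
# All names and the trivial geometry here are invented for illustration.

def build_slice_map(n_pixels=300, n_slices=30):
    # Assign each pixel to a slice. Here the assignment is trivial; in
    # reality it would come from evaluating each slice's footprint for
    # this particular exposure, since the mapping can shift slightly.
    width = n_pixels // n_slices
    return [x // width for x in range(n_pixels)]

slice_map = build_slice_map()

# With the map attached to the data, finding a pixel's slice is a single
# lookup instead of querying all 30 slice WCS bounding boxes.
print(slice_map[155])  # pixel 155 falls in slice 15
```

This mirrors what MIRI's static regions file provides; the difference for NIRSpec is only that the map would be computed per exposure rather than read from a fixed reference file.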

This would certainly address the general-purpose usability, though it's unclear to me offhand whether it would also improve runtime; that depends on whether it could speed up loading the distortion models.

Thoughts?  Nadia Dencheva Melanie Clarke Peter Zeidler Jane Morrison 

stscijgbot-jp commented 7 months ago

Comment by Melanie Clarke on JIRA:

If it can be done, I think this change would have definite benefits for users. There is currently no way to access the WCS for specific slices without calling the internal pipeline function to compute it on the fly. Providing the transform directly in the data product instead would be useful, both in the pipeline and outside of it.

stscijgbot-jp commented 7 months ago

Comment by David Law on JIRA:

See also https://jira.stsci.edu/browse/JP-930, which may no longer be necessary depending on how this works out.

stscijgbot-jp commented 7 months ago

Comment by Nadia Dencheva on JIRA:

It's possible to generate a pixel map for NIRSpec. However, the only optimization there is bypassing the generation of pixel coordinates; the loop over slices still exists, as it does for the MIRI IFU. There's another difference between the NIRSpec and MIRI IFU transforms: each NRS slice evaluates ~100 transforms, while for MIRI the number is a few.

That said, we should look into optimizing the NIRSpec WCS even if a pixel map is not the solution.

stscijgbot-jp commented 7 months ago

Comment by David Law on JIRA:

Nadia Dencheva However you think this is best done. The two things I'd be most interested in achieving are better runtime and the ability to call data models like the other instruments, without a user-implemented slice loop (model.meta.wcs.transform('detector', 'world', x, y)).

stscijgbot-jp commented 4 months ago

Comment by David Law on JIRA:

Nadia Dencheva Based on an offline conversation with Timothy Brandt, it may be particularly useful to look at https://github.com/spacetelescope/jwst/blob/master/jwst/assign_wcs/nirspec.py#L1618

Since so many parts of the spec2 pipeline (including cube building, but also many other steps) end up calling this assign_wcs module, it results in a vast number of calls to deepcopy, which dominate the overall spec2 runtime. Any thoughts on alternatives?
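One quick way to see this kind of deepcopy dominance is a cProfile run. The standalone sketch below simulates repeated deep copies of a nested structure (the structure and function names are invented, not the real WCS object):

```python
# Profile a function that repeatedly deep-copies a nested structure, the
# pattern described above. The simulated data is an invented stand-in.
import copy
import cProfile
import io
import pstats

def simulate_assign_wcs(n_slices=30):
    # Stand-in for per-slice WCS construction that deep-copies shared state.
    big = {"models": [list(range(1000)) for _ in range(50)]}
    return [copy.deepcopy(big) for _ in range(n_slices)]

prof = cProfile.Profile()
prof.enable()
simulate_assign_wcs()
prof.disable()

stream = io.StringIO()
pstats.Stats(prof, stream=stream).sort_stats("cumulative").print_stats(5)
# In the printed stats, copy.deepcopy and its helpers account for nearly
# all of the cumulative time, mirroring the situation described above.
print("deepcopy" in stream.getvalue())
```

In the real pipeline the same technique (profiling spec2 and sorting by cumulative time) is what surfaces the deepcopy calls at the nirspec.py line linked above.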

stscijgbot-jp commented 4 months ago

Comment by Timothy Brandt on JIRA:

Following David's comment, the wcs object is not pickleable (I tried), and a shallow copy is indeed insufficient.  Astropy wcs objects should, I think, contain copy() and deepcopy() methods, but the wcs object here does not.  A shallow copy takes almost no time, while a deep copy takes ~1 second per call on my laptop, which really adds up.  Perhaps the easiest fix is to add a copy() method (or similar) to this class that provides the minimum depth of copy needed for these NIRSpec tasks?
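A minimal sketch of that suggestion, assuming a hypothetical SliceWCS class (the class and its attributes are invented for illustration): implement a copy() that shallow-copies the object and deep-copies only the mutable state that must be independent, while large read-only data stays shared:

```python
# Sketch of a "minimum depth" copy method; SliceWCS and its attributes
# are hypothetical, standing in for the non-pickleable WCS object.
import copy

class SliceWCS:
    def __init__(self):
        self.transforms = list(range(100))            # mutable per-slice state
        self.reference_data = {"big": "shared table"}  # read-only, safe to share

    def copy(self):
        # Shallow-copy the object, then deep-copy only the mutable part;
        # the large reference data is shared between the copies.
        new = copy.copy(self)
        new.transforms = copy.deepcopy(self.transforms)
        return new

wcs = SliceWCS()
clone = wcs.copy()
clone.transforms[0] = -1

print(wcs.transforms[0])                           # 0: original untouched
print(clone.reference_data is wcs.reference_data)  # True: shared, not copied
```

Equivalently, defining __copy__ and __deepcopy__ on the class would let callers keep using the copy module while still getting this cheaper behavior.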

stscijgbot-jp commented 4 months ago

Comment by Timothy Brandt on JIRA:

David Law Nadia Dencheva Melanie Clarke I have a proposed fix here:

https://github.com/t-brandt/jwst/tree/nirspec_deepcopy_fix

It removes nearly all of the deep copying for IFU mode and much of it for MOS mode, I think.  There is still a deep copy in extract_2d that I am not yet confident I can get rid of, at least not easily, as it attaches a (deep copied) wcs object to each slit's spectrum.  I have not made a pull request; I intend to wait until after talking to David Law and Nadia Dencheva next Friday.  Melanie Clarke, it significantly improves the run time of your IFU NSClean notebook.

stscijgbot-jp commented 4 weeks ago

Comment by Melanie Clarke on JIRA:

Fixed by #8793

stscijgbot-jp commented 3 days ago

Comment by David Law on JIRA:

Non-trivial to disentangle these speed gains from other work in the same build, but the net effect on an example 4-pt dithered NIRSpec IFU cube looks like:

spec2 runtime decreases from 522 sec to 162 sec

spec3 runtime decreases from 248 sec to 115 sec

Total spec2+spec3 runtime drops from about 13 min to 5 min, while cubes are unchanged.  Closing!