spacetelescope / jwst

Python library for science observations from the James Webb Space Telescope
https://jwst-pipeline.readthedocs.io/en/latest/
Other
560 stars 167 forks source link

Make TweakReg accept astropy Table object for abs_refcat #8132

Open izkgao opened 9 months ago

izkgao commented 9 months ago

My current strategy for TweakReg involves processing images in batches instead of all at once. However, reading the large reference catalog (abs_ref) repeatedly in each loop is a major bottleneck. I am exploring alternative approaches, such as pre-loading the entire catalog and efficiently accessing it within the loop, but TweakReg does not directly accept astropy table objects as input. I am not familiar with the spec section in stpipe.step. I noticed that the data types that appear in the spec sections of many steps in this repository are basic types like integer, float, and string. Can it allow passing an astropy table object as input to a step?

jdavies-st commented 7 months ago

I agree that this would be very useful.

I also think that the JWST pipeline should eat its own dog food - TweakRegStep should accept the output of SourceCatalogStep as an input to abs_refcat. This is of course a QTable which has its RA, Dec bundled as a SkyCoord object. Currently when trying to pass such a table, we get:

  File "/home/jdavies/miniconda3/envs/jwst/lib/python3.11/site-packages/jwst/tweakreg/tweakreg_step.py", line 442, in process
    align_wcs(
  File "/home/jdavies/miniconda3/envs/jwst/lib/python3.11/site-packages/tweakwcs/imalign.py", line 602, in align_wcs
    raise KeyError("Reference catalogs *must* contain *both* 'RA' "
KeyError: "Reference catalogs *must* contain *both* 'RA' and 'DEC' columns."

The output of SourceCatalogStep has the relevant columns xcentroid, ycentroid, and sky_centroid where:

In [123]: result['sky_centroid']
Out[123]: 
<SkyCoord (ICRS): (ra, dec) in deg
    [(215.05262063, 52.99775802), (215.04071803, 52.98944   ),
     (215.02579846, 52.97885442), ..., (         nan,         nan),
     (         nan,         nan), (         nan,         nan)]>
jdavies-st commented 7 months ago

In order to accept an astropy.table.Table object, one would need to make a custom validator for spec, as we do for DataModel objects, i.e. string_or_datamodel. Or just don't validate it at all.

Currently, this is what needs to be done to convert the results of SourceCatalogStep into something that TweakRegStep won't complain about. Not the column names RA and DEC need to be uppercase.

from jwst.step import SourceCatalogStep
from astropy.table import Table
import numpy as np

def make_catalog(image):
    '''Build a source catalog from an image to use as abs_refcat in tweakreg'''
    result = SourceCatalogStep.call(image, npixels=10, deblend=True)

    # result is a QTable with SkyCoords; we need RA, DEC
    id = result['label']
    x = result['xcentroid']
    y = result['ycentroid']
    ra = result['sky_centroid'].ra.value
    dec = result['sky_centroid'].dec.value
    flux = result['aper_total_flux']
    # Cull sources that have NaNs in RA, Dec
    mask = ~np.isnan(ra)

    column_names = ['id', 'x', 'y', 'RA', 'DEC', 'flux']
    return Table([id[mask], x[mask], y[mask], ra[mask], dec[mask], flux[mask]],
                 names=column_names)

I will add that really all one needs for abs_refcat in the end is a SkyCoord array. Everything else is superfluous.