gbrammer / grizli

Grizli: The "Grism redshift and line" analysis software
MIT License

"NIRISS Fix" seemingly isn't propogated everywhere #191

Open TheSkyentist opened 9 months ago

TheSkyentist commented 9 months ago

The NIRISS fix in the visit_preprocessor has the intended effect of appending an "n" to the NIRISS filters, presumably with the intent of distinguishing them from the NIRCam filters which can share the same name.

https://github.com/gbrammer/grizli/blob/3b3f09da271800163443c0d09091a35a1aa6d81f/grizli/aws/visit_processor.py#L2188C1-L2196C67
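For reference, the effect of the fix can be sketched as follows (this is an illustrative stand-in, not grizli's actual implementation):

```python
# Hypothetical sketch of what the "NIRISS fix" does: append an "n" to
# NIRISS filter names so they cannot collide with NIRCam filters that
# share the same name.
def disambiguate_filter(instrument, filter_name):
    """Return a lowercase filter key, with 'n' appended for NIRISS."""
    key = filter_name.lower()
    if instrument.upper() == 'NIRISS' and not key.endswith('n'):
        key += 'n'  # e.g. 'F200W' -> 'f200wn'
    return key

print(disambiguate_filter('NIRISS', 'F200W'))  # f200wn
print(disambiguate_filter('NIRCAM', 'F200W'))  # f200w
```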

After parsing visits, preprocessing, updating exposure footprints, drizzling full mosaics, making filter combinations, and creating a multi-band catalog, I receive no errors and all outputs appear normal. However, all of the files created in the Prep directory use the original NIRISS filter names. These would presumably conflict if I were processing NIRCam data simultaneously.

This then causes a problem when I attempt to do the grism prep step. When the pipeline searches for the reference image:

https://github.com/gbrammer/grizli/blob/3b3f09da271800163443c0d09091a35a1aa6d81f/grizli/pipeline/auto_script.py#L3152C1-L3163C50

it searches for the filter name plus "n-clear", but no files with that name have been created.
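The mismatch can be illustrated like this (the field and file names are hypothetical):

```python
import glob

# Preprocessing wrote mosaics with the raw filter name:
written = 'myfield-f200w-clear_drc_sci.fits'

# ...but the grism prep step searches for the "n"-suffixed name:
field, filt = 'myfield', 'f200w'
pattern = f'{field}-{filt}n-clear_dr*_sci.fits'

matches = glob.glob(pattern)  # empty: no 'f200wn' mosaic was ever created
```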

As referenced in #182, the short-term correction has been to simply remove the "n-clear" addition from the grism prep step, but I believe a deeper fix is needed for programs with both NIRISS and NIRCam grism data.

Perhaps I am doing something wrong, or perhaps the visit_preprocessor is not running correctly on my data. I'm happy to discuss further and provide additional information.

gbrammer commented 9 months ago

What exactly do you mean by "all of the files created in the Prep directory use the original NIRISS filter names"? I think that will be true for the mosaics created during preprocessing from the individual visit exposure groups, but those are more for quicklook and probably shouldn't be what you use for the reference images.

I'd recommend remaking the mosaics with whatever combination of exposures / filters you want to consider together, with explicit drizzle parameters, using something like:

```python
# In a directory with a bunch of NIRISS `rate.fits` files,
# perhaps both imaging and spectroscopy
import glob

import astropy.wcs as pywcs

from grizli import utils
from grizli.aws import visit_processor

# Make a table with file information
files = glob.glob('*rate.fits')
files.sort()
res = visit_processor.res_query_from_local(files=files)

# Mosaic WCS that contains the exposures, but could come from somewhere else
hdu = utils.make_maximal_wcs(files=files, pixel_scale=0.04, pad=4, get_hdu=True)
wcs = pywcs.WCS(hdu.header)

# Make the mosaics, e.g., ``myfield-f200wn-clear_drc_sci.fits``
root = 'myfield'
_ = visit_processor.cutout_mosaic(root, res=res,
                                  kernel='square',
                                  half_optical=False,
                                  pixfrac=0.8,
                                  clean_flt=False,
                                  s3output=None,
                                  make_exptime_map=False,
                                  weight_type='jwst')
```

That will make mosaics in whatever filters are found with the filter names the grism modeling code expects, e.g., f200wn-clear.
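As a quick sanity check, one could list the science mosaics that were produced and confirm the filter component carries the trailing "n" (the field name here is illustrative):

```python
import glob

# With NIRISS data, the filter component of each mosaic name should carry
# the trailing 'n', e.g. myfield-f200wn-clear_drc_sci.fits
for sci in sorted(glob.glob('myfield-*-clear_dr?_sci.fits')):
    print(sci)
```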

gbrammer commented 9 months ago

@TheSkyentist, @lboogaard : I added a notebook at https://github.com/gbrammer/grizli-notebooks/blob/main/JWST/grizli-niriss-2023.ipynb to demonstrate the way that I currently prefer to process JWST data, through to extracting and fitting NIRISS spectra. It shows how to use the internal tools to make the necessary reference mosaics and catalogs, which have the expected file naming structure.

TheSkyentist commented 9 months ago

I will have to take a closer look at this, but this seems on the surface incredibly helpful! Thank you for taking the time to put this together.

On the surface, it appears that for NIRISS data one should use the aws.visit_processor functions rather than the pipeline.auto_script functions. I was using auto_script.make_combined_mosaics to create my drizzled reference images, but that path does not run visit_processor.cutout_mosaic, so the NIRISS filter names are never corrected.

Are there similar differences between auto_script.preprocess and visit_processor.process_visit?

I'll take some time to run through this in its entirety and report back.

lboogaard commented 9 months ago

Thanks for sharing this!

gbrammer commented 9 months ago

visit_processor is just a wrapper around pipeline.auto_script.go that uses the associations, with some extra functionality for automating the passing of non-default parameters and for running large batches of associations on AWS (there are something like 20,000 of them now). It simply stops after the preprocessing steps (and sends the products to AWS if run in that mode with the necessary permissions).

I don’t really use the full end-to-end auto_script pipeline that truly automatically does everything down to running the WFSS models and extractions, mostly because the visits/associations are broken into smaller pieces now for better parallelization. So you have to combine visits/associations of the same field a bit more manually, as I do in the new example notebook. But the database API examples in the notebook should give you an idea of how to programmatically figure out which associations could be combined.

In fact, the way I make the big field mosaics is exactly with the cutout_mosaic function, but where you don’t have to already have the exposures around as in the notebook. You just pass it a WCS and it figures out what exposures are available that touch that WCS, downloads them, and builds the mosaic from them with old-school drizzle. However, that currently requires access to the database and doesn’t yet work with the API.

TheSkyentist commented 9 months ago

Ah, so I think this brings us to the root of the issue. Since the visit_processor is a wrapper around auto_script, it applies certain changes that are not propagated if you use auto_script alone.

However, the problem is that these changes are still assumed even when only the auto_script functions are used, such as when the GroupFLT is loaded: https://github.com/gbrammer/grizli/blob/3b3f09da271800163443c0d09091a35a1aa6d81f/grizli/pipeline/auto_script.py#L3076

So if one uses auto_script to make the detection images for NIRISS data, they cannot then be used to build the grism models, as they carry the unmodified names rather than the names that would have been produced had the detection images been made with the visit_processor.

I'm not sure what the fix to this is, other than to include these changes in the default auto_script, with the relevant change being the demarcation of NIRISS filters.

TheSkyentist commented 9 months ago

Pinging this issue again with some ideas of how to move forward.

Currently there is a kwarg `fix_niriss` that controls whether or not the NIRISS fix is applied in `visit_processor`. However, regardless of what the user chooses, `auto_script` assumes they chose to fix the NIRISS keywords. There are therefore two options:

  1. Ensure that the NIRISS fix is applied everywhere within the `auto_script` module and make it mandatory rather than a user-specified choice. The most relevant location where it needs to be added would be in `make_combined_mosaics`. Are there any other places where this patch will need to be applied?
  2. Track the `fix_niriss` argument whether processing with `visit_processor` or `auto_script` and apply the fix accordingly.
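A minimal sketch of option 2, assuming a single flag threaded through both code paths (the function names and filename pattern are illustrative, not grizli's actual signatures):

```python
def filter_key(filter_name, fix_niriss=True):
    """Build the filter component used in mosaic filenames."""
    key = filter_name.lower()
    if fix_niriss:
        key += 'n'
    return key

def mosaic_name(root, filter_name, fix_niriss=True):
    # Writer side: name used when the mosaic is drizzled.
    return f'{root}-{filter_key(filter_name, fix_niriss)}-clear_drc_sci.fits'

def reference_name(root, filter_name, fix_niriss=True):
    # Reader side: name the grism prep step searches for.
    return f'{root}-{filter_key(filter_name, fix_niriss)}-clear_drc_sci.fits'

# As long as both sides see the same flag, the names agree:
assert mosaic_name('myfield', 'F200W') == reference_name('myfield', 'F200W')
```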

Are there any preferences on how to proceed?

gbrammer commented 9 months ago

I think the preferred fix would be to modernize the auto_script pipeline to use the new code to make the mosaics and combined images. That should solve the file-name issues and also ensure that the mosaics are created with the most recent bad pixel masks, WCS, drizzle parameters, etc.