spacetelescope / jwst

Python library for science observations from the James Webb Space Telescope
https://jwst-pipeline.readthedocs.io/en/latest/

Handling of NaNs in TSO photometry #8019

Closed stscijgbot-jp closed 1 month ago

stscijgbot-jp commented 10 months ago

Issue JP-3444 was created on JIRA by Sarah Kendrew:

When performing TSO photometry, NaNs are not masked. As a result, a single NaN in the apertures produces a NaN result in the output product in MAST. For a reasonable number of NaNs in a mean/median operation, as performed on the background pixels, the NaN pixels can safely be excluded from the computation without severe impact on accuracy. I would suggest that NaNs be masked.

Perhaps a comment or keyword can be added to the ecsv schema that flags if 1 or more NaNs were present in the background annulus, to signal to the user that they may want to look at the data more closely. 
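
As a rough illustration of the suggestion above (a sketch in plain NumPy, not the code in tso_photometry.py), the background level can be computed over the finite pixels only while also counting the NaNs, so that the count could later be reported to the user. The helper name `background_level` is hypothetical.

```python
import numpy as np

def background_level(annulus_pixels):
    """Return (median, n_nan) for a 1-D array of background annulus pixels.

    Hypothetical helper for illustration; not part of the jwst package.
    """
    pixels = np.asarray(annulus_pixels, dtype=float)
    n_nan = int(np.isnan(pixels).sum())
    if n_nan == pixels.size:
        # Nothing usable: report NaN rather than inventing a value.
        return np.nan, n_nan
    return float(np.nanmedian(pixels)), n_nan

pixels = np.array([1.0, 2.0, np.nan, 3.0, 4.0])
plain = np.median(pixels)                  # the single NaN propagates
masked, n_nan = background_level(pixels)   # NaN excluded, but counted
```

Here `plain` is NaN, while `masked` is the median of the four finite pixels and `n_nan` records that one pixel was dropped.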

This issue was raised following a user report in Helpdesk ticket INC0194218.

stscijgbot-jp commented 5 months ago

Comment by David Law on JIRA:

Adding a note that after https://jira.stsci.edu/browse/JP-3566 and https://jira.stsci.edu/browse/JP-3570 are finished, the issue reported here may be moot.  To be reconsidered after that.

stscijgbot-jp commented 4 months ago

Comment by Ian Wong on JIRA:

NaN handling issues persist in TSO photometry: NaN values in the photometric or background aperture lead to NaN values in the photometry file.

Recommended fix:

Caveat:

stscijgbot-jp commented 2 months ago

Comment by David Law on JIRA:

Looks like this should be a fairly straightforward fix, and I can confirm what Ian Wong wrote above. The implementation of photometric extraction in tso_photometry.py is too simplistic to handle NaNs, which is more obvious in the large background annulus. Sending over to Tyler Pauly.

stscijgbot-jp commented 2 months ago

Comment by David Law on JIRA:

Prompted by this I did a little digging into other science cases as well:

 

stscijgbot-jp commented 1 month ago

Comment by Jane Morrison on JIRA:

Sarah Kendrew, do you by chance have some data that shows the problem with the NaNs?

 

stscijgbot-jp commented 1 month ago

Comment by Jane Morrison on JIRA:

Melanie Clarke  Tyler Pauly I found the data associated with Helpdesk ticket INC0194218.

I am new to working with TSO data in the calwebb_tso3 pipeline. Unless I am accessing MAST incorrectly, I could not find a level 3 association for jw03384001001_03101_00001-seg001_mirimage_calints.fits.

So I just made up a calwebb_tso3 association containing jw03384001001_03101_00001-seg001_mirimage_calints.fits.

Running calwebb_tso3 on this association produced a jw03384001001_03101_00001-seg003_mirimage_phot.ecsv file that has NaNs in the last 8 columns, which I believe is the error this ticket is meant to fix.

I just wanted to make sure I have set up the processing correctly. I was not sure why there was no level 3 association.

 

 

stscijgbot-jp commented 1 month ago

Comment by Tyler Pauly on JIRA:

We chatted about this during standup, but just to keep a log of the discussion here: the tso3 association should have been present on MAST along with level 3 products, but they are missing. However, the pool file will build it, and it is available. It should contain the three segments present in the observation. I gave it a try and generated a white_light output containing entries from all segments, though I also see the columns full of NaNs throughout.

stscijgbot-jp commented 1 month ago

Comment by Jane Morrison on JIRA:

David Law Sarah Kendrew 

I was looking for similar data without this problem, to test whether my fix gives the same result when no NaNs are present. I searched MAST for MIRIMAGE, 256 x 256, and TSOVISIT=True, and I found only one other program: 3730. The photometry ecsv files for that data also had 8 columns of NaNs.

I just wanted to check if that seemed correct to you. The TSO photometry step has never worked for this type of data.

 

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Should also be PID 11711 for full-frame imaging I think, though hard to check with APT down.

stscijgbot-jp commented 1 month ago

Comment by Jane Morrison on JIRA:

David Law, I used the ApertureStats routine as Ian Wong suggested, and it fixes the NaN problem for the non-sub64p_wlp8 data.

I have not worked with TSO data much, so I am not sure if I need to add a fix for sub64p_wlp8 data. I replaced np.sum with np.nansum below. Is this desired, or should I remove it?

 

    if sub64p_wlp8:
        info = ('Photometry measured as the sum of all values in the '
                'subarray.  No background subtraction was performed.')

        for i in np.arange(nimg):
            aperture_sum.append(np.nansum(datamodel.data[i, :, :]))
            aperture_sum_err.append(
                np.sqrt(np.nansum(datamodel.err[i, :, :]**2)))
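
For reference, the same per-integration sum can be sketched in vectorized form on toy arrays. This is only an illustration of what np.nansum does here, with a hypothetical helper name (`subarray_sum`) and made-up data; it is not the pipeline code. One thing to keep in mind is that np.nansum returns 0.0 for a slice that is entirely NaN, so a fully bad integration would show up as zero flux rather than NaN.

```python
import numpy as np

def subarray_sum(data, err):
    """Sum every finite pixel of each integration (hypothetical helper).

    data, err: arrays of shape (nints, ny, nx).
    Returns per-integration sums and errors propagated in quadrature.
    """
    total = np.nansum(data, axis=(1, 2))
    total_err = np.sqrt(np.nansum(err**2, axis=(1, 2)))
    return total, total_err

# Toy cube: two integrations of 3x3 pixels, one NaN pixel in the second.
data = np.ones((2, 3, 3))
err = np.full((2, 3, 3), 0.5)
data[1, 0, 0] = np.nan
err[1, 0, 0] = np.nan

total, total_err = subarray_sum(data, err)
```

The second integration simply sums its eight finite pixels, so its flux and error are both lower than those of the first integration.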

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Jane Morrison, based on further conversations about this, I think we probably want to use the new nansum and ApertureStats approach that you've implemented for the background annulus region (as a small number of NaNs should have a very small effect on the final value), and keep the returned-NaN value if the offending pixel falls within the extraction aperture itself (where the impact on the final photometry could be quite large). Thoughts, Ian Wong?

stscijgbot-jp commented 1 month ago

Comment by Ian Wong on JIRA:

Yes, the nansum should be the right approach here to avoid NaNs in the outputs. 

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Ian Wong Right, but mask NaNs in background regions and allow them to NaN the final result in science apertures?  I'm trying to think what would be least confusing from a TSO user point of view if (say) the NaN pixel accounted for 5% of the total light in a given integration.

stscijgbot-jp commented 1 month ago

Comment by Ian Wong on JIRA:

I think it would be best to retrieve as many good photometry points as possible. So I'd make sure NaNs are masked in both the background and science apertures and use nansum for both. That way, the science flux and (science flux-bkg) are finite values (as much as possible). For bookkeeping, perhaps there can be a way to add a DQ flag that indicates that there were NaNs within the science aperture (background aperture NaNs shouldn't matter much), since they will affect the integrated flux.
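
The approach described above could look roughly like the following for a single integration. This is a NumPy-only sketch under several assumptions: whole-pixel circular masks instead of the ApertureStats routine mentioned earlier, a hypothetical function name (`tso_phot_sketch`), and a NaN count standing in for the proposed DQ flag. Scaling the background by the number of finite science pixels is a choice made for this example only.

```python
import numpy as np

def tso_phot_sketch(image, xc, yc, radius, r_in, r_out):
    """NaN-tolerant aperture photometry on one integration (illustrative only).

    Uses whole-pixel circular masks; hypothetical, not the pipeline's code.
    Returns (net_flux, n_nan_science).
    """
    yy, xx = np.indices(image.shape)
    r = np.hypot(xx - xc, yy - yc)
    science = r <= radius
    annulus = (r >= r_in) & (r <= r_out)

    # Count NaNs in the science aperture so they can be flagged downstream.
    n_nan_science = int(np.isnan(image[science]).sum())
    n_good = int(science.sum()) - n_nan_science

    aperture_sum = np.nansum(image[science])
    bkg_per_pixel = np.nanmean(image[annulus])
    # Subtract background only for pixels that contributed to the sum.
    net_flux = aperture_sum - bkg_per_pixel * n_good
    return float(net_flux), n_nan_science

# Toy image: flat background of 1.0 with a 100-count source at the center.
image = np.ones((11, 11))
image[5, 5] += 100.0
image[5, 6] = np.nan   # bad pixel inside the science aperture
image[5, 9] = np.nan   # bad pixel inside the background annulus

net, n_nan = tso_phot_sketch(image, xc=5, yc=5, radius=2, r_in=3, r_out=5)
```

Both NaNs are ignored in the sums, the net flux stays finite, and `n_nan` reports that one science-aperture pixel was excluded.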

 

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Ok, summarizing for future reference. Regular imaging source catalog photometry will return NaN if there is a NaN value in the science aperture, since a common failure case will be when the bright center of a star is entirely NaN. TSO photometry, however, will use nansum for the aperture, since it is more likely to be affected by random bad pixels in a single integration (and with the focus on a single source, more user attention will be drawn to any outlying points).

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Fixed by #8672