spacetelescope / jwst

Python library for science observations from the James Webb Space Telescope
https://jwst-pipeline.readthedocs.io/en/latest/
Other
571 stars 167 forks source link

NIRCam ERR measurements (from photutils) of same source are inconsistent depending on the pixel scale of the mosaic #8625

Open stscijgbot-jp opened 5 months ago

stscijgbot-jp commented 5 months ago

Issue JP-3680 was created on JIRA by Maria Pena-Guerrero:

A user reported via Help Desk ticket, that NIRCam ERR measurements of the same source are inconsistent depending on the pixel scale of the mosaic. This was confirmed by the NIRCam team and they suspect a bug in the resample step.

The notebook the user provided can be found in the Help desk ticket: https://stsci.service-now.com/nav_to.do?uri=%2Fincident.do%3Fsys_id%3D9b3c5903934b8e50619dfe4c5cba103f%26sysparm_domain%3Dnull%26sysparm_domain_scope%3Dnull%26sysparm_view%3DJWST%26sysparm_view_forced%3Dtrue

 

stscijgbot-jp commented 5 months ago

Comment by Bryan Hilbert on JIRA:

The values in the SCI extension of the mosaics were consistent across changing output pixel scale values. It is the values in the ERR array that vary with the output pixel scale.

stscijgbot-jp commented 5 months ago

Comment by David Law on JIRA:

Maria Pena-Guerrero : Just to confirm, this bug is present in the B11.0 (jwst 1.15.0) code, right?

stscijgbot-jp commented 5 months ago

Comment by Maria Pena-Guerrero on JIRA:

correct, present in B11.0

stscijgbot-jp commented 5 months ago

Comment by Bryan Hilbert on JIRA:

Anton Koekemoer has had a chance to take a more detailed look at this issue. He reports:

"I still confirm the 2x inconsistency in errors, I also confirm that our test from the other day reproduces the photutils approach -- but photutils may be part of the issue, here’s why:

So perhaps this is not a pipeline bug at all, but a photutils bug? Larry Bradley

stscijgbot-jp commented 5 months ago

Comment by Anton Koekemoer on JIRA:

To expand a bit on the above -- it all works fine, ie the pipeline ERR array and also the photutils measurements, when the mosaics are at the default pixel scale, so no issues there at all (ie this doesn't impact any of our default pipeline products).

But with a non-default pixel scale, the way in which the ERR array is defined (MJy/sr) means the interaction with photutils needs to be slightly adjusted to account for the relative change in pixel scale.

Attached here is the plot from the original helpdesk call (all public data), showing the 2x difference in the error values -- the photometry flux values are all consistent, just the error values are off by 2x, or by sqrt(relative pixel area)

!image-2024-07-05-11-53-01-534.png!

stscijgbot-jp commented 4 months ago

Comment by Larry Bradley on JIRA:

The very first thing the JWSTSourceCatalog step does when creating the source catalog is to convert the data and error array from MJy/sr to Jy. All of the photutils photometry calculations are done from data/error arrays in flux (Jy) units, so it's not a photutils bug.

The unit conversion from MJy/sr to Jy is performed in the source catalog step using the model.meta.photometry.pixelarea_steradians value of the input image. -Does the resample step update this value for non-native pixel scales? That could be the source of the bug.- Edit: that's not the issue. The issue is the input error array (see comment above).

stscijgbot-jp commented 4 months ago

Comment by Larry Bradley on JIRA:

I took a look at the images provided in the help desk ticket. The issue is definitely with the input data. The error arrays for 30 mas/pix and 60 mas/pix data are not equivalent. You can verify this by taking a single pixel in the 60 mas/pix image and comparing it with its corresponding 4 pixel (2x2 region) in the 30 mas/pix.

First convert the data and error arrays from surface brightness to flux density units (e.g., multiply by PIXAR_SR * 1e12 to convert to uJy).

Flux: Comparing the 60 mas/pix data value with the sum of the corresponding 4 pixels in 30 mas/pix data images agree.

Errors: The square root of the quadrature sum of the 4 pixels in the 30 mas/pix image is half the value of the same pixel in the 60mas/pix error image. They should give the same value. Therefore, one of the error arrays in the images provided by the user is wrong (scaled improperly).

stscijgbot-jp commented 4 months ago

Comment by Bryan Hilbert on JIRA:

Hmm. I'll ask the user for more details on how she created the image and error arrays. I've been assuming that they are directly from the stage 3 i2d files, but maybe she applied some additional scaling.

It looks to me like she's scaling everything correctly in her notebook, right?

#convert image and error arrays from MJy/sr to muJy pixel_area = hdr["PIXAR_SR"] #pixel area in steradian img_scaled = img * pixel_area * 1e12 err_scaled = err * pixel_area*1e12

ap_radius_pix = ap_radius / np.sqrt(hdr['PIXAR_A2']) aper = CircularAperture(centroid,ap_radius_pix)

#perform photometry phot_table=aperture_photometry(img_scaled,aper,error=err_scaled) flux = phot_table['aperture_sum'].value[0] flux_err = phot_table['aperture_sum_err'].value[0]

stscijgbot-jp commented 4 months ago

Comment by Anton Koekemoer on JIRA:

Thanks Larry, for looking into this. These images are just from i2d files produced by resample, with the pixel values for Flux density and Err being in units of surface brightness, MJy/sr – the question really is more about how to interpret these surface brightness units for the Err array when calculating the photometric uncertainty, as follows.

Eg in the 60mas mosaic, let's consider one pixel with Flux density = 0.2 MJy/sr and Err = 0.01 MJy/sr (just using rounded-off values here, for the sake of discussion, similar to the values in these images), just to have something concrete.

Then in the 30mas mosaic, if that 60mas pixel corresponds exactly to 4 output 30mas pixels, each of these 4 pixels in the 30mas mosaic would still have the same values, ie Flux density = 0.2 MJy/sr and Err = 0.01 MJy/sr – the surface brightness values remain invariant, regardless of the scale of the output pixels with which they are being sampled.

Now, for Flux density, as you describe, the summation works fine, ie total Flux density = sum of pixel values times the pixel area (regardless of mosaic pixel scale).

But for Err, this is basically what's at the heart of the issue -- as you describe, the photometric uncertainty is calculated by summing the Err values in quadrature, so it will be off by a factor sqrt(relative pixel area), or equivalently by sqrt(relative number of pixels), and this factor is what needs to be applied when calculating the total photometric uncertainty.

More specifically, for calculating photometric uncertainty when doing the quadrature sum for Err, it needs to be multiplied by a factor of 1/sqrt(output pixel area / native detector area), or equivalently multiplied by a factor of 1/pixel_scale_ratio. So in this case, actually the photometric uncertainty for all four mosaics would need to be adjusted (approximating here the native NIRCam pixel scale as 31.5mas for the F200W SW data, and 63mas for the F277W LW data) – the photometric uncertainties would need to be multiplied by the following factors in each case:

So multiplying the photometric uncertiainties by the above factors would bring them all into line (ie F200W 60mas uncertainty is reduced by 2x relative to F200W 30mas, and F277W 30mas uncertainty is increased by 2x relative to F200W 60mas).

All this is essentially just a consequence of the fact that, while Flux density and Err are both in surface brightness units, the summation is done in quadrature for photometric uncertainty.

Hope this helps!

stscijgbot-jp commented 4 months ago

Comment by Anton Koekemoer on JIRA:

thanks also Bryan for posting some more details -- since the user is already applying scaling, then one simple resolution to all this could just be to inform the user of the need to apply this additional scaling on the resulting photometric uncertainties (for all the mosaics).

Larry Bradley  I then have a question for you about photutils, since it seemed the user is already applying scaling – can photutils in fact be run directly on an i2d mosaic as-is (without externally scaling the images first), ie giving it an i2d mosaic with the pixel values still in surface brightness units, and have it directly calculate the corresponding photometry and uncertainties?

If that's not yet the case, then perhaps this could be a new capability to consider adding for photutils? – ie, the ability to run photutils directly on i2d.fits mosaics (still in surface brightness units) without needing to apply any scaling beforehand. In this mode, photuils could look at the datamodel to recognize that the images are in surface brightness units, and adjust its calculations accordingly.

If that could be done, then all the above scaling discussion could be incorporated into photutils, so that it seamlessly runs directly on an i2d.fits mosaic of arbitrary output pixel scale and produces photometry and uncertainties, without needing to apply any scaling factors before providing the mosaics to it – could this be an option for future work on photutils perhaps?

stscijgbot-jp commented 4 months ago

Comment by Larry Bradley on JIRA:

Bryan Hilbert The scaling she applied in the notebook is correct. The scaling she applied is to convert the data and error arrays from SB (MJy/sr) to flux density (uJy). That is the correct thing to do as photutils clearly states in the documentation that conversion of data from SB units should be performed by the user before calling photutils, e.g., aperture_photomety: "Note that this function returns the sum of the (weighted) input data values within the aperture. It does not convert data in surface brightness units to flux or counts. Conversion from surface-brightness units should be performed before using this function." That is also exactly what the JWST source catalog pipeline step does.

To be absolutely clear, the issue here is in the error arrays of her input images – one of them simply is not correct. If those error arrays were correct, then her notebook would give the ~same photometric errors. If her error arrays came from the pipeline, then there's a bug in resample step error calculation.

Anton Koekemoer While she could apply an additional scale factor to fix one of the error arrays (note either one could be wrong; I don't know which), that would just be a hacky workaround for the fact that one of these error arrays is wrong by a factor of 2 (or 0.5). They do not agree with each other.

Here's an example script using simple error propagation (sqrt of quadrature sum) to show that:

#!/usr/bin/env python

import numpy as np
from astropy.io import fits
from astropy.wcs import WCS

def get_data(filename):
    data = {}
    with fits.open(filename) as hdu:
        hdr = hdu[0].header
        data['hdr'] = hdr
        data['wcs'] = WCS(hdr)
        conv = hdr['PIXAR_SR'] * 1.e15  # MJy/sr -> nJy
        data['sci'] = hdu[0].data * conv
        data['err'] = hdu[1].data * conv
    return data

def compare_images(data30, data60, xc60, yc60):
    sc = data60['wcs'].pixel_to_world(xc60, yc60)
    xc30, yc30 = data30['wcs'].world_to_pixel(sc)
    yc30a = np.round((yc30 - 0.5, yc30 + 1.5)).astype(int)
    xc30a = np.round((xc30 - 0.5, xc30 + 1.5)).astype(int)

    # compute the flux of the identical region in both images
    flux60 = data60['sci'][yc60, xc60]
    flux30 = np.sum(data30['sci'][yc30a[0]:yc30a[1], xc30a[0]:xc30a[1]])
    print(f'Flux 60mas: {flux60:.3f} nJy')
    print(f'Flux 30mas: {flux30:.3f} nJy')
    print(f'Ratio: {flux60 / flux30:.3f}')

    # compute the flux error of the identical region in both images
    error60 = data60['err'][yc60, xc60]
    error30 = np.sqrt(np.sum(data30['err'][yc30a[0]:yc30a[1],
                                           xc30a[0]:xc30a[1]] ** 2))              
    print(f'\nError 60mas: {error60:.3f} nJy')
    print(f'Error 30mas: {error30:.3f} nJy')
    print(f'Ratio: {error60 / error30:.3f}\n')
    

fn30 = 'f200w_30mas_cutout.fits'
fn60 = 'f200w_60mas_cutout.fits'
data30 = get_data(fn30)
data60 = get_data(fn60)
# pick any non-zero pixel in the 60mas image
xc60, yc60 = 1607, 1470
compare_images(data30, data60, xc60, yc60)

 

The result is:

Flux 60mas: 43.993 nJy
Flux 30mas: 44.060 nJy
Ratio: 0.998

Error 60mas: 1.437 nJy
Error 30mas: 0.717 nJy
Ratio: 2.003

If both the input error arrays were correct, then the error ratio should be 1.  One of them is wrong.

Anton Koekemoer It has been requested before to support SB units directly in photutils ([https://github.com/astropy/photutils/issues/1403).]  Unfortunately, it's not so straightforward to do this in a general manner.  In any case, photutils would simply multiply the input data by the pixel area at the very beginning, so instead of adding new keywords to a bunch of functions in photutils to allow the user input a pixel area (or pixel area map), a more sensible approach is to require the user to do that themselves.  This is clearly stated in the documentation, e.g., https://photutils.readthedocs.io/en/latest/api/photutils.aperture.aperture_photometry.html#photutils.aperture.aperture_photometry].

stscijgbot-jp commented 4 months ago

Comment by Anton Koekemoer on JIRA:

Larry Bradley thanks for your reply, also thanks for confirming (and clarifying) that photutils expects data in units of flux density and not surface brightness (I'd somehow initially had the impression that it could, but it's now abundantly clear this is not the case).

But to clarify further, there is also not a problem with the Err arrays in the user's data, which come directly from the i2d output files produced by resample from stage 3. Let me try to add to what I described above – the resample step preserves surface brightness, by design, so the pixel values of the Err array are preserved (starting from the Stage 2 products), in the same way as the Flux density pixel values, regardless of the size of the output pixel with which they are being sampled.

And this basically is the issue, namely that Err arrays are defined in units of MJy/sr (in Stage 2 products already, before we get to Stage 3), and these values are preserved by Stage 3, regardless of the output pixel scale with which they are sampled. These definitions were evidently decided at the dawn of time with the pipeline's inception, and handed down to us. The mathematical consequence of this is the need to apply an additional scaling by the pixel scale ratio, as shown by your code snippet, also as shown by the user's script in the original helpdesk call, also by my notes above.

(as an exercise, if we take a pixel in a stage 2 output image, which has some Err value in units of MJy/sr, and multiply by the pixel area to convert that to MJy, then do the quadrature sum over a full steradian assuming uniform error values over this region, the resulting value will differ by sqrt(N) pixels, with N being the large number of pixels covering a full steradian. It's essentially just the mathematical consequence of how the units of the Err values are defined at stage 2).

 

So to summarize, let's agree there's no issue with photutils since it doesn't expect the data to be in surface brightness units (which I was initially uncertain about but you've now clarified for me, thanks); and let's also agree there's no issue with the Err arrays, given how they are defined, and given how their values are propagated. Instead, the need for this additional scaling is a direct consequence of preserving the surface brightness pixel values of the Err arrays starting from the Stage 2 products.

The solution, then, is for this additional scaling by the pixel scale ratio to be applied by the user, in order for the photometric uncertainties to be consistent, and in this particular example, four different scaling factors would be needed when calculating each of the photometric uncertainties, as I summarized above.

So the above would resolve this issue, and we should hopefully be able to get back to the user with this recommendation.

stscijgbot-jp commented 4 months ago

Comment by David Law on JIRA:

Anton Koekemoer I've been loosely following this, let me see if I understand your summary above.

Resampling takes input fluxes and errors from some pixel scale P1 and maps them onto a different grid with pixel scale P2.  Likewise, it could instead map them onto pixel scale P3 instead.  If we were looking at a perfectly uniform source with intrinsic surface brightness of 1 MJy/sr (and measurement  uncertainty 0.1 MJy/sr in each native pixel), then regardless of what P2 or P3 we mapped to we'd still have surface brightness values of 1 MJy/sr with uncertainty 0.1 MJy/sr.  Is that a correct description of what's going on now?

If so that seems undesirable; the resampled ERR should be lower in the case where P3 is larger than P2, because it's averaged over more input pixels.  To my mind this shouldn't require special attention from the user to use our recommended tools to do photometry on our pipeline data products.  Am I thinking about this accurately?

stscijgbot-jp commented 4 months ago

Comment by Anton Koekemoer on JIRA:

David Law thanks very much for following up. The above captures it, and is the consequence of the original JWST design choices of: (1) the units of MJy/sr, already present in the stage 2 products, and (2) the original design choice of resample to preserve surface brightness, for all the arrays. So when resample is run, it preserves the pixel values of both Flux density and Err, by definition (ie it doesn't scale these by pixel size); the output pixels are effectively providing sampling points of these values.

If our units were not surface brightness but instead just, eg, MJy (or any other non-surface brightness unit, as with all the HST instruments) then a scaling based on pixel size is exactly what happens, as you describe, ie the values of flux, error and variance in a given pixel are adjusted for pixel scale -- all the sums then also work (including those in quadrature) and no additional scaling is require by the user.

In this case however, when the user wants to run photutils, they're required to carry out their own scaling in any case, simply to convert their surface brightness units to MJy. Therefore, since they're already having to carry out their own scaling, then they should simply be adding this additional scaling factor for the Err arrays, and the readthedocs documentation should describe the need to do this (along with the mathematical foundation for why it's necessary).

An alternative solution would be for photutils to be able to recognize that the images are in MJy/sr units and carry out this conversion internally (which is actually the impression I had when we first thought that's where the issue might be, which I now realize is not the case), but this would be considered as additional development for photutils.

stscijgbot-jp commented 4 months ago

Comment by Larry Bradley on JIRA:

David Law I completely agree with you that our data products should not require special attention from the user (or from any photometry code).

Let me reiterate once again:

stscijgbot-jp commented 4 months ago

Comment by Anton Koekemoer on JIRA:

Larry Bradley thanks for your reply.

The error arrays in these images are (1) consistent with their definition in MJy/sr and (2) consistent with the definition of the behaviour of resample to preserve surface brightness. These decisions come from the pipeline's inception and were handed down to us. The images from the user in this helpdesk call were produced by the pipeline, as shown in their headers.

The different values is a mathematical consequence of the definition of error arrays in MJy/sr, starting with the stage 2 products -- as an exercise, take any pixel which has some Err value in MJy/sr, then multiply by pixel area to convert to MJy, then do a quadrature sum over the N pixels that would cover a full steradian (assuming they all have uniform values): the resulting error value in MJy for this steradian will be different from the original MJy/sr pixel value by a factor of sqrt(N). There's no way to avoid this consequence of the definition of Err in MJy/sr. Again, this definition was decided early on with the pipeline's inception and handed down to us.

If we'd like users not to have to apply any scaling factors to our pipeline image products in order to use photutils (since right now they still need to scale by pixel area anyway irrespective of the error array discussion), and if photutils won't change, then one outcome from this would be for the pipeline image products to no longer be in MJy/sr but instead, eg MJy or some other multiple such as nJy (which would then behave in the same way as HST images, ie can be given directly as input to photometry scripts –  I'm not taking a position here on this either way, but pointing it out as an outcome from the desire to run photutils on pipeline products without users needing to modify the products).

stscijgbot-jp commented 4 months ago

Comment by Anton Koekemoer on JIRA:

David Law  and Larry Bradley, I have a proposed path forward which should address all our concerns in the above discussion:

This will provide all the following benefits:

So, to summarize, the proposed approach simply involves adding a parameter "output_units" to the resample step, where the choices are "MJy/sr" which will continue to be used in the operational pipeline, or "nJy" (my suggested preference) which we would recommend users to set when running mosaics offline. I've also changed this ticket type from Bug to Improvement, to help with how we think about this, gong forward.

stscijgbot-jp commented 4 months ago

Comment by Larry Bradley on JIRA:

Anton Koekemoer Your reasoning appears to be based on the faulty assumption that (standard deviation) errors should be conserved in MJy/sr units during the drizzle step.  That is not correct.  Variance is the correct property to conserve.  The user's error arrays are simply not mathematically equivalent.  There is nothing magical about SB units that makes them correct.  The pipeline assumptions are not immutable and we are not powerless to change them.  They produce incorrect error arrays and we need to fix them.  Anything else is against the Principle of Least Astonishment for our users.  They do not care what assumptions were made in creating them.  They just want correct products.

stscijgbot-jp commented 4 months ago

Comment by Anton Koekemoer on JIRA:

Larry Bradley it's not a 'faulty assumption' but instead a consequence of: (1) the defined units of Err in MJy/sr and (2) the defined behaviour of resample to preserve surface brightness. This leads to the mathematical consequence described above.

If we produce images with differing Err pixel values according to the output pixel scale, while still insisting on presenting them in units of "MJy/sr", this becomes inconsistent.

It is all solved by enabling the outputs to be in MJy (or nJy), and I don't see why there is an objection to allowing this additional capability for the users when running mosaics offline.

I would like for David Law  to have a chance to weigh in on my proposed approach above.

stscijgbot-jp commented 4 months ago

Comment by David Law on JIRA:

I've had a chance to dig into this a bit now.

 I can confirm that if you take a NIRCam image of a perfectly uniform, infinitely extended source in which every pixel in the CAL file has flux=1 MJy/sr and err=0.1 MJy/sr, the drizzled data cubes also have flux=1 MJy/sr and err=0.1 MJy/sr regardless of what pixel size is chosen (using either exptime or ivm weighting).

Anton Koekemoer I think that's inconsistent with the original intended goal to preserve surface brightness.  In the limit that you drizzled to a single pixel as large as the entire NIRCam FOV, the uncertainty in the surface brightness at the location of that giant pixel is far less than in each individual regular pixel measurement.  Indeed, this is what we see in the 3d drizzle implementation- surface brightness remains constant with changing output pixel size, but the corresponding error in that surface brightness changes according to standard variance propagation.

In detail of course it's a little more complicated.  E.g., when drizzling to arbitrarily small pixels the variance probably doesn't really scale with the square root of the pixel area ratio, but this is an ill-defined regime factors of many below the native pixel scale.  Likewise, the covariance properties will be different for different output pixel scales, which will also have an impact on the errors in extracted photometry that doesn't account for covariance effects.

I think we'll need to dig into the resampling code to figure out the best way to address this though, after which we can look at any downstream implications.

stscijgbot-jp commented 4 months ago

Comment by Anton Koekemoer on JIRA:

David Law thanks very much for following up. Indeed, I might say that in the end we're all effectively arriving at the same place, but starting from different perspectives.

If the final outcome from this will be to adjust the Err array pixel values (in MJy/sr) by an additional amount corresponding to the pixel scale, then it needs to be clearly described as a change in design choice for the interpretation of the Err values (and not as a bug, since it is not) -- the interpretation is changing from the original one, where Flux density and Err (in units of MJy/sr) have pixel values that are invariant of the scale of the pixels with which they're sampled, to one where the Err (and Var) arrays will have pixel values that depend on the scale. This is fundamentally a change in design choice for the interpretation of these units, motivated by the way in which the pixel values are subsequently used.

I would still like to continue to advocate for the idea of adding a parameter "output_units" to the resample step, allowing the user to choose nJy (or MJy) as a unit when constructing mosaics, in which case the above behaviour of flux density, error and variance values as a function of pixel scale will occur naturally and will be fully consistent with their units. But perhaps this is better decoupled as a separate future enhancement.

stscijgbot-jp commented 4 months ago

Comment by David Law on JIRA:

Just chiming in to agree with Anton Koekemoer that we can think of this as a conceptual change in how we're choosing to understand what 'surface-brightness based uncertainties' means (which is a somewhat complex concept), rather than a simple mistake in the original implementation.

It's also worth noting that this is a sufficiently interesting topic that I'd like to get a bit more input through the JP and CAL frameworks before we move forward with this, although we can certainly look into how such a change could best be implemented and what some of the downstream implications might be.

Agreed that the suggested option for resampling to flux rather than surface brightness units is a separable topic that can be considered separately in future.

stscijgbot-jp commented 4 months ago

Comment by Anton Koekemoer on JIRA:

Thanks David Law  for your reply, and for charting a path forward.

Also, for Bryan Hilbert  to be able to get back to the user who originally filed the helpdesk ticket, he could explain that this is essentially a consequence of the conceptual definition of 'surface-brightness based uncertainties' which have somewhat complex properties, and that some changes are being explored for possible incorporation into the software in future, but in the meantime the scaling factors derived by the user can be applied to the uncertainties (as described by the user in more detail in the original ticket).

stscijgbot-jp commented 4 months ago

Comment by David Law on JIRA:

Melanie Clarke Flagging this so that you're aware of it.

stscijgbot-jp commented 4 months ago

Comment by Melanie Clarke on JIRA:

Thanks David Law, I've been following the conversation.  It sounds like there's no change recommended for now, but I think this may be another case where it would be useful to revise the core drizzle code to propagate variance arrays alongside science data, as we discussed on JP-3569.  Is there a ticket for that?

stscijgbot-jp commented 4 months ago

Comment by David Law on JIRA:

Melanie Clarke Right, no action for now as we'll need to discuss the path forward more broadly at a JP meeting first.  Since it may relate to JP-3569 and JP-3547 though I wanted to give you a heads up.

stscijgbot-jp commented 2 months ago

Comment by David Law on JIRA:

Adding a note for completeness that this ticket is currently blocked by https://jira.stsci.edu/browse/JP-3757