spacetelescope / jwst

Python library for science observations from the James Webb Space Telescope
https://jwst-pipeline.readthedocs.io/en/latest/

Signal values are systematically lower in a combined i2d image than in the individual input i2d images. #5131

Closed stscijgbot-jp closed 1 year ago

stscijgbot-jp commented 4 years ago

Issue JP-1570 was created on JIRA by Kevin Volk:

The simulated observation is a four-point dithered exposure of a star field in the F158M NIRISS filter.  There are 10 bright stars in the field, all well separated from each other.  I have done some work in the Mirage simulation to make the 10 stars all of the same brightness.  This required overriding the gain map in Mirage, because the gain map imposes structure on the image that is not removed by the pipeline flat-fielding step.

Comparison of the source catalogue values when the source catalogue step is applied to the individual images as opposed to the final combined image showed that the point source brightness values were systematically lower in the combined image, by a mean value of about 0.3 magnitudes.  I then went to the individual images, located the point sources with photutils DAOFind, and did box photometry on each source using a 21x21 pixel box for the source and a larger 41x41 pixel area surrounding the box to subtract the background signal.  This was done in the source_catalogue.py code.  There are small aperture corrections with this size of box, but these should be similar from image to image and are in any case smaller than the discrepancies that I am seeing.
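For reference, a minimal sketch of the kind of box photometry described above (this is not the actual source_catalogue.py code; the function and array names are placeholders, and DQ flags are deliberately ignored here):

```python
import numpy as np

def box_photometry(image, x, y, half_src=10, half_bkg=20):
    """Sum a (2*half_src+1)^2 box around (x, y) and subtract a local background
    estimated from the surrounding (2*half_bkg+1)^2 area (21x21 and 41x41 by default)."""
    x, y = int(round(x)), int(round(y))
    src = image[y - half_src:y + half_src + 1, x - half_src:x + half_src + 1]
    bkg = image[y - half_bkg:y + half_bkg + 1, x - half_bkg:x + half_bkg + 1]
    # Per-pixel background from the outer area, excluding the source box itself.
    bkg_per_pix = (bkg.sum() - src.sum()) / (bkg.size - src.size)
    return src.sum() - bkg_per_pix * src.size
```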

Once I assembled the signal values measured from source_catalogue.py into the file values.txt, I ran signal1.py to get the mean signal ratios of the combined i2d image compared to the individual i2d images.  The values found are as follows (columns: source number, mean ratio, and standard deviation of the ratios over the four input dithered i2d files):

| Source | Mean ratio | Sigma |
|--------|------------|-------|
| 1 | 0.732083 | 0.002836 |
| 2 | 0.781537 | 0.003484 |
| 3 | 0.779498 | 0.007277 |
| 4 | 0.734638 | 0.004164 |
| 5 | 0.759277 | 0.002933 |
| 6 | 0.863753 | 0.005906 |
| 7 | 0.728621 | 0.004699 |
| 8 | 0.728476 | 0.004859 |
| 9 | 0.732828 | 0.002886 |
| 10 | 0.719227 | 0.004364 |

Ratio mean = 0.755994 Ratio sigma = 0.041534

The ratio values vary somewhat, so it does not appear that a single constant scaling factor is involved.  The differences are factors of 10 or more larger than the image-to-image variations seen in the input i2d files.

Comparing the output magnitudes from the source catalogue back to the Mirage input catalogue, calculated in this way, shows that the source-to-source variation is larger after the resampling step than in the Mirage scene images or in the output rate images from the detector1 pipeline.  The range of the variations goes from 0.024 magnitudes to about 0.16 magnitudes, or from about 2.2% to about 16%.  The input source magnitudes to Mirage are identical.  It is not clear why the resample step in image2 causes this change, but that is separate from the comparison of the combined resampled image to the individual i2d images produced in the image2 phase.

I am wondering whether the factor of roughly 3/4 comes from combining 4 images with, say, one image rejected per pixel in the combination.  However, the input images are constructed with no bad pixels, so there should not be any such rejection.  Also, I would have thought that the surface brightness units would not be affected in this way.

 

stscijgbot-jp commented 1 year ago

Comment by Kevin Volk on JIRA:

I made measurements using photutils DAOFind/IRAFStarFind and these show the same type of change in the signal between the combined resampled image and the individual resampled images (I am subtracting the median from the images before making the measurements, in the hope of removing any background signal).  Nonetheless, I am wondering whether these measurements are being affected by the background offset to negative values that is seen in the combined images.  It is possible that once that issue is resolved (ticket JP-1416) this one might go away as well.  One would think that neither the box photometry nor these photutils measurements would be affected by a global constant offset of the image.  But it looks like the background signal is approximately equal to the "missing" signal measured by the different methods.
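For illustration, a minimal sketch of that kind of measurement (assuming the photutils DAOStarFinder interface; the file name and detection parameters are placeholders, not the values actually used):

```python
from astropy.io import fits
from astropy.stats import sigma_clipped_stats
from photutils.detection import DAOStarFinder

data = fits.getdata("combined_i2d.fits", ext=("SCI", 1))   # placeholder file name
_, median, std = sigma_clipped_stats(data, sigma=3.0)
finder = DAOStarFinder(fwhm=3.0, threshold=5.0 * std)      # illustrative parameters
sources = finder(data - median)                            # detect on the median-subtracted image
print(sources["xcentroid", "ycentroid", "flux"])
```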

stscijgbot-jp commented 1 year ago

Comment by Kevin Volk on JIRA:

The same type of issue is seen in pipeline version 7.6 where the background offset issue is resolved.  The offset does seem to change from one filter to another.

stscijgbot-jp commented 1 year ago

Comment by Alicia Canipe on JIRA:

Hi Kevin Volk, just triaging open issues to make sure we know what to focus on in the next builds. Looks like JP-1416 was fixed. When you have time, do you want to retest to see if that resolves the issue? If it doesn't, we can elevate the priority on this so it can be added to a sprint. 

stscijgbot-jp commented 1 year ago

Comment by Kevin Volk on JIRA:

Hi Alicia, when I tested this recently the issue was still there.  It is not a problem with the background.  If one just looks at the background pixels they have the proper surface brightness after resampling.  Only the discrete point sources show the discrepancy.  You can see the issue is still present in the plot I attached to JP-2111.

 

 

 

stscijgbot-jp commented 1 year ago

Comment by Kevin Volk on JIRA:

Testing in pipeline version 7.8rc4 shows this issue is still present, and if anything it is worse than before.  I attach an image that shows a point-source cut-out at each of the four dither positions in the individual resampled images (top two rows) and the same point source in the combined resampled image (lower left corner), on a linear display scale.  The combined point source has a lower signal than the individual input dither-position signals by a factor of more than 3.  The cut-outs are also shown in logarithmic scaling to show the details in the combined PSF.  All 10 sources in the simulation show the same type of issue in the PSF core of the combined resampled image.

 

[Attachment: cutout_source1.png]

[Attachment: cutout_source1_log.png]

stscijgbot-jp commented 1 year ago

Comment by James Davies [X] on JIRA:

It looks like outlier_detection is flagging the core of this point source. You can tell for sure by looking at the WHT and CON extensions. If you do the same measurement as above but run the pipeline with outlier_detection turned off, what's the result? Presumably the core won't be rejected.

And now that you have a baseline, what changes to the outlier_detection parameters need to be made to make it less aggressive toward point-source cores? Also, it looks like this PSF is pretty undersampled, exacerbating this well-known problem for this type of outlier detection.
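As a rough sketch of such a test (assuming the standard calwebb_image3 interface; the association file name is a placeholder), one could rerun stage 3 with the step skipped and then compare the combined i2d, WHT, and CON data against the default run:

```python
from jwst.pipeline import Image3Pipeline

# Re-run stage 3 with outlier_detection skipped to see whether the PSF cores recover.
Image3Pipeline.call(
    "niriss_f158m_asn.json",                      # placeholder association file
    steps={"outlier_detection": {"skip": True}},  # skip the step suspected of flagging PSF cores
    save_results=True,
)
```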

stscijgbot-jp commented 1 year ago

Comment by Kevin Volk on JIRA:

I will check this.  With only 4 input images I had not expected that the outlier detection would flag any pixels.

stscijgbot-jp commented 1 year ago

Comment by Mattia Libralato on JIRA:

Hi Kevin Volk , I noticed this ticket and I have a few comments/questions:

I did this same test with MIRI (F560W and F770W filters). 4 images per filter, many stars and galaxies. I compared photometry of each stage-2 image, each individual stage-3 image (made by running the pipeline multiple times) and the combined stage-3 image. I found a mag difference of 0.01-0.02 mag between the stage-2 image and the corresponding stage-3 image, but basically no difference (<0.01 mag) between individual stage-3 images and combined stage-3 image. I also compared the pipeline source catalogs in the latter case and found the same result.

stscijgbot-jp commented 1 year ago

Comment by Kevin Volk on JIRA:

There are no saturated pixels in these example imaging simulations.  The background is simulated as being uniform and I use a background aperture to remove it.  In any case, one can see the difference in signal using IRAF imexam to fit profiles to the CAL and I2D images so the issue is not my measuring code.  The issue seems specific to NIRISS although I have only guesses as to what the cause may be.

stscijgbot-jp commented 1 year ago

Comment by Mattia Libralato on JIRA:

Got it!

stscijgbot-jp commented 1 year ago

Comment by Stephanie La Massa on JIRA:

I'm resurrecting this ticket as we've done some testing on this recently, following Kevin's methods above (a simulation with 10 sources of equal brightness, using the same aperture photometry on images produced by the pipeline to extract signal levels and calculate magnitudes).

I did a test where I turned off the outlier_detection step of the pipeline and compared the signal levels and magnitudes of those sources with those where the outlier_detection step was turned on. Most sources do show a discrepancy at the 0.1 - 0.3 magnitude level, but a handful show no differences. For sure, there is work we should do to figure out how to optimize the outlier_detection step for NIRISS.

However, the larger cause of disagreement between the input image and the output resampled image is coming from the resample step of the pipeline. When I compare the photometry of sources against the rate image (which is close to the input Mirage seed image, though not perfect there), the agreement is better for the outlier-off image than for the outlier-on image. But the magnitudes of sources in the outlier-off image are offset by an almost constant 0.7 magnitudes compared with the sources in the rate image.

So while the outlier detection step is contributing to the discrepancy between the input and output magnitudes, it's not driving the discrepancy. Our testing so far has pointed to the resample step as having the largest effect.

I'm attaching our testing notebook for reference.

Edit: Jira won't allow me to upload my notebook, so here's the path to it:

/ifs/jwst/wit/niriss/lamassa/pipeline_test/jwst_validation_notebooks/jwst_validation_notebooks/image3/jwst_image3_niriss_test/jwst_image3_outlier.ipynb

stscijgbot-jp commented 1 year ago

Comment by Stephanie La Massa on JIRA:

I re-ran this notebook using v.1.3.3 of the pipeline and got the same results as reported above. The path to the testing notebook is the same.

stscijgbot-jp commented 1 year ago

Comment by Mihai Cara on JIRA:

I tried looking at the notebooks but I have trouble understanding the issue and the test.

First, the title of this issue seems to suggest NIRISS images but when I looked at the notebook, it seems to me most of the notebook deals with MIRI images. Should I be using MIRI images?

Second, I do not have access to /ifs/* and /user/volk/simple_imaging_example/ seems to no longer exist. Luckily I got a copy from September 2021 (don't remember why), but maybe the notebook was modified after that? This is true for Stephanie La Massa's notebook too.

Third, when you say that "The combined point source has a lower signal than the individual input dither position signals by a factor of more than 3", do you refer to the central (brightest) pixel intensity, or to the flux in a large aperture (with or without background subtraction)?

Finally, would it be possible to produce a very simple simulation without pointing errors and no background and no noise? If you do not add background you do not need to subtract it from apertures using outer rings around sources. Let's simplify the simulation as much as possible, please.

stscijgbot-jp commented 1 year ago

Comment by Kevin Volk on JIRA:

My notebook at least has not been modified for a long time.  I think Stephanie's one likewise has not been changed for a while.  If you are looking at MIRI simulations and results, then you are not looking at our notebooks.  I have never made a MIRI simulation or looked at any MIRI reductions in the pipeline.

All my measurements are using a fairly large square aperture for the source and a larger square cut-out for the background.  Since this is done uniformly in the different images (the distortion effects for NIRISS are rather small so that will not produce much change) I do not see how that could be an issue.

Making a noise-free simulation is not a trivial undertaking, since Mirage does not do that.  I guess I could paste the infinite-S/N scene images from Mirage into the _rate files and run from there, with some work to deal with signals above the practical limit for NIRISS and maybe other issues.  This will take some time.

stscijgbot-jp commented 1 year ago

Comment by Mihai Cara on JIRA:

The only notebooks are "nis-src-find.ipynb" and "daofind-issue.ipynb". Could you please at least restore the data in /user/... so that I can re-download it?

I do not think you should spend time on infinite S/N in Mirage ... I think I can just simulate a sharp PSF directly in a zero-background 'cal' image. For the purpose of confirming that there is an issue with drizzle-combined fluxes versus single-resampled images that should be enough. But before I go there, I would like to be able to reproduce your issue using exactly the same settings. So your notebook and configuration files would be helpful.

stscijgbot-jp commented 1 year ago

Comment by Stephanie La Massa on JIRA:

Hi Mihai Cara & Kevin Volk: I have a simplified version of Kevin's notebook that could be a helpful starting point for the troubleshooting you're thinking about. Would it be OK if I posted links and some details later tonight or by early tomorrow morning? (I'm tied up for the next couple of hours, but figured I'd chime in now to save Kevin some time in restoring data/tests, etc.)

stscijgbot-jp commented 1 year ago

Comment by Mihai Cara on JIRA:

Stephanie La Massa Yes, of course. Thank you!

stscijgbot-jp commented 1 year ago

Comment by Stephanie La Massa on JIRA:

I'm re-familiarizing myself with the Jupyter notebook I made (a modification of Kevin's notebook), which can be found here:

http://localhost:8888/notebooks/pipeline_test/jwst_validation_notebooks/jwst_validation_notebooks/image3/jwst_image3_niriss_test/jwst_image3_outlier.ipynb

The simulation is a simple one: an imaging scene with the F158M filter that has 10 sources of equal magnitude positioned across the detector. There are 10 groups, 1 integration, and a 4-point dither pattern using the NISRAPID readout. The simulated dataset does not have bad pixels, so that should simplify photometry testing.

The 4th cell in the Jupyter notebook has the paths to the simulations and reference files that were used to generate the simulations. Hopefully those should be accessible to you, but if not, please let us know!

The notebook runs the simulated uncal.fits data files through the pipeline. I process the data with the outlier step turned on, and with it turned off, in stage 3 of the pipeline. But in either case the signal issue reported in this ticket persists: when comparing the magnitude of the sources in the rate image (`_rate.fits`) and the dither-combined image (`_i2d.fits`), there is an offset, with the magnitudes in the dither-combined image being fainter (last 2 cells of the notebook, where I compare the source list from a simulation with the outlier step turned on vs. the rate image, and with the outlier step turned off vs. the rate image).

The photometry was measured outside of the pipeline, using a 20 pixel rectangular aperture centered on the source to extract the flux. We think this should be large enough not to be sensitive to flux lost to PSF sampling. And we used the same size aperture for all images (rate and resampled). We also did a check where we compared the rate image photometry with the seed image photometry (22nd executable cell) and found those to be largely consistent. So the loss of flux isn't happening at the uncal --> rate stage, but at the rate --> i2d stage.
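A minimal sketch of that kind of measurement with photutils (the file name and positions are placeholders; the notebook's actual implementation may differ):

```python
from astropy.io import fits
from photutils.aperture import RectangularAperture, aperture_photometry

data = fits.getdata("jw_combined_i2d.fits", ext=("SCI", 1))   # placeholder file name
positions = [(100.0, 200.0), (350.0, 420.0)]                  # placeholder source (x, y) positions
aper = RectangularAperture(positions, w=20, h=20)             # 20-pixel box, as in the test above
phot = aperture_photometry(data, aper)
print(phot["xcenter", "ycenter", "aperture_sum"])
```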

 

stscijgbot-jp commented 1 year ago

Comment by Stephanie La Massa on JIRA:

Oops, I copied the wrong link! My apologies. Here's where the notebook resides:

/ifs/jwst/wit/niriss/lamassa/pipeline_test/jwst_validation_notebooks/jwst_validation_notebooks/image3/jwst_image3_niriss_test/jwst_image3_outlier.ipynb

stscijgbot-jp commented 1 year ago

Comment by Paul Goudfrooij on JIRA:

Hi Stephanie La Massa - I think Mihai doesn't have access to /ifs/jwst/wit. Might be best to put a copy somewhere in your /user/slamassa area if that works for you? (Or some other place that doesn't require "wit" group membership.)

stscijgbot-jp commented 1 year ago

Comment by Mihai Cara on JIRA:

I do not have access to anything on /ifs/. I need all data and notebooks on either central storage or box.

```
mcara$ cd /ifs/
-bash: cd: /ifs/: No such file or directory
```

stscijgbot-jp commented 1 year ago

Comment by Stephanie La Massa on JIRA:

Sorry Mihai! I've copied the notebook, simulations, and reference files here:

/user/slamassa/image3_test

I think you would have access to that directory, but if not, I'll zip up the files and put them on Box for you. 

stscijgbot-jp commented 1 year ago

Comment by Mihai Cara on JIRA:

I have attached a notebook ('niriss-resampled-flux-issue.ipynb') with my own flux measurements and test set-up which shows that, in the end (last two printed tables), the tests give very good agreement between fluxes measured in the input images ('cal') and fluxes measured in the resampled images ('i2d'). The computed relative differences are below 0.1% for most stars (9 out of 10).

The notebook was run on Stephanie La Massa's data in /user/slamassa/image3_test. A limitation of this notebook is that it assumes all input images have equal EXPTIME.

My investigation shows that the discrepancies in your tests are due to the following issues with how you have performed tests (in order of importance):

1. Many pixels in your images are flagged in DQ as bad pixels. Many stars had their high-intensity pixels flagged. These pixels get thrown away by drizzle but not by your apertures: you add all pixels regardless of their DQ value. My code in box_photometry() correctly deals with DQ flags via weights. The weights themselves are important too; see point 3.

2. When resampling several images, those "bad" pixels in the input images do not appear as "holes" in the output image: they get filled in with values from other images that have valid pixels at that output location. This can result in the aperture photometry having higher values than the aperture measurements in the input images. You have to perform the aperture photometry on all input images and combine them using the same weighting scheme that was used for the resample step.

3. Drizzle adds all pixels using equal weights when weighting is set to 'exptime' and with variable weights (VAR_RNOISE) when using 'IVM' weighting. Even when 'exptime' weighting is used, output (resampled) pixels have different/variable weights due to DQ flags in the input images (flagged input pixels have zero weight). In order to perform a correct comparison, these weights should be taken into account when performing the photometry.

In addition, due to the way you are doing background subtraction, you get some variation in the background values. I think I perform a slightly more stable/accurate measurement by letting the resample step perform sky subtraction (measuring sky over the entire image using skymethod = 'local') and then manually subtracting the computed sky background from the input images and the "single resampled" (i.e., not combined) images. This procedure creates a level field for comparison between the different images.

The default setting for the sky computation ('match') was probably also introducing some differences that your procedure did not account for.

Fundamentally, your script does not adequately account for all the complexities that you put into your simulation. 

Outline of the script:

 

I did not find any issues with the resample step.
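To make the weighting idea concrete, here is a minimal sketch of DQ- and weight-aware box photometry along the lines of the points above (this is not the notebook's box_photometry() function; the array names and the scaling convention are assumptions):

```python
import numpy as np

def weighted_box_sum(data, weight, x, y, half_box=10):
    """Box sum around (x, y) using per-pixel weights.

    For a 'cal' image the weight can be (dq == 0) cast to float; for a
    resampled 'i2d' image it can be the WHT extension.  The weighted mean
    per pixel is scaled back to the nominal box area so that images with
    different numbers of rejected pixels can be compared directly."""
    x, y = int(round(x)), int(round(y))
    box = np.s_[y - half_box:y + half_box + 1, x - half_box:x + half_box + 1]
    d, w = data[box], weight[box]
    if w.sum() == 0:
        return np.nan
    return np.sum(d * w) / np.sum(w) * d.size
```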

stscijgbot-jp commented 1 year ago

Comment by Kevin Volk on JIRA:

You say that we need to use the correct weighting in the aperture photometry; however, there is no indication in the step documentation (as far as I have seen) that one needs to apply some weighting in the combined resampled image as compared to any of the original resampled images.  Any user is likely to assume, as I and other members of the NIRISS team did, that one can perform aperture photometry either directly on the individual resampled images from imaging level2 or on the combined resampled images from level3 and get values that match.  The point being: given the solution, this needs to be documented in the pipeline documentation and in JDox for people to be able to correctly derive fluxes from the resampled pipeline images.  We need not give a lot of detail in the pipeline documentation, but some indication of the need for this should be mentioned, and one can refer to the JDox page.

The background effect noted is minor and of course one can look at different ways to deal with the background in doing the aperture photometry.

 

stscijgbot-jp commented 1 year ago

Comment by Mihai Cara on JIRA:

There are several issues that you bring up. One is documentation. I think we should be providing enough information for users to understand how drizzle works, i.e., how it combines input images into one output image (see Hook & Fruchter, https://arxiv.org/abs/astro-ph/9808087), the possible choices for sky background computation, and what cosmic ray rejection does. Hook & Fruchter, plus another paper by Stefano Casertano that describes how IVM works, basically describe the fundamental ideas of drizzle. INS has created nice data handbooks and a drizzlepac manual for HST's equivalent, drizzlepac. I do not know who is responsible for telling users how to do their data analysis. Maybe scientists should create a guide book?

I did not imply that aperture photometry in the resampled image should be done using any kind of weighting. That is up to the user/scientist to decide based on their intended use case. IMO this is not necessary for simple things but if some researchers want to do something fancy they may or may not need to use weighting or noise smoothing or whatever they need for their own research. I do not think they need weighting in most cases (to the first order) but I may be wrong.

I have serious doubts that users will be comparing aperture photometry in one of the input images with aperture photometry performed on a combination of input images (unless all pixels are good). That was basically the very first problem with your setup. One cannot just take one image, add all pixels in an aperture without regard to DQ flags, and expect to get the same output from the resample step, which does take DQ into account; that was point #1 in my previous message. Now, if you do take DQ into account in the input images, you are in danger of getting a larger resampled flux than in the input image, because those bad (missing) pixels in one input image are replaced (filled in) with values of good pixels from other input images. For example, track the source with ID 1 in tables (B): for the first table (1b) the flux in the resampled image exceeds the flux in the input image by 28%, while it is below the input for the rest of the input images. When you combine all of them together, you will get an exact match between all input images and the final resample-combined image. That was point #2 in my previous message. The way to get a level field when comparing these kinds of images is to use weighting (my point #3), because the weight of one resampled pixel may be smaller than the weight of other pixels when its flux comes from only one input image and not from two or three or four. But I do not think regular users would ever do this. For example, if you perform aperture photometry on the input 'cal' images, why would one bother to do it on the resampled image too?

Alternatively, you could remove DQ from your input images, replace bad pixel values (negative pixels, cosmic rays, saturated pixels, etc.) with some good values, and then you can ignore my points #2 and #3, because all output pixels will have contributions from all input images with the same weight (if you use 'exptime' weighting).

My previous comment was describing what was wrong with the particular test performed in this particular ticket, not what users need to do. That is, I was explaining the discrepancy that you were getting: "Comparison of the source catalogue values when the step is applied to the individual images as opposed to the final combined image showed that the point source brightness values were systematically lower in the combined image." Basically: 1) in general, you cannot compare the flux from a single input image with the flux in a resampled image that is a combination of many input images; 2) you cannot ignore DQ flags in your input images when performing aperture photometry on the 'cal' images (because resample does not ignore them by default) if you want to compare the result with resample; and 3) if you do take care of 2), i.e., you do not add bad pixels in your input apertures, you then need to use weighting if you want to compare output with input. This is because resample combines several images, some of which contribute nothing to a given output pixel.

Let me finish by addressing this last point. Imagine just two identical images of the same source with zero background, zero noise, etc., but in the first image half of the pixels are flagged as bad (their values could be just fine, but DQ indicates we need to ignore them). If you were to add only the good pixels in the entire first image, do the same in the second image, and take their average, you might get any value from 1/2 of the true flux to the full true flux, depending on which pixels were flagged in the first image (the extremes being the wings of the PSF versus the peak). You could improve your result if you compute a weighted mean instead of the arithmetic mean, where the weight could be the number of good pixels in an image divided by the total number of pixels. But you could get even better results if you could replace "bad" pixels with values from other images. Resample essentially performs both. So comparing the flux in the first image with the flux in the resampled image is not straightforward.
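A tiny numeric sketch of this two-image example (made-up 1-D values, not real data):

```python
import numpy as np

# Two identical noise-free "images" of the same source; total true flux = 1.0.
img1 = np.array([0.05, 0.2, 0.5, 0.2, 0.05])
img2 = img1.copy()
good1 = np.array([True, True, False, True, True])   # peak pixel flagged bad in image 1
good2 = np.ones(5, dtype=bool)                      # all pixels good in image 2

# Sum only the good pixels in each image and take the plain average: biased low.
naive = 0.5 * (img1[good1].sum() + img2[good2].sum())                      # 0.75

# Weight each image by its fraction of good pixels: better, but still biased.
w1, w2 = good1.mean(), good2.mean()
weighted = (w1 * img1[good1].sum() + w2 * img2[good2].sum()) / (w1 + w2)   # ~0.78

# Fill the bad pixel of image 1 from image 2 (roughly what the combined
# drizzle does) and the full flux is recovered.
filled = np.where(good1, img1, img2).sum()                                 # 1.0
print(naive, weighted, filled)
```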

But a reference to Hook & Fruchter and Casertano's papers should be included in the docs.

stscijgbot-jp commented 1 year ago

Comment by Mihai Cara on JIRA:

> Any user is likely to assume, as I and other members of the NIRISS team did, that one can perform aperture photometry either directly on the individual resampled images from imaging level2 or on the combined resampled images from level3 and get values that match.

That is a correct assumption if your input images have no bad pixels AND if you average fluxes from all input images when you compare to the flux in the combined resampled image. Averaging is not needed, quite obviously, when comparing to individual resampled images; but no one should be doing aperture photometry on those anyway (this is where the weighting produced by drizzle may become important, especially if there are a lot of flagged pixels). But if your input images do have "bad pixels", do you just ignore the DQ flags in the input images (but not in the resampled-combined image, where they are described by weights; well, weights carry more info)? Do you replace bad pixel values with some good values (fix bad data) and clear DQ flags? You have to deal with this somehow.

Ignore DQ: To illustrate this, add the following lines to the end of the loop that runs the level 2 pipeline:

```python
im = ImageModel(simple_cal[k])
dq = im.dq > 0
im.dq[:, :] = 0
im.data[dq] = 0.3  # approximate value of the background, but can be anything > 0 (so log10 works for the mag calculation)
im.save(simple_cal[k])
```

and also set the weights to None in the two calls to `compare_matched_mags()` in the last cell:

```python
wht1 = None,  # resamp1_wht,
wht2 = None,  # cal_wht,
```

This should give you a very good agreement without any weighting if you average input fluxes.

Average Input Fluxes: You still need to average the fluxes from all input images (for a given star) because that is what resample basically will do (modulo fine details related to resampling to a different grid, which we can ignore here). Different input images have different fluxes due to setting "bad pixels" to 0.3 and the "random" simulated noise. Specifically, track the values of a given source (e.g., ID 1) in the 'cal' images (column "Flux2"). They are all different even though it is the same source! So how do you decide which input image to use when comparing one input image with a resampled image?

The above changes will show that weighting is not needed when comparing averaged fluxes from all input images with the resampled fluxes and all input pixels are valid.

Another test that you could do is PSF-based photometry; see https://photutils.readthedocs.io/en/stable/psf.html. In a very simplified way, this will fit a model PSF to your input pixels and effectively replace missing ("bad") pixels with model values. This will reduce differences in the "Flux2" values. Fitting PSFs to the input and output images should give close results.
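As a stand-in for full PSF photometry, a minimal sketch of fitting an analytic model to a cutout with astropy.modeling (a simple Gaussian is only illustrative here; the photutils PSF-photometry machinery referenced above would use a proper model for a real, undersampled NIRISS PSF):

```python
import numpy as np
from astropy.modeling import models, fitting

def fit_gaussian_flux(cutout):
    """Fit a 2-D Gaussian plus a constant background to a small cutout and
    return the integrated flux of the Gaussian (amplitude * 2*pi*sx*sy).
    Bad pixels could be down-weighted by passing weights= to the fitter."""
    yy, xx = np.mgrid[:cutout.shape[0], :cutout.shape[1]]
    init = (models.Gaussian2D(amplitude=cutout.max(),
                              x_mean=cutout.shape[1] / 2.0,
                              y_mean=cutout.shape[0] / 2.0,
                              x_stddev=1.5, y_stddev=1.5)
            + models.Const2D(0.0))
    fitted = fitting.LevMarLSQFitter()(init, xx, yy, cutout)
    g = fitted[0]
    return 2.0 * np.pi * g.amplitude.value * g.x_stddev.value * g.y_stddev.value
```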

 

stscijgbot-jp commented 1 year ago

Comment by Kevin Volk on JIRA:

As to why one would look at the photometry of the resampled images as well as the combined one: it is primarily to assess whether the uncertainties that come out of the combined image actually reflect the image-to-image variation seen in the photometry.  None of the individual images is "the truth", but I was expecting the combined photometry to be within the range of the four individual values, and it was consistently not in that range.  I need to run the notebook and see exactly what you are discussing, which I will do tomorrow.  Today I downloaded Stephanie's files to my laptop, which took a few hours.

One issue for me personally is that I do not even remotely understand how drizzle works.  I have made three attempts at various times over the years to look at the HST drizzle documentation and understand what is happening, and in each case I have failed to grasp the basic concept.  So it is a big black box for me, and this makes it difficult to deal with.  Mayhap if I understood drizzle I would not have made the error you are referring to.

stscijgbot-jp commented 1 year ago

Comment by Mihai Cara on JIRA:

I updated my notebook to include an option to erase DQ flags and turn off weighting when performing aperture photometry.

stscijgbot-jp commented 1 year ago

Comment by Kevin Volk on JIRA:

I tried to use the same algorithm as in the notebook to do box photometry with the weights, and I also had another version that sets values below the mean background of 0.305 to 0.305, with no weights.  In both cases I get values that are quite different from the notebook's, such as negative signal values for some of the sources in the resampled images.  This is when I use a +/-10 pixel box size.  Changing to +/-25 pixels improves things and the results are closer to the notebook values.  It is a bit perplexing that changing the box size makes such a big difference in the output.  The PSFs are small enough that +/-10 pixels should capture most of the signal and not affect the background much beyond that.  I have to look into this further.

stscijgbot-jp commented 1 year ago

Comment by Stephanie La Massa on JIRA:

Thank you Mihai Cara for your deep dive into this, expert feedback, and sharing with us your testing notebook. It's reassuring to see that setting up an appropriate test reveals that there are no issues with the resample step for the NIRISS imaging data. I think we can take your notebook as a starting point to build a JWST pipeline validation notebook where we systematically check simulations for each filter.

Taking a bigger-picture view, it seems clear that I can do a better job of working with the NIRISS team to leverage existing knowledge about drizzle and to provide feedback on our testing algorithms as we train up additional expertise within the team. I've certainly learned a lot from your detailed responses, and I look forward to playing with your testing notebook and transforming it into a validation notebook to streamline future testing.

(And as an aside, your suggestion that scientists should create a guide book to help users with their data analysis is an excellent one! Indeed, work is underway by the JDox team and JDAT groups to do just that.)

stscijgbot-jp commented 1 year ago

Comment by Mihai Cara on JIRA:

Sorry for my lengthy explanations, but your (or Kevin's) simulations are too complex, in the sense of enabling multiple factors that affect the aperture measurements in different ways, and I had to disentangle them to get good results that exonerate the resample step. So I wanted to explain all of this, but I am not talented at being succinct.

stscijgbot-jp commented 1 year ago

Comment by Mihai Cara on JIRA:

> ... such as negative signal values for some of the sources in the resampled images.  This is when I use a +/-10 pixel box size.  Changing to +/-25 pixels improves things and the results are closer to the notebook values.  It is a bit perplexing that changing the box size makes so big a difference in the output.

I am not sure about your exact setup in those notebooks, but I have noticed that your simulated files contain some pixels that are "very negative", and sometimes these clusters of negative pixels are close to some stars. These negative pixels are flagged in the DQ, so resample ignores them, but your box apertures did not. If those are not ignored, they could produce negative fluxes and result in a crash when computing magnitudes due to log10(signal). I suspect larger apertures have the following effects: 1) they add a lot of background signal, drowning out the variations due to flagged high-intensity pixels, and maybe also 2) they reduce edge effects from the boxy geometry of your apertures. I think a symmetrical aperture would be better.

With regard to point 1), what I mean is the following. Let's say the resampled photometry is the "truth" and the photometry in the input images has errors due to not throwing out or accounting for bad pixels. I'll call that flux small_ap = truth + err. The relative error is rel_err_sm = |small_ap - truth| / truth. A larger aperture will have the same flux as the small one plus extra background flux: large_ap = small_ap + bkg' in the input image, versus truth + bkg in the resampled image (bkg > 0, at least in the resampled image). Then rel_err_lg = |(truth + err + bkg') - (truth + bkg)| / (truth + bkg), which is likely smaller than rel_err_sm (this should hold if the outer "ring" of the aperture in the 'cal' image does not include very negative pixels). For example, with truth = 100, err = 10, and bkg ≈ bkg' = 400, the relative error drops from 10% to about 2%.

stscijgbot-jp commented 1 year ago

Comment by Kevin Volk on JIRA:

These pixels with large negative slopes were another issue that was fixed in the pipeline, but they were present in the original simulation.  Unfortunately, since I knew from my own ramp-slope code that no such values should be present in the _cal.fits files, I did not look for such cases at the time.  I did verify that no such large negative values are present in the _i2d.fits files from the test, so that is not the cause here.  I need to investigate it further.  Something in the background subtraction may be going wrong.

 

stscijgbot-jp commented 1 year ago

Comment by Mihai Cara on JIRA:

> no such large negative values are present in the _i2d.fits files

True. That is because resample does not "resample" bad pixels - those flagged in DQ. I was talking about input images - the 'cal' images. At least the ones shared by Stephanie La Massa – the ones I used in my tests – do contain negative values.

stscijgbot-jp commented 1 year ago

Comment by Mihai Cara on JIRA:

Updated the notebook to allow circular apertures – see parameter aperture_shape. No significant differences noticed compared to 'box' apertures.

stscijgbot-jp commented 1 year ago

Comment by Kevin Volk on JIRA:

The original issue was that the combined _i2d.fits file gave a systematically result in the photometry than what one gets from the individual _i2d.fits files.  I had noted more scatter in the _i2d.fits files than in the cal.fits files, but there was no systematic difference overall.  Why are the resample step and the source_catalog step outputs turned off in cell 10 where level3 is run?

stscijgbot-jp commented 1 year ago

Comment by Mihai Cara on JIRA:

Could you please clarify the last two sentences? (The first is likely missing "lower" in "systematically result".)

> I had noted more scatter in the _i2d.fits files than in the cal.fits files but there was no systematic difference overall.

Scatter in what? When comparing what to what? Also, for the second part, do you see systematic differences now but did not see them before? It could just be my English, as I am stuck on "was" being in the past tense.

> Why are the resample step and the source_catalog step outputs turned off in cell 10 where level3 is run?

Are you talking about my notebook? (Please download the latest file, as I updated it 2 days ago to allow circular apertures and the possibility of doing aperture photometry without weighting when DQ is not involved.) I am asking because locally (in the notebook in front of me) cell 10 contains the Image2 stage. I am not turning off anything there. In cell 11, which contains the Image3 stage, source_catalog is off because it has no relevance to aperture photometry. I do not need to run steps that I do not need for what I am testing. All that I wanted to check was that aperture photometry in the resampled image produces the same results as aperture photometry on the input 'cal' images (when performed correctly, by combining results from all input images and discarding DQ-flagged pixels). There is no way I could have turned off the resample step when testing resample.

stscijgbot-jp commented 1 year ago

Comment by Kevin Volk on JIRA:

The original comparison I did (badly, apparently) was first to compare the _cal.fits images to the _i2d.fits images individually, where I saw no overall difference in the values but where the scatter in the _i2d.fits files was larger than in the _cal.fits values (both were being compared to the original Mirage input seed images and the associated source catalogue).  Then I compared the combined _i2d.fits file photometry to the photometry from the individual resampled _i2d.fits files and saw a systematic discrepancy, with less signal in the combined _i2d.fits image than in any of the individual _i2d.fits images.  Hence I was wondering about that comparison using the proper way to calculate the photometry with the weights.  Cell 10 in the version I am looking at, where level 3 is run, has the following two lines:

```python
result_im3_no_out.tweakreg.skip = True
result_im3_no_out.source_catalog.skip = True
```

This, it seems, is now cell 11 in your version.  These lines skip the tweakreg step, which is what I should have said instead of referring to the resample step being turned off.  Maybe these have been turned on again in your current version.  I was wondering why no combined resampled image was being produced.  I will download the revised version of the notebook and see what has changed.  I did run these steps today and got the combined resampled _i2d.fits image out; now I want to measure it against the individual ones to see if the issue I had was due to not allowing properly for the bad pixels.

 

stscijgbot-jp commented 1 year ago

Comment by Mihai Cara on JIRA:

It would be very strange if the combined image were not produced. I will double-check this, but the test would crash if no combined image were created, I hope... I will look at the comparison between individual vs combined. One thing that might be going on: when resampling just one image, pixels that were bad in the input might be filled with good values from adjacent pixels that fall (partially) on 2-4 output pixels. So imagine one input pixel with a value of 1 and all other pixels set to 0; it might land on 4 output pixels, producing 4 output pixels of value 1, i.e., a total flux 4x the input. This is when weighting is more important.

I was not sure whether you had put pointing errors into your simulations, but even if you did, as long as the errors are not huge it does not matter with outlier rejection off. But really, if you are checking whether resample preserves fluxes, you should simplify your simulation at least enough to eliminate issues related to alignment or source finding, which would introduce other unknowns.

Ah, I re-read your message and in the end it seems like you did get the combined image. Tables "a" compare the individual i2d with the cal images (table 5 is the mean), and tables "b" compare the combined i2d with each input "cal" flux (table 6 is the mean of the inputs). So you could just compare Tables 5 & 6, or the "Flux1" columns in tables "a" and "b".

stscijgbot-jp commented 1 year ago

Comment by Kevin Volk on JIRA:

I originally expected the combined-image photometry to possibly have slightly higher signal than the individual images, for the reason you mention here about filling bad pixels with good values.  That may be the result now; I hope I can get this working today.

The simulation here does not have a pointing error in the individual images, I believe.

 

stscijgbot-jp commented 1 year ago

Comment by Mihai Cara on JIRA:

I updated the notebook to fix a bug in the Image2 stage which did not set the kernel for the resample (but it did for the Image3 stage). This could lead to different single and combined 'i2d' resampled images due to kernel differences (Image 2 was using default 'square' kernel to create 'i2d' images in my old version of the notebook).

stscijgbot-jp commented 1 year ago

Comment by Howard Bushouse on JIRA:

Wondering where folks want to go from here with this work. I think we've at least confirmed that there's nothing fundamentally wrong in the drizzling done by the resample step (not surprising, since drizzle has been in use for decades in many, many research projects). So perhaps any remaining work on this topic is simply along the lines of creating or updating documentation, in order to lay out more clearly for users how aperture photometry should be performed on resampled images. #6714 has already been created to add a few more references for the drizzle algorithm to the RTD docs for the resample step.

If folks are agreeable, I propose closing this ticket (assuming everyone agrees that the investigation phase is complete) and, if necessary, creating a new ticket to cover documentation.

stscijgbot-jp commented 1 year ago

Comment by Kevin Volk on JIRA:

I am not able to get reliable results that are not sensitive to the aperture size, even doing all the same things as Mihai by copying his code.  However, the ticket can be closed as far as I am concerned.  What with commissioning I am not going to be able to devote time to understanding why the results are sensitive to the aperture size even when using the weights and/or masking the bad pixels to zero. 

stscijgbot-jp commented 1 year ago

Comment by Mihai Cara on JIRA:

Kevin Volk Could you please elaborate on what kind of aperture sizes you are talking about? I re-ran my notebook using 5, 10, 15, 25, and 35 pixel radii, and I consistently find that the differences between the 'cal' and combined 'i2d' fluxes are very low, mostly < 0.1-0.2%, approaching 0.5-0.7% only at the smallest aperture (5 pixel radius) and only for two sources:

Radius = 5 pixels:

| id | x1 | y1 | Flux1 | Flux2 | Diff(%) | Diff(mag) |
|----|----|----|-------|-------|---------|-----------|
| 1 | 1843 | 326 | 12.982 | 13.04 | 0.451 | 0.004882 |
| 2 | 1103 | 556 | 14.594 | 14.632 | 0.259 | 0.002813 |
| 3 | 506 | 621 | 12.197 | 12.194 | 0.029 | -0.0003156 |
| 4 | 1518 | 792 | 14.387 | 14.41 | 0.158 | 0.001718 |
| 5 | 558 | 1008 | 13.851 | 13.86 | 0.062 | 0.0006776 |
| 6 | 1480 | 1009 | 13.898 | 13.896 | 0.014 | -0.0001538 |
| 7 | 944 | 1293 | 14.693 | 14.73 | 0.258 | 0.002797 |
| 8 | 548 | 1539 | 14.863 | 14.865 | 0.013 | 0.0001439 |
| 9 | 1732 | 1738 | 14.432 | 14.431 | 0.005 | -5.527e-05 |
| 10 | 426 | 1840 | 14.674 | 14.785 | 0.756 | 0.008177 |

Radius = 10 pixels:

| id | x1 | y1 | Flux1 | Flux2 | Diff(%) | Diff(mag) |
|----|----|----|-------|-------|---------|-----------|
| 1 | 1843 | 326 | 3.6882 | 3.692 | 0.103 | 0.001116 |
| 2 | 1103 | 556 | 4.118 | 4.1227 | 0.115 | 0.00125 |
| 3 | 506 | 621 | 3.41 | 3.4107 | 0.023 | 0.0002512 |
| 4 | 1518 | 792 | 4.0968 | 4.0963 | 0.013 | -0.0001437 |
| 5 | 558 | 1008 | 3.889 | 3.892 | 0.079 | 0.0008584 |
| 6 | 1480 | 1009 | 3.9637 | 3.9648 | 0.026 | 0.0002812 |
| 7 | 944 | 1293 | 4.175 | 4.1703 | 0.114 | -0.001235 |
| 8 | 548 | 1539 | 4.1647 | 4.1664 | 0.041 | 0.0004425 |
| 9 | 1732 | 1738 | 4.06 | 4.057 | 0.073 | -0.0007949 |
| 10 | 426 | 1840 | 4.1315 | 4.1445 | 0.314 | 0.003399 |

Radius = 15 pixels:

| id | x1 | y1 | Flux1 | Flux2 | Diff(%) | Diff(mag) |
|----|----|----|-------|-------|---------|-----------|
| 1 | 1843 | 326 | 1.6822 | 1.6834 | 0.071 | 0.0007732 |
| 2 | 1103 | 556 | 1.8798 | 1.8799 | 0.004 | 4.64e-05 |
| 3 | 506 | 621 | 1.5343 | 1.5354 | 0.071 | 0.0007698 |
| 4 | 1518 | 792 | 1.8784 | 1.8769 | 0.082 | -0.0008878 |
| 5 | 558 | 1008 | 1.7671 | 1.7673 | 0.011 | 0.0001245 |
| 6 | 1480 | 1009 | 1.8204 | 1.8193 | 0.061 | -0.0006626 |
| 7 | 944 | 1293 | 1.9018 | 1.9023 | 0.027 | 0.000297 |
| 8 | 548 | 1539 | 1.8964 | 1.8983 | 0.099 | 0.001076 |
| 9 | 1732 | 1738 | 1.8591 | 1.8585 | 0.036 | -0.0003873 |
| 10 | 426 | 1840 | 1.88 | 1.8838 | 0.201 | 0.002179 |

Radius = 25 pixels:

| id | x1 | y1 | Flux1 | Flux2 | Diff(%) | Diff(mag) |
|----|----|----|-------|-------|---------|-----------|
| 1 | 1843 | 326 | 0.62543 | 0.6259 | 0.075 | 0.0008123 |
| 2 | 1103 | 556 | 0.69606 | 0.69632 | 0.037 | 0.0004034 |
| 3 | 506 | 621 | 0.54621 | 0.5468 | 0.107 | 0.001156 |
| 4 | 1518 | 792 | 0.70299 | 0.70257 | 0.061 | -0.0006629 |
| 5 | 558 | 1008 | 0.65896 | 0.65924 | 0.043 | 0.0004633 |
| 6 | 1480 | 1009 | 0.68566 | 0.68513 | 0.076 | -0.000826 |
| 7 | 944 | 1293 | 0.70941 | 0.7098 | 0.054 | 0.0005906 |
| 8 | 548 | 1539 | 0.70317 | 0.70393 | 0.107 | 0.001166 |
| 9 | 1732 | 1738 | 0.68553 | 0.68541 | 0.017 | -0.0001849 |
| 10 | 426 | 1840 | 0.69511 | 0.69544 | 0.048 | 0.0005231 |

Radius = 35 pixels:

| id | x1 | y1 | Flux1 | Flux2 | Diff(%) | Diff(mag) |
|----|----|----|-------|-------|---------|-----------|
| 1 | 1843 | 326 | 0.32491 | 0.32503 | 0.037 | 0.0004043 |
| 2 | 1103 | 556 | 0.35823 | 0.35814 | 0.025 | -0.000274 |
| 3 | 506 | 621 | 0.26221 | 0.26264 | 0.166 | 0.0018 |
| 4 | 1518 | 792 | 0.36564 | 0.36541 | 0.061 | -0.0006677 |
| 5 | 558 | 1008 | 0.33783 | 0.33794 | 0.032 | 0.0003461 |
| 6 | 1480 | 1009 | 0.35717 | 0.35679 | 0.106 | -0.001156 |
| 7 | 944 | 1293 | 0.36707 | 0.36726 | 0.051 | 0.0005549 |
| 8 | 548 | 1539 | 0.36022 | 0.3605 | 0.079 | 0.0008575 |
| 9 | 1732 | 1738 | 0.35008 | 0.35003 | 0.014 | -0.0001478 |
| 10 | 426 | 1840 | 0.35714 | 0.35751 | 0.103 | 0.001117 |

 

Looking at the "Diff" column, I really do not see the "unreliable" results that you are claiming. At very small apertures the larger errors are probably due to edge effects and centering of the aperture (original image vs resampled).

Kevin Volk What kind of discrepancies and inconsistencies are you finding?

stscijgbot-jp commented 1 year ago

Comment by Kevin Volk on JIRA:

You are looking at the mean signal per pixel and doing comparisons.  What I was doing was looking at the total signal as a function of aperture, and there I was seeing variations that I think are due to exactly how the background is being treated and which pixels are in or out for different apertures.  So I am not doing the same type of comparison as you are.  I think it is a function of the background level, which changes a bit depending on which pixels are in or out of the background aperture, so it is probably not a drizzle issue.  But I was seeing cases where the total signal would change by 10% or so when changing the aperture size by 1 pixel, so I assume some individual pixel was going in or out, changing the background and hence the photometry.

stscijgbot-jp commented 1 year ago

Comment by Mihai Cara on JIRA:

Isn't "mean signal per pixel" * "number of pixels" == "total signal"  ?

Also, based on my test, and since you suspect that the variations you observe are due to 1 pixel getting in or out and thus changing the background, at the very least we can conclude that resample is not the root cause of the differences you observe in your comparisons.

CC: Howard Bushouse

stscijgbot-jp commented 1 year ago

Comment by Kevin Volk on JIRA:

Yes indeed, what I am seeing is not related to resampling, and hence the ticket can be closed.

 

stscijgbot-jp commented 1 year ago

Comment by Howard Bushouse on JIRA:

It appears that this ticket also got reopened by the github/Jira bot, for some unknown reason, on 14/Oct/2022. If anyone intentionally reopened it, please let me know, otherwise I'm going to close it again.

stscijgbot-jp commented 1 year ago

Comment by Rosa Diaz on JIRA:

Closing this issue because there was no confirmation that it is still a problem, and Howard mentioned that the bot had reopened it.