astropy / photutils

Astropy package for source detection and photometry. Maintainer: @larrybradley
https://photutils.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Initial support for PSFEx PSF #1913

Open navii98 opened 1 month ago

navii98 commented 1 month ago

Closes #1728

I've tried to implement support for the most straightforward case of a variable PSF model from PSFEx. PSFEx can model variation in any image/header parameter, but I've only implemented variation in x and y (X_IMAGE and Y_IMAGE from SExtractor).

This pull request is not complete, but I just wanted to get feedback on the early stages. I've based it on the current master, after the ImagePSF and GriddedPSFModel classes. The core functionality is in the four functions _calc_poly_coeffs, _calc_image_weights, _calc_model_values, and evaluate. I've tried to create a PSF model with this and perform photometry on an image whose PSF was modeled by PSFEx, and, to my surprise, it worked. Although this class should be able to handle the case of a constant PSF (variation degree 0), I believe that case would be better handled by the ImagePSF class; this can be done in the reader.

Some things are still left to do.

I also had some questions regarding the implementation of the PSF model classes in general:

  1. In the current implementation, every call to _calc_model_values:
     - calculates the polynomial coefficients from x and y,
     - multiplies them with a stack of images (see #1728 for details),
     - sums these weighted images, and
     - defines an interpolator from the summed image and passes (xi, yi) to it.

     In your experience, would it be more efficient to create an interpolator for every image, interpolate first on (xi, yi) using every interpolator (6 interpolators for vardeg=2: 1, x, x^2, y, xy, y^2), and then sum?

So the question is whether creating an interpolator instance is more expensive, or the interpolation itself.

  2. Why limit the oversampling to an integer?
  3. In the new ImagePSF class, it is mentioned that the PSF should be normalized so that the flux is 1. Is that necessary, in particular for the other classes? The sum of the PSF model from PSFEx is rarely 1.0.
  4. Does the choice of origin have any effect on the final PSF photometry, apart from the x_fit and y_fit parameters?
pep8speaks commented 1 month ago

Hello @navii98! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 468:1: W293 blank line contains whitespace
Line 476:9: E124 closing bracket does not match visual indentation

Comment last updated at 2024-10-06 15:09:27 UTC
larrybradley commented 1 month ago

@navii98 Thanks for the pull request. I've never used PSFEx.

So the question is if creating an interpolator instance is more expensive or if the interpolation itself is more expensive.

It depends on the size of the images, but generally it's more expensive to create the interpolator than to evaluate it. I recently changed the implementation of GriddedPSFModel to do exactly that and it's about 4x faster.
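The two evaluation orders discussed above give identical results because spline interpolation (with no smoothing) is linear in the image data: interpolating a weighted sum of images is the same as a weighted sum of the individual interpolations. A minimal sketch with SciPy (the arrays and coefficients are made up for illustration; this is not photutils or PSFEx code), showing why the per-basis-image interpolators can be built once and reused:

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline

rng = np.random.default_rng(0)
ny, nx = 25, 25
basis_images = rng.random((6, ny, nx))                  # 1, x, x^2, y, xy, y^2 terms
coeffs = np.array([1.0, 0.3, -0.1, 0.2, 0.05, -0.02])   # from the (x, y) position

yy, xx = np.arange(ny), np.arange(nx)
xi = np.array([3.3, 10.7, 18.2])
yi = np.array([4.1, 12.5, 20.9])

# Order A: sum the weighted images, then build one interpolator per call.
summed = np.tensordot(coeffs, basis_images, axes=1)
vals_a = RectBivariateSpline(yy, xx, summed)(yi, xi, grid=False)

# Order B: build one interpolator per basis image (reusable across calls),
# evaluate each at (xi, yi), then take the weighted sum of the results.
splines = [RectBivariateSpline(yy, xx, img) for img in basis_images]
vals_b = sum(c * s(yi, xi, grid=False) for c, s in zip(coeffs, splines))

# Identical by linearity of the interpolating spline in the image values.
assert np.allclose(vals_a, vals_b)
```

Order B pays the spline-construction cost once per model rather than once per call, which is the pattern the GriddedPSFModel change above exploits.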

Why limit the oversampling to an integer?

For image-based models, how would you create a non-integer oversampled image? Typically, an oversampled PSF model is evaluated by sampling every (integer) oversample pixels. I've never encountered non-integer oversampling factors. Do you have a particular use-case in mind?
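One way to see the convenience of an integer factor (a toy sketch, not photutils code): with integer oversampling the detector grid is an exact subset of the oversampled grid, so rebinning to detector sampling is pure block-summing, with no interpolation needed.

```python
import numpy as np

# A 5x5-detector-pixel model, oversampled by an integer factor of 4.
oversample = 4
fine = np.ones((5 * oversample, 5 * oversample))

# Rebin to detector sampling by summing each 4x4 block of fine pixels.
# This only works because the fine-grid size is divisible by the factor.
det = fine.reshape(5, oversample, 5, oversample).sum(axis=(1, 3))
assert det.shape == (5, 5)

# With a non-integer factor (say 4.3), 5 detector pixels would span 21.5
# fine samples: detector-pixel boundaries fall between fine-grid samples,
# so the model would have to be interpolated rather than strided/rebinned.
```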

In the new ImagePSF class, it is mentioned that the PSF should be normalized so that the flux is 1. Is it necessary, in particular for other classes? The sum of the PSF model from PSFEx is rarely 1.0.

I think you misunderstood. The docs state:

The array must be normalized so that the total flux of a source is 1.0. This means that the sum of the values in the input image PSF over an infinite grid is 1.0. In practice, the sum of the data values in the input image may be less than 1.0 if the input image only covers a finite region of the PSF. These correction factors can be estimated from the ensquared or encircled energy of the PSF based on the size of the input image.

The sum of the values in the PSF model can be less than 1, but that PSF should represent a total flux of 1 when a source is fit. In the PSF photometry tools, the flux parameter is simply a multiplicative scale factor applied to the model. Thus, to get sensible flux fit parameters, the input model should correspond to a total flux of 1 (but it does not have to sum to 1). They can be different, for example, if the input model does not cover the entire PSF (which extended to infinity would sum to 1).
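A small numerical illustration of the point above (a made-up truncated Gaussian, not a real PSFEx model): the cutout sums to slightly less than 1 because it covers a finite region, yet a least-squares scale fit still recovers the source's total flux, since the flux parameter just scales the model.

```python
import numpy as np

# A Gaussian PSF truncated to a finite 25x25 cutout. Normalized so the
# *infinite* sum is ~1; the cutout captures slightly less than that
# (the "ensquared energy" of the cutout).
y, x = np.mgrid[0:25, 0:25]
sigma = 3.0
psf = np.exp(-((x - 12)**2 + (y - 12)**2) / (2 * sigma**2))
psf /= 2 * np.pi * sigma**2
cutout_sum = psf.sum()            # < 1: finite cutout misses the tails

true_flux = 1000.0
data = true_flux * psf            # noiseless star built from the model

# The least-squares scale factor (what a flux fit effectively computes)
# recovers the total flux even though the model itself sums to < 1.
flux_fit = (data * psf).sum() / (psf * psf).sum()
assert np.isclose(flux_fit, true_flux)
```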

Does the choice of origin have any effect in the final PSF photometry? except the x_fit and y_fit parameters?

Yes, the origin corresponds to the coordinates within the image that will be placed at the model's x_0 and y_0 coordinates. Typically the model image should have its peak at the center of the image, so the origin coordinates should be at the center. If the origin were shifted, the fitted (x, y) positions would not be what you expect.

If you are having trouble with the linter and style checks, I recommend that you install pre-commit. See https://docs.astropy.org/en/latest/development/development_details.html#pre-commit for details.

Since this PR is still in progress, I'm going to convert it to draft status.

navii98 commented 1 month ago

I had written the basic information about the output PSF from PSFEx in the related issue.

It depends on the size of the images, but generally it's more expensive to create the interpolator than to evaluate it. I recently changed the implementation of GriddedPSFModel to do exactly that and it's about 4x faster.

Oh, that's interesting. I just assumed it would be the other way around. But I'll try to edit the code so that the arithmetic is done on the interpolated values and see if there are any performance improvements.

For image-based models, how would you create a non-integer oversampled image? Typically, an oversampled PSF model is evaluated by sampling every (integer) oversample pixels. I've never encountered non-integer oversampling factors. Do you have a particular use-case in mind?

PSFEx, with its default configuration, samples the PSF at intervals of FWHM/4. Consequently, an appropriate sampling step for the PSF model would be 1/4 of the PSF FWHM. This is done automatically in PSFEx when the PSF_SAMPLING configuration parameter is set to 0 (the default). The PSF sampling step may also be adjusted manually (in units of image pixels) by setting PSF_SAMPLING to a non-zero value. However, it can be defined as an integer as well, which is what I've required in my _validate_data function.
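A quick numeric illustration of why the default PSFEx sampling gives a non-integer factor (the FWHM value here is made up, not from the PR):

```python
# With PSF_SAMPLING = 0 (the default), PSFEx samples the model at FWHM/4
# image pixels, so the oversampling factor is non-integer in general.
fwhm = 3.1                        # stellar FWHM in image pixels (example)
sampling_step = fwhm / 4.0        # model-pixel size, in image pixels
oversampling = 1.0 / sampling_step
print(round(oversampling, 3))     # 4 / 3.1, roughly 1.29: not an integer
```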

The sum of the values in the PSF model can be less than 1, but that PSF should represent a total flux of 1 when a source is fit. In the PSF photometry tools, the flux parameter is simply a multiplicative scale factor applied to the model. Thus, to get sensible flux fit parameters, the input model should correspond to a total flux of 1 (but it does not have to sum to 1). They can be different, for example, if the input model does not cover the entire PSF (which extended to infinity would sum to 1).

Thanks for the clarification. I understand the rationale behind having the PSF normalized to a flux of 1.0. But what about the case where the sum of the PSF is > 1 (e.g., In [17]: psf_data.sum() Out[17]: 1.0731393)? I could not find out why that is, because the PSFEx documentation is sparse. But it shouldn't cause any issues for photometry, right? The final flux_fit will just be a multiple of the actual flux?
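For a model that sums to more than 1, the fitted flux is simply scaled down by that sum. A toy sketch (random array forced to sum to ~1.073 for illustration; not a real PSFEx model):

```python
import numpy as np

# If the PSF model sums to s rather than 1, the fitted flux comes out
# as (true counts) / s, since flux is a pure multiplicative scale.
rng = np.random.default_rng(1)
psf = rng.random((25, 25))
psf *= 1.0731 / psf.sum()              # force sum(psf) = 1.0731
true_counts = 5000.0
data = true_counts * psf / psf.sum()   # star with 5000 total counts

flux_fit = (data * psf).sum() / (psf * psf).sum()  # least-squares scale
assert np.isclose(flux_fit, true_counts / psf.sum())
```

So photometry still works, but the fitted fluxes carry a constant factor of 1/s unless the model is renormalized first.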

If the origin was shifted, the fit (x, y) positions will not be what you expect.

Okay. I asked about the case where the PSF has an even number of pixels. But a half-pixel shift out of 25 shouldn't cause that much of an issue.
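The even-sized case above comes down to where the geometric center of an axis falls (a generic illustration, not photutils code):

```python
# The center coordinate of an n-pixel axis is (n - 1) / 2: an integer
# (on a pixel) for odd n, a half-pixel (between pixels) for even n.
# Ignoring this for an even-sized PSF image shifts the fitted positions
# by half a model pixel.
for n in (25, 24):
    print(n, (n - 1) / 2)   # 25 -> 12.0 (on a pixel), 24 -> 11.5 (between)
```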

Since this PR is still in progress, I'm going to convert it to draft status.

Okay. I expected to finish it soon, but I got very busy last week. Anyway, I'll mark it for review when I finish it on my side.