astrorama / SourceXtractorPlusPlus

SourceXtractor++, the next generation SExtractor
https://astrorama.github.io/SourceXtractorPlusPlus/
GNU Lesser General Public License v3.0

Problems applying SE++ to a VIS coadd image #469

Open mkuemmel opened 2 years ago

mkuemmel commented 2 years ago

When fitting a bulge+disk model to VIS coadded data, the comparison between auto_mag (== TU mags) and the fitted mags looks like this: [image: VIS_fitting]. The PSF is a bit naive (2.0 pix Gaussian). I tried a lot but could not significantly reduce the offset(s), and I wonder whether the known deficiencies in the data and the setup can explain the large offsets.

The data is here: https://deepdip.iap.fr/#folder/624e9eeb28cafb12e8553f0c

mkuemmel commented 2 years ago

I am getting kind of desperate about this issue. In the last weeks I really tried a lot:

In the end I am using the configuration files from the morphology challenges on both the Euclid coadd and the Morphology Challenge (MC) image, and get this comparison with auto_mag, which is comparable to the TU mag at this level: [image: fitting_problem]

It works on the MC image but not on the Euclid coadd. In the red cloud in the upper right the fitting basically gets zero flux, hence the humongous offset. Even if this cloud were not there, the red points would still be bad.

WillHartley commented 2 years ago

[image: Screenshot from 2022-06-03 20-55-19]

Top left: residual. Top right: model. Bottom left: science image.

From the residual it looks like the model fitting is not completely breaking down, the negative pixel value in the centre is around a quarter of the flux of the original image - so the model is 25% too high in the centre - not too unusual I guess. But that model-subtracted flux doesn't appear at all in the model checkimage. (the colour bars are matched across the frames).

marcschefer commented 2 years ago

I did some tests last Friday. The model fitting is clearly happening, so it's not like something is completely wrong. Overall the model check image looks mostly correct by eye, just a bit off in flux.

Could it be a PSF related problem? Maybe the variable PSF is not being correctly applied?

mkuemmel commented 2 years ago

As discussed in the telecon, I ran the dataset with 3 different PSF models (Gaussian, psfex, coadded PSF). The offsets are similar in all three cases.

In another test I used a constant RMS image with the background RMS value. Did not help.

mkuemmel commented 2 years ago

I did a fairly complete parameter study and changed, from a baseline solution, all conceivable parameters that could have an effect on the fitting (can provide a protocol if desired).

The only change that affected the photometry offset significantly was setting set_modified_chi_squared_scale to 0.01 (see #487). The offset range then changes from [0, 15] mag to [-2, 2] mag, which is not good either.

mkuemmel commented 2 years ago

I made a small script to derive the reduced chi-square value from the output images (segmentation image, residual image, rms image). When I compare the reduced chi-2 values from the SE++ fitting with the derived reduced chi-2 values I get this: [image: chisquare_comparison]. For the Morphology Challenge data both chi-2 estimates are in the right range. I do not expect a close correlation, but the numbers are in the right ballpark. For the problematic dataset the reduced chi-2 values from the fit are way smaller than expected (~0.2), while the derived values are about a factor 10 larger. Looking at the residual image, the derived chi-2 values of 2 and larger are more realistic than the tabulated values <<1, which would indicate overfitting.

In the end, I can't understand why the fitting ends up with such small chi-2 values, and I cannot reproduce them even qualitatively.
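The script itself was not posted in the thread. A minimal sketch of how such a per-source reduced chi-2 could be derived from the three images is given below. It assumes that the segmentation image labels background pixels with 0, that all three arrays have the same shape, and that the number of fitted parameters is passed in by hand; the function name and the `n_params` argument are hypothetical. This is the plain chi-2, so it need not agree with the value the fit reports when a modified chi-2 scale is in use.

```python
import numpy as np


def reduced_chi2_per_source(segmentation, residual, rms, n_params=0):
    """Return {source_id: reduced chi-square} computed from pixel maps.

    segmentation : integer array, 0 is assumed to mark background pixels
    residual     : data minus model, same shape as segmentation
    rms          : per-pixel noise RMS, same shape as segmentation
    n_params     : number of fitted parameters per source (hypothetical input)
    """
    segmentation = np.asarray(segmentation)
    residual = np.asarray(residual, dtype=float)
    rms = np.asarray(rms, dtype=float)

    result = {}
    for source_id in np.unique(segmentation):
        if source_id == 0:
            continue  # skip the background label
        # Use only pixels of this source that have a valid, positive RMS.
        mask = (segmentation == source_id) & np.isfinite(rms) & (rms > 0)
        dof = int(mask.sum()) - n_params
        if dof <= 0:
            continue  # not enough pixels to define a reduced chi-square
        chi2 = np.sum((residual[mask] / rms[mask]) ** 2)
        result[int(source_id)] = float(chi2 / dof)
    return result
```

With arrays read from the segmentation, residual and RMS check images, the returned dictionary can be matched against the catalogue by source ID.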

mkuemmel commented 2 years ago

It turns out that the fits are kind of reasonable when the chi-2 scale is changed to set_modified_chi_squared_scale(0.003): [image]. With this setting a disk+bulge fit works reasonably well with a Gaussian PSF, a PSFEX file and the coadded PSF. What is not clear is:

It could be that the missing Poisson noise in the RMS or the correlated noise (coadded data) is responsible for the strange behaviour.

mkuemmel commented 2 years ago

As discussed yesterday, I added Poisson noise to the RMS and ran SE++ with the PSFEX model (with the default mod_scale). The results are a bit better, but not really good: [image: noise_comparison]. The fitting also works when setting set_modified_chi_squared_scale(0.003).

The Poisson noise was added with: poisson_noise = rms_data+numpy.sqrt(numpy.fabs(img_data)*exp_time)/exp_time

mkuemmel commented 2 years ago

Lastly, I checked whether providing a gain value helps. I did that by explicitly giving --detection-image-gain 3.5 and --weight-type background in both the ASCII and Python configuration files: [image: fit_gain]. That did not help either.

WillHartley commented 2 years ago

It's really persistent isn't it.... Is there some way to translate the chi-sq scale param to an effective gain?

Depending on how the coadd was constructed, you might want to set the gain to

G = gain * total_exp_time

because the counts are scaled to 1s of exposure in many coadd images (or something like that, Emmanuel can correct me).

mkuemmel commented 2 years ago

Happy to check that out if I know exactly what and how.

The image indeed has an effective exptime of 1s.

There are 4 exposures of 565s each, which would mean: G = 3.5 * 4 * 565. = 7910.0

Would that be correct (that's quite a high value, similar to inf.=0.0)?
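For reference, the arithmetic above can be written out as a short sketch. The variable names are illustrative only; the numbers are the ones quoted in this comment.

```python
# Values quoted in the thread.
detector_gain = 3.5     # gain of the individual calibrated exposures
n_exposures = 4         # number of exposures in the coadd
exposure_time = 565.0   # seconds per exposure

# The coadd is scaled to an effective exposure time of 1 s, so the
# effective gain is the detector gain times the total exposure time.
total_exposure_time = n_exposures * exposure_time
effective_gain = detector_gain * total_exposure_time

print(effective_gain)  # 7910.0
```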

mkuemmel commented 2 years ago

Running with the very large gain value is so far the only reduction that gives a reasonable result for the photometry without the modified chi-2 scaling: [image: fit_gain7910]

Actually the result seems to be even better than the ones above with the modified chi-2 scale.

WillHartley commented 2 years ago

This seems like the answer. I don't recall exactly what Emmanuel said on the call, but I think the essence of it was that the chi-2 scale was doing the same thing as the gain would do. Maybe if you set the chi-2 scale to 1/7910 = 1.26e-4 you'd get a similar result?

WillHartley commented 2 years ago

Emmanuel should probably check this, but I think what you want is,

poisson_noise = numpy.sqrt(numpy.fabs(img_data)*exp_time*gain)/(exp_time*gain)

total_noise = numpy.sqrt(poisson_noise**2 + background_rms**2)

ebertin commented 2 years ago

If gain is the effective gain (not the original instrumental gain) and img_data the pixel value with background subtracted, then an estimate of the standard deviation of the total noise would simply be total_noise_rms = numpy.sqrt(numpy.fabs(img_data) / gain + background_rms**2)
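A minimal sketch of this estimate as a NumPy function, assuming the image is already background-subtracted and that the gain passed in is the effective gain. The function and argument names are illustrative, not part of SE++.

```python
import numpy as np


def total_noise_rms(img_data, background_rms, effective_gain):
    """Per-pixel total noise RMS: Poisson term plus background term.

    img_data       : background-subtracted pixel values
    background_rms : background noise RMS (scalar or array like img_data)
    effective_gain : effective gain of the (rescaled) image, must be > 0
    """
    img = np.asarray(img_data, dtype=float)
    bkg = np.asarray(background_rms, dtype=float)
    # Poisson variance of the source signal is |img| / effective_gain;
    # it adds in quadrature to the background variance.
    return np.sqrt(np.fabs(img) / effective_gain + bkg ** 2)
```

Note the `**` operator: in Python and NumPy, `^` is bitwise XOR, not exponentiation.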

mkuemmel commented 2 years ago

With this formula:

total_noise_rms = numpy.sqrt(numpy.fabs(img_data) / gain + background_rms**2)

the noise gets very large and the 'old' detection parameters are kind of obsolete.

Shouldn't there be the exposure time in that equation?

WillHartley commented 2 years ago

Emmanuel's formula is a simplification of the one I wrote, where he's using the effective gain,

(effective) gain = gain * exposure time

So it is in there implicitly. Does that clear it up, or were you thinking it should appear somewhere else?
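Written out, the two expressions agree. In the short derivation below the symbols are chosen for illustration: I is the background-subtracted pixel value, t the total exposure time, g the detector gain, G = g t the effective gain, and σ_bkg the background RMS.

```latex
\begin{align}
\sigma_\mathrm{Poisson}
  &= \frac{\sqrt{|I|\, t\, g}}{t\, g}
   = \sqrt{\frac{|I|}{t\, g}}
   = \sqrt{\frac{|I|}{G}}, \\
\sigma_\mathrm{total}
  &= \sqrt{\sigma_\mathrm{Poisson}^2 + \sigma_\mathrm{bkg}^2}
   = \sqrt{\frac{|I|}{G} + \sigma_\mathrm{bkg}^2}.
\end{align}
```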

ebertin commented 2 years ago

No, exposure time is out of the equation because the GAIN in the FITS header is assumed to be the effective gain, not the original detector gain before data rescaling. Other software might interpret the GAIN keyword differently, but this is how the original SExtractor did it. Note that exposure time might not be the only contributor to changes in the effective gain (for instance if the data producer wants the results in other units, e.g., eV), so the most convenient and safest way, I think, is to assume that GAIN is the effective gain.

mkuemmel commented 2 years ago

I was using the above formula(s) with: (effective) gain = gain * exposure time = 3.48 * 4 * 565. The gain comes from the calibrated images and the exposure time is clear. This leads to: [image: VIS_poisson_comp]

That's very similar to the other results above obtained by modifying the chi-square scale. I also ran with variations of the modified gain (a factor 2 in both directions). That changes little.