transientskp / pyse

Python Source Extractor
BSD 2-Clause "Simplified" License

Map the RMS over the full image #67

Closed AntoniaR closed 2 weeks ago

AntoniaR commented 3 months ago

There is an issue with false positive detections around the source extraction radius. The RMS grid method underestimates the RMS near the extraction radius because the grid covers only the extraction region. The false positives vanish when the source extraction radius is expanded.

The solution is either to map the RMS over the full image and then extract sources only within the extraction radius or, if that is inefficient, to measure the RMS over a radius larger than the extraction radius (e.g. larger by at least 1 grid box).
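The first option above can be sketched roughly as follows. This is a minimal NumPy illustration, not PySE's actual implementation: the helper names `rms_grid` and `outside_radius`, the 64x64 noise image, the 16-pixel box, the nearest-neighbour upsampling and the 5-sigma threshold are all assumptions made for the example.

```python
import numpy as np


def rms_grid(image, box):
    """Standard deviation in each box x box cell, over the whole image.

    Hypothetical helper: trims the image so it divides evenly into cells.
    """
    ny_cells, nx_cells = image.shape[0] // box, image.shape[1] // box
    trimmed = image[:ny_cells * box, :nx_cells * box]
    cells = trimmed.reshape(ny_cells, box, nx_cells, box)
    return cells.std(axis=(1, 3))


def outside_radius(shape, radius):
    """Boolean mask, True for pixels farther than `radius` from the centre."""
    ny, nx = shape
    y, x = np.ogrid[:ny, :nx]
    cy, cx = (ny - 1) / 2, (nx - 1) / 2
    return (y - cy) ** 2 + (x - cx) ** 2 > radius ** 2


rng = np.random.default_rng(0)
image = rng.normal(0.0, 1.0, size=(64, 64))  # pure-noise test image
box = 16

# Step 1: measure the RMS on a grid covering the full image, mask ignored.
grid = rms_grid(image, box)

# Step 2: expand the grid to pixel resolution (nearest neighbour here;
# a real extractor would interpolate between grid nodes).
rms_map = np.repeat(np.repeat(grid, box, axis=0), box, axis=1)

# Step 3: apply the extraction radius only when selecting candidate pixels.
mask = outside_radius(image.shape, radius=20)
candidates = (image > 5.0 * rms_map) & ~mask
```

The point of the ordering is that grid cells straddling the extraction radius are still filled with real pixel values, so the noise estimate near the edge is not based on a partially empty cell.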

This is related to the TraP issue https://github.com/transientskp/tkp/issues/381

HannoSpreeuw commented 3 months ago

I doubt that this problem still occurs with the current master branch. To reproduce it, I would need an example image together with the source extraction settings.

The reason is that the ImageData class provides for a margin and/or a radius. These can be provided from the command line. The combination of margin and radius is turned into a mask.

That mask is not applied when the grid of background mode and background standard deviation (rms) values is computed, which is how it should be. This line shows that unmasked image data are used to determine the background grid:

useful_data = da.from_array(self.data[useful_chunk[0]].data, chunks=(self.back_size_x, y_dim))

The use of the .data attribute at least suggests the use of unmasked data.

The image mask is applied when the grid values are interpolated.
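The behaviour of `.data` on a NumPy masked array can be checked in isolation. The snippet below is a small sketch with a made-up 4x4 array and mask, not PySE code; it only shows that `.data` returns the underlying values regardless of the mask, while methods called on the masked array itself honour the mask.

```python
import numpy as np

# Made-up 4x4 image; the mask hides everything except the central 2x2 block.
image = np.arange(16, dtype=float).reshape(4, 4)
mask = np.ones((4, 4), dtype=bool)
mask[1:3, 1:3] = False  # False means the pixel is kept
masked_image = np.ma.array(image, mask=mask)

# .data exposes the underlying array, so all 16 pixels contribute.
raw_std = masked_image.data.std()

# Statistics on the masked array use only the 4 unmasked pixels.
masked_std = masked_image.std()

# Slicing first and then taking .data, as in the line quoted above,
# also yields unmasked values.
row_slice = masked_image[1:3].data
```

With this particular array the two standard deviations differ, which is the distinction that matters for the background grid: statistics taken through `.data` do not see the extraction-radius mask.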

In summary, looking at the current code from the master branch, the calculation of the background mode and rms grid values should not be compromised by any mask, such as the mask from the source extraction radius. So I do not yet understand where underestimates of the background noise caused by a small extraction radius would come from.

But this is only what I can infer from staring at the code; there may very well be an oversight in my reasoning. So any image with parameter settings that reproduces this would be helpful.

AntoniaR commented 3 months ago

I know this used to be a problem but it may indeed be fixed. I will try to find and create a test dataset to check if this is still an issue. Thanks!