Reproducibility of the segmentation maps and hence the detected objects

astrorama / SourceXtractorPlusPlus

SourceXtractor++, the next generation SExtractor

https://astrorama.github.io/SourceXtractorPlusPlus/

GNU Lesser General Public License v3.0


Reproducibility of the segmentation maps and hence the detected objects #255

Closed mkuemmel closed 4 years ago

mkuemmel commented 4 years ago

On subsequent runs of SourceXtractor++ the segmentation map is not identical: some pixels change from 'not detected' to 'detected' and vice versa.

The segmap is derived from the thresholded image, which is defined as: thres_image = det_image - sqrt(var_image)*threshold_value

Here det_image is the convolved, background-subtracted image and var_image is the convolved variance image.

Pixels with thres_image>0.0 are detected and get into the segmaps.
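The definition above can be sketched in a few lines of C++. This is only an illustration of the formula, not the SourceXtractor++ code: the function names `thresholded_image` and `detection_mask` and the flat `std::vector<double>` image layout are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the SourceXtractor++ implementation):
// computes thres = det - sqrt(var) * threshold for every pixel.
std::vector<double> thresholded_image(const std::vector<double>& det_image,
                                      const std::vector<double>& var_image,
                                      double threshold_value) {
  assert(det_image.size() == var_image.size());
  std::vector<double> thres(det_image.size());
  for (std::size_t i = 0; i < det_image.size(); ++i) {
    thres[i] = det_image[i] - std::sqrt(var_image[i]) * threshold_value;
  }
  return thres;
}

// A pixel counts as detected when its thresholded value is strictly positive.
std::vector<bool> detection_mask(const std::vector<double>& thres_image) {
  std::vector<bool> mask(thres_image.size());
  for (std::size_t i = 0; i < thres_image.size(); ++i) {
    mask[i] = thres_image[i] > 0.0;
  }
  return mask;
}
```

Because the test is a strict comparison against 0.0, a pixel whose thresholded value sits within a rounding error of zero can switch between detected and not detected when the convolved inputs differ only in the last bit.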

The segmaps are consistent with the thresholded image, hence the problem is already present in the thresholded image.

the background is not an issue since the example was run with background_value=0.0
the effect is quite rare, ~1/10^7 pixels in my test setup
the difference image of two thresholded images shows the imprint of the tile-size:

The sharp lines reflect the borders of the tiles.
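One way to check the tile imprint numerically rather than by eye is to list the tiles that contain at least one differing pixel. The sketch below assumes two thresholded images stored row by row in flat vectors and square tiles; the function name `differing_tiles` and this layout are hypothetical and not taken from SourceXtractor++.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical helper: returns the (tile_x, tile_y) indices of every tile
// that holds at least one pixel differing between the two runs.
std::set<std::pair<std::size_t, std::size_t>> differing_tiles(
    const std::vector<double>& run_a, const std::vector<double>& run_b,
    std::size_t width, std::size_t height, std::size_t tile_size) {
  assert(tile_size > 0);
  assert(run_a.size() == width * height);
  assert(run_b.size() == width * height);
  std::set<std::pair<std::size_t, std::size_t>> tiles;
  for (std::size_t y = 0; y < height; ++y) {
    for (std::size_t x = 0; x < width; ++x) {
      // Exact comparison on purpose: the question is bitwise reproducibility.
      if (run_a[y * width + x] != run_b[y * width + x]) {
        tiles.emplace(x / tile_size, y / tile_size);
      }
    }
  }
  return tiles;
}
```

If the differences cluster in a few tiles and stop sharply at tile borders, that points to a per-tile processing step as the source.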

ayllon commented 4 years ago

I can't test right now because my CPU is on fire running nnpz, but I have a hypothesis: the convolution. If the segmentation filter is > 5 pixels in width, a fast Fourier transform is used.

We use fftw_plan_many_dft with the flag FFTW_MEASURE. This might introduce some non-deterministic behavior if different plans are picked between runs.

Convolution is done tile per tile (extending into the neighboring tiles to fill the padding), so that would explain the imprint.
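Why a different plan could change the result at all: different plans may carry out the same arithmetic in a different order, and floating-point addition is not associative. The sketch below is plain C++ and does not use FFTW; it only illustrates that summing the same values in two orders can give two different answers. The function names are made up for the example.

```cpp
#include <vector>

// Adds the values from first to last.
double sum_left_to_right(const std::vector<double>& values) {
  double sum = 0.0;
  for (double v : values) {
    sum += v;
  }
  return sum;
}

// Adds the same values from last to first.
double sum_right_to_left(const std::vector<double>& values) {
  double sum = 0.0;
  for (auto it = values.rbegin(); it != values.rend(); ++it) {
    sum += *it;
  }
  return sum;
}
```

For the values {1e17, -1e17, 1.0}, the left-to-right sum is 1.0 while the right-to-left sum is 0.0, because 1.0 is lost when it is added to -1e17 first (assuming IEEE double arithmetic without -ffast-math). In a real convolution the effect would be far smaller, typically in the last bits, but that is enough to move a pixel across a strict threshold.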

If you want to verify this in the meantime, you can go to BackgroundConvolution.cpp and disable BgDFTConvolutionImageSource.

mkuemmel commented 4 years ago

I switched off some advanced compiler flags ("-O3 -ffast-math") in cmake/ElementsBuildFlags.cmake. The resulting flags, e.g. in ./build.x86_64-co7-gcc48-o2g/SEMain/CMakeFiles/SEMain.dir/flags.make, are then: CXX_FLAGS = -fmessage-length=0 -pipe -Wall -Wextra -Werror=return-type -pthread -Wpedantic -Wwrite-strings -Wpointer-arith -Woverloaded-virtual -Wno-long-long -Wno-unknown-pragmas -fPIC -ansi -std=c++11 -Wno-deprecated -Wno-empty-body -O2 -g -DNDEBUG -fPIC

But that does not change anything. The thresholded images of different runs are not identical and lead to differing segmaps.

mkuemmel commented 4 years ago

> I can't test right now because my CPU is on fire running nnpz, but I have a hypothesis: the convolution. If the segmentation filter is > 5 pixels in width, a fast Fourier transform is used.
>
> We use fftw_plan_many_dft with the flag FFTW_MEASURE. This might introduce some non-deterministic behavior if different plans are picked between runs.
>
> Convolution is done tile per tile (extending into the neighboring tiles to fill the padding), so that would explain the imprint.
>
> If you want to verify this in the meantime, you can go to BackgroundConvolution.cpp and disable BgDFTConvolutionImageSource.

That's an excellent idea, since I am using large filters.

mkuemmel commented 4 years ago

Some more findings from this morning:

1. The differences I found yesterday were on a small, very busy EDEN2.0 VM with not much RAM.
2. Since this morning I have been running tests on a rather potent EDEN2.0 machine at the SDC-DE. An initial test found only 5 differing pixels (in 19k x 19k pixels). Since then I have run about 10 further tests, and the resulting thresholded images are now identical.
3. The 10 test images from 2. were produced with different tile-size and tile-memory settings, but the thresholded images are identical.

mkuemmel commented 4 years ago

Yep, that's it.

I switched the FFT convolution off and then I get identical threshold images in subsequent runs. I switched it back on, and then the differences appear again.

Is there a way to make the FFT deterministic?