astrorama / SourceXtractorPlusPlus

SourceXtractor++, the next generation SExtractor
https://astrorama.github.io/SourceXtractorPlusPlus/
GNU Lesser General Public License v3.0

SEPP 0.16 fades away #438

Closed: mkuemmel closed this issue 7 months ago

mkuemmel commented 2 years ago

I updated the MER PF to use v0.16. It turns out that the first VIS image run in a PPO created a problem.

There is no error; SEPP just fades away: it does not produce any more objects and goes down to 0% CPU.

The image does run with v0.12 and v0.15 (I assume also with v0.13 and v0.14).

Whether running in EDEN-2.1 or in an anaconda distribution, the last log message comes at 25.46% segmentation and then nothing happens for hours. There is a fairly large assembly of bright stars in the lower right of that image that is likely related to the issue: the objects stored in the preliminary catalog almost enclose that blob.

I put the necessary data and a shell script running SE++ on irods at: /euclid-fr/home/pfmerwrite/SEPP/fadeAway

mkuemmel commented 2 years ago

I created a cutout image around that blob. Running on that one results in the error:

2022-01-19T10:18:31CET SourceXtractor FATAL : boost::bad_any_cast: failed conversion using boost::any_cast

Another, even smaller cutout runs through.

I uploaded the cutout images. The error happens after ~7mins.

marcschefer commented 2 years ago

Was grouping off before? Since it's on by default now (as well as multi-thresholding), that could explain changes in behavior. Normally there should be no major changes to the detection functionality in 0.16.

How can I get this test data?

mkuemmel commented 2 years ago

I have been using moffat grouping with all versions.

The test data is on the euclid irods. Details are on: https://euclid.roe.ac.uk/projects/ousdcd/wiki/TestDataHosting

In irods the path is: /euclid-fr/home/pfmerwrite/SEPP/fadeAway

If this is too complicated, I can put the cutouts on wetransfer or similar.

I know that there are no major changes to detection in v0.16. Measurement scheduling was changed in v0.16, though; that would be my guess.

ayllon commented 2 years ago

Going all of a sudden down to 0% makes me think that the worker threads crash. It might be Pyston's fault if you are using model fitting.
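To make the "0% CPU" symptom concrete, here is a minimal, self-contained C++ sketch of that failure mode. This is not SourceXtractor++ code; the queue, worker and type names are purely illustrative. It shows how a worker thread that swallows an exception and exits can leave the consuming side blocked forever on an empty queue, with the process idling at 0% CPU:

```cpp
// Minimal illustration (not SourceXtractor++ code): if the only worker dies
// on an exception, the consumer blocks forever and CPU usage drops to 0%.
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <stdexcept>
#include <thread>

std::queue<int> results;
std::mutex m;
std::condition_variable cv;

void worker() {
  try {
    for (int i = 0; i < 100; ++i) {
      if (i == 3) throw std::runtime_error("fit failed");  // e.g. a bad_any_cast
      {
        std::lock_guard<std::mutex> lock(m);
        results.push(i);
      }
      cv.notify_one();
    }
  } catch (const std::exception& e) {
    // The exception is absorbed; the thread simply stops producing.
    std::cerr << "worker died: " << e.what() << "\n";
  }
}

int main() {
  std::thread t(worker);
  for (int n = 0; n < 100; ++n) {
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [] { return !results.empty(); });  // hangs once the worker is gone
    std::cout << "got " << results.front() << "\n";
    results.pop();
  }
  t.join();
}
```

Run as-is, the sketch prints the first three results and then hangs, which is the same outward behaviour as the "fade away" described above.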

mkuemmel commented 2 years ago

Nope, no model fitting.

mkuemmel commented 2 years ago

I tried two things:

  1. starting from a 'fresh' configuration file (dump a v0.16 config file and take over the old parameters);

  2. changing the Lutz window size to 0 and switching off the max distance for Moffat grouping (a sketch of the corresponding options follows at the end of this comment).

My hope was that the window/distance parameters have an influence. But nope: option 2 has an influence on when it fades away (26.10% segmentation instead of 24.56%), but it still fades away.

In the end I don't think the problem is a 'bad' parameterization.
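For reference, here is a hedged sketch of what test 2 touches in the configuration. The option names below are written from memory of the SE++ configuration, not taken from the thread, and the value used to "switch off" the distance cut is an assumption; check `sourcextractor++ --help` for the exact spelling and semantics in the version in use:

```
# Illustrative only: option names and semantics may differ between versions.

# Lutz segmentation window size; 0 is assumed here to mean "no windowing".
segmentation-lutz-window-size=0

# Moffat grouping with the maximum-distance cut switched off
# (0 is assumed to disable it; the exact value used in the test is not stated).
grouping-algorithm=MOFFAT
grouping-moffat-max-distance=0
```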

marcschefer commented 2 years ago

@mkuemmel I tracked down the any_cast issue to the Moffat model fitting. It was no longer necessary, so I removed it, and now both your tests work on my machine. But I'm not convinced this is what caused your original problem, so if you could please test with that branch (PR #439) and share the results, that'd be great. Thanks.

mkuemmel commented 2 years ago

Well, actually it does work: I used v0.16 and patched it, and now the cutout image and the large image both run through. There is a huge warning blob when compiling, but that is also there with the old line. Was that it?

marcschefer commented 2 years ago

So I guess that was the problem: the Moffat fit failing and then causing an exception due to unexpected garbage when reading the result info.
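For anyone hitting the same message, here is a minimal sketch of that mechanism. It assumes a hypothetical `MoffatResult` type and is not the actual SE++ property code; it only shows how an empty (or wrongly typed) `boost::any` left behind by a failed fit turns into exactly the logged error when it is read back:

```cpp
// Minimal illustration (hypothetical types, not SE++ code) of how a failed
// fit leaving its result slot empty produces the logged bad_any_cast error.
#include <boost/any.hpp>
#include <iostream>

struct MoffatResult { double fwhm; };  // hypothetical result type

int main() {
  boost::any slot;  // fit failed -> the slot was never filled with a MoffatResult
  try {
    auto r = boost::any_cast<MoffatResult>(slot);  // reader expects a MoffatResult
    std::cout << r.fwhm << "\n";
  } catch (const boost::bad_any_cast& e) {
    std::cout << "FATAL : " << e.what() << "\n";  // same text as in the log
  }
}
```

Compiled against Boost, this prints the same "failed conversion using boost::any_cast" text that appears in the log above.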

mkuemmel commented 2 years ago

It's just not clear why in my case it faded away, meaning it did nothing after the initial processing. We'll need to do a 0.16 patch for Elements 6.0.1 and Alexandria 2.23 for the Euclid processing in Elements 3.0. We can correct it there as well.

mkuemmel commented 7 months ago

This version is no longer relevant and the latest version does not show this behaviour.