Preserve the order in the optimized_grouping branch

astrorama / SourceXtractorPlusPlus

SourceXtractor++, the next generation SExtractor

https://astrorama.github.io/SourceXtractorPlusPlus/

GNU Lesser General Public License v3.0


Preserve the order in the optimized_grouping branch #582

Closed mkuemmel closed 1 month ago

mkuemmel commented 1 month ago

The optimized grouping branch (https://github.com/astrorama/SourceXtractorPlusPlus/tree/feature/optimized_grouping) really speeds things up.

On the other hand, it turns out that SE++ performance improves if the processing order corresponds to the order of the objects on the image (better re-use of the image tiles in RAM). The optimized grouping does not preserve any existing order, which eats up some of the performance improvement.

It would be great if the order could be preserved (of course including the grouping).

marcschefer commented 1 month ago

@mkuemmel

Hi Martin, are you sure this is the case?

I think I was a bit confused during the meeting last week. I said we were using a hash table, but I'm actually using the C++ std::map, which is of course not a hash table but a tree structure that keeps its keys in sorted order. So as far as I can tell, with an assoc catalog as input, the output should be in the order of the group_id.
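To illustrate the std::map point, here is a minimal sketch. It is not the SE++ code: `Source`, `groupById` and `iterationOrder` are hypothetical names made up for this example. It only shows that a std::map keyed by group id is iterated in ascending key order, whatever the insertion order was.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a detected source; not an SE++ type.
struct Source {
    int group_id;
    std::string name;
};

// Collect sources into groups keyed by group_id. std::map keeps its keys
// sorted, so iterating it visits the groups in ascending group_id order,
// regardless of the order in which the sources were inserted.
std::map<int, std::vector<Source>> groupById(const std::vector<Source>& sources) {
    std::map<int, std::vector<Source>> groups;
    for (const auto& s : sources) {
        groups[s.group_id].push_back(s);
    }
    return groups;
}

// Return the group ids in the order the map iterates over them.
std::vector<int> iterationOrder(const std::map<int, std::vector<Source>>& groups) {
    std::vector<int> order;
    for (const auto& entry : groups) {
        order.push_back(entry.first);
    }
    return order;
}
```

With a std::unordered_map in place of std::map, the iteration order would be unspecified, which is the behaviour a hash table would have given.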

Split grouping should also keep the order. Moffat grouping is more complex, as groups might be merged, but apart from such merges the order should also stay the same.

mkuemmel commented 1 month ago

No, I am not sure. I think I had some hints that it slows down, but I don't remember anymore what this was based on. And you "admitted" it last week :-)

I can generate a test...

mkuemmel commented 1 month ago

To conclude this: it turns out that SE++ with optimized_grouping sends objects to the fitting in the order of the values in the column passed via --assoc-group-id. So if the numbering in this column follows the desired ordering, such as RA or DEC, then the ordering is preserved.
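As a rough sketch of how such a numbering could be prepared before the catalog is handed to SE++: the struct and function below are hypothetical and not part of SE++. They assume the catalog rows are available in memory and simply renumber the group id column so that ascending ids follow ascending Dec.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical catalog row; the field names are illustrative only.
struct CatalogRow {
    double ra;
    double dec;
    int group_id;
};

// Renumber group_id so that ascending ids follow ascending Dec.
// For simplicity every row gets its own id here; rows that belong to the
// same group would have to share one id instead.
void assignGroupIdsByDec(std::vector<CatalogRow>& rows) {
    // Sort indices rather than rows so the catalog keeps its row order.
    std::vector<std::size_t> idx(rows.size());
    std::iota(idx.begin(), idx.end(), 0);
    std::stable_sort(idx.begin(), idx.end(),
                     [&rows](std::size_t a, std::size_t b) {
                         return rows[a].dec < rows[b].dec;
                     });
    int next_id = 1;
    for (std::size_t i : idx) {
        rows[i].group_id = next_id++;
    }
}
```

The same idea works for RA or any other key; only the comparison in the lambda changes.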

mkuemmel commented 1 month ago

@marcschefer do you think the optimized grouping can be merged?

marcschefer commented 1 month ago

Yes, it seems we didn't find any major problem so if you agree then we can merge.

mkuemmel commented 1 month ago

At the end of the day, there is even an advantage in having the grouping column determine the order in which objects are fed to the fitting: it can be used to control the process, for example by mixing large groups in among the individual objects.

mkuemmel commented 1 month ago

Could you already merge the branch @marcschefer?

Closing this one here.