SWIFTSIM / HBTplus

HBTplus halo finder adapted for the FLAMINGO and COLIBRE simulations

Output descendant information for halos which become unresolved without sinking #22

Closed jchelly closed 2 months ago

jchelly commented 3 months ago

It's possible for a halo track to fall below the minimum number of bound particles without ever satisfying the condition to have merged with another track. For building merger trees we need to identify a descendant for these objects. Simply assuming that they merge with their parent halo doesn't seem to be reliable.

This pull request modifies the code to compute descendants for these objects as follows:

With this change all tracks which become orphans should have a descendant listed in either SinkTrackId or DisruptTrackId.

jchelly commented 3 months ago

A possible problem with this implementation: I'm assuming that particles are in binding energy order just prior to unbinding. Might not be true, for centrals at least, in which case I need to identify the tracer IDs at an earlier point in the code.

jchelly commented 3 months ago

This will now store tracer IDs for all subhalos. The IDs are stored just after AssignHosts, so subhalos should have been moved to their final MPI rank but the particle ordering from the previous snapshot has not been changed yet.

jchelly commented 3 months ago

This should be ready to test now. For all subhalos which were resolved at the previous snapshot it outputs a DescendantTrackId which indicates which resolved subhalo in the current snapshot received the largest number of tracer particles.

jchelly commented 3 months ago

Initial checks on the small COLIBRE test look ok: of subhalos which remain resolved between two snapshots the DescendantTrackId points at the same TrackId in 99.8% of cases. Of those which sink at the later snapshot, DescendantTrackId==SinkTrackId in 99.2% of cases.

jchelly commented 3 months ago

I've done some checks on L1000N0900/DMO_FIDUCIAL, comparing the DescendantTrackId output by this branch with descendants computed in postprocessing in Python. In most cases the results agree:

Fraction of mismatches among all bound objects = 0.0003962594579342005
Fraction of mismatches among objects which became unresolved = 0.0016803078757656742
Fraction of mismatches among objects which became unresolved but didn't sink = 0.001975717791020015

So about 1 in 500 objects which become unresolved without sinking (the case where this output is useful) differ.

In the Python code I'm using the read routine from SOAP, which doesn't allow one particle to belong to multiple halos. I think that might account for some of the differences. It will also behave differently when the tracer particles are evenly split between two possible descendants.

jchelly commented 3 months ago

I've written a Python script to reproduce the descendant identification exactly in the DMO case. It takes the first N most bound particles from each subhalo in one snapshot and identifies the subhalo in the next snapshot which contains the largest number of those particles. Ties are broken by taking the lower TrackId and orphans are ignored.

I've run this test on all of the snapshots of L1000N0900/DMO_FIDUCIAL and the output from the python script exactly matches the output from this branch.
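
For illustration, the matching rule described above can be sketched roughly as follows. This is not the script used for the test, only a minimal sketch under stated assumptions: the function names, the dictionary-based inputs, the value of `n_tracers` and the `-1` "no descendant" sentinel are all hypothetical, and the caller is assumed to pass only resolved subhalos for the later snapshot so that orphans are ignored.

```python
from collections import Counter


def invert_membership(resolved_next):
    """Map particle ID -> list of TrackIds of the resolved subhalos containing it.

    resolved_next: dict TrackId -> list of particle IDs at the later snapshot.
    Assumption: orphans have already been excluded by the caller.
    A particle is allowed to appear in more than one subhalo.
    """
    membership = {}
    for track_id, particle_ids in resolved_next.items():
        for pid in particle_ids:
            membership.setdefault(pid, []).append(track_id)
    return membership


def find_descendants(tracers_by_track, resolved_next, n_tracers):
    """Return dict TrackId -> descendant TrackId (or -1 if no tracer was found).

    tracers_by_track: dict TrackId -> particle IDs at the earlier snapshot,
    assumed to be sorted with the most bound particle first.
    """
    membership = invert_membership(resolved_next)
    descendants = {}
    for track_id, particle_ids in tracers_by_track.items():
        counts = Counter()
        # Only the first n_tracers most bound particles are used as tracers.
        for pid in particle_ids[:n_tracers]:
            for candidate in membership.get(pid, ()):
                counts[candidate] += 1
        if counts:
            # Largest tracer count wins; ties go to the lower TrackId.
            descendants[track_id] = min(counts, key=lambda t: (-counts[t], t))
        else:
            descendants[track_id] = -1  # hypothetical "no descendant" sentinel
    return descendants


# Toy data: track 2 splits its tracers evenly between tracks 1 and 2.
tracers_by_track = {1: [10, 11, 12, 13], 2: [20, 21, 22, 23], 3: [30, 31], 6: [60]}
resolved_next = {1: [10, 11, 12, 20, 21], 2: [22, 23], 5: [30], 4: [31]}
descendants = find_descendants(tracers_by_track, resolved_next, n_tracers=4)
```

In the toy data, track 2 has two tracers in each of tracks 1 and 2, so the tie-break selects track 1, and track 6 has no tracers in any resolved subhalo, so it gets the sentinel.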

jchelly commented 3 months ago

I think this is in a state where I'd be prepared to try to use it on L2800N10080/DMO_FIDUCIAL. If it's too slow or memory hungry we can comment out the new function calls and try to generate the descendant information in postprocessing.

jchelly commented 2 months ago

Here are timings (in seconds) for various parts of the calculation in L1000N1800/DMO_FIDUCIAL at snapshot 71:

snap_io=30.028
snapshot_exchange=36.287
snapshot_hash=6.232
halo_io=27.936
halo_comms=15.561
update_halo_particles=0.001
update_subhalo_particles=28.476
assign_hosts=13.960
unbind=69.506
merge=2.239
update_tracks=2.755
mergertree=6.860
write_subhalos=36.124

So the cost of the merger tree calculation (~7 seconds) is not negligible but it's a small fraction of the total.
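
As a quick check of that fraction, the timings listed above can be summed directly. This sketch assumes the listed timers do not overlap and together account for essentially all of the time spent on the snapshot, which the output above does not confirm.

```python
# Timings in seconds, copied from the list above (L1000N1800/DMO_FIDUCIAL, snapshot 71).
timings = {
    "snap_io": 30.028,
    "snapshot_exchange": 36.287,
    "snapshot_hash": 6.232,
    "halo_io": 27.936,
    "halo_comms": 15.561,
    "update_halo_particles": 0.001,
    "update_subhalo_particles": 28.476,
    "assign_hosts": 13.960,
    "unbind": 69.506,
    "merge": 2.239,
    "update_tracks": 2.755,
    "mergertree": 6.860,
    "write_subhalos": 36.124,
}

total = sum(timings.values())
mergertree_fraction = timings["mergertree"] / total

print(f"total = {total:.3f} s")
print(f"mergertree = {100 * mergertree_fraction:.1f}% of the summed timers")
```

Under that assumption the merger tree step is about 2.5% of the roughly 276 seconds accounted for by these timers, compared with about 25% for unbinding.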

VictorForouhar commented 2 months ago

All looks good now!