VictorForouhar closed 3 months ago
I have checked that the current implementation yields the correct host halo decisions in MPI and OMP runs, using the post-processing testing script.
The AssignHosts() function (in master, not just here) seems to be implemented by looping over all MPI ranks and doing a broadcast for each one to find all of the particles it needs. That might not scale so well. Given the time pressure maybe we don't want to change it unless it becomes a bottleneck, but if we do need to then the LocateValuesById() function from #22 could be used to import FOF membership information for the required particle IDs in a way that should scale better.
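To make the scaling point concrete, here is a rough serial sketch of the general idea of routing each lookup to a single owning rank instead of broadcasting from every rank in turn. I have not reproduced how LocateValuesById() actually works; the `id % n_ranks` ownership rule and the name `BucketRequestsByOwner` are assumptions made up for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using ParticleId = std::int64_t;

// Group the particle IDs this rank needs by the rank assumed to own them.
// Ownership rule (hypothetical): owner = id % n_ranks, with non-negative IDs.
// In an MPI run each bucket would become one send buffer of a single
// all-to-all exchange (e.g. MPI_Alltoallv), rather than one broadcast per rank.
std::vector<std::vector<ParticleId>>
BucketRequestsByOwner(const std::vector<ParticleId> &needed_ids, int n_ranks)
{
  assert(n_ranks > 0);
  std::vector<std::vector<ParticleId>> buckets(static_cast<std::size_t>(n_ranks));
  for (const ParticleId id : needed_ids)
  {
    assert(id >= 0);
    buckets[static_cast<std::size_t>(id % n_ranks)].push_back(id);
  }
  return buckets;
}
```

With this pattern each rank only sends and receives the IDs it actually needs, so the communication volume follows the number of requested particles rather than the number of ranks times the full particle list.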
It will certainly be better if we have something that scales better. For my tests in L1000N0900, I/O is still the main time sink (from timings.log):
77 77 141.588 0.000 11.938 5.138 7.178 1.156 0.932 2.893
Here are other timings from an older version (note first test used 24 MPI ranks and this one 20):
77 77 169.924 0.000 15.840 5.034 7.614 0.496 0.953 3.484
It looks like this sets HostHaloId=-1 for zero particle objects. We'll need to do something about that once #23 is merged, but maybe that can be a separate pull request.
I already have a workaround for that in the new branch: https://github.com/SWIFTSIM/HBTplus/blob/182e11dabcbae5ecece4ed6802cb8e1f6907771c/src/subhalo_tracking.cpp#L190 If we have a zero-sized subhalo, it will fill in the first entry of TracerParticleIds with the value of sub.MostBoundParticleId. In my ongoing tests, it seems to work.
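For readers who do not want to follow the link, the workaround amounts to something like the sketch below. This is not the code from the branch: `SubhaloSketch` and `FillTracers` are hypothetical stand-ins, and only the field names `TracerParticleIds` and `MostBoundParticleId` are taken from the comment above.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, stripped-down stand-in for the real subhalo type.
struct SubhaloSketch
{
  std::int64_t MostBoundParticleId = -1;
  std::vector<std::int64_t> Particles;         // assumed sorted, most bound first
  std::vector<std::int64_t> TracerParticleIds; // tracers used to find the host
};

// Pick up to n_tracers of the most bound particles as tracers. A zero-sized
// subhalo has no particles left, so fall back to its stored most bound ID.
void FillTracers(SubhaloSketch &sub, std::size_t n_tracers)
{
  sub.TracerParticleIds.clear();
  if (sub.Particles.empty())
  {
    sub.TracerParticleIds.push_back(sub.MostBoundParticleId);
    return;
  }
  const std::size_t n = std::min(n_tracers, sub.Particles.size());
  sub.TracerParticleIds.assign(sub.Particles.begin(),
                               sub.Particles.begin() + static_cast<std::ptrdiff_t>(n));
}
```

The point is just that a zero-sized subhalo still carries one usable tracer, so it can be given a host instead of being left at -1.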
New method of tracing where subhaloes end up, which uses more than one tracer particle. Intended to solve issues relating to duplicate/masked-out subhaloes, as they are largely caused by (individual) most bound particles becoming hostless.
The merit function is based on GADGET-4: each candidate FOF is scored by a weighted sum over the FOF hosts of the most bound particles. This can still result in hostless haloes if most of the subhalo core ends up outside any FOF group.
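A minimal sketch of the scoring described above, to illustrate the idea rather than document the implementation. The weight `1 / (1 + rank)` is an assumption chosen for illustration (the weighting actually used is not stated here), and `ChooseHostByMerit`, `HaloId` and `kNoHost` are hypothetical names. Hostless tracers are treated as one more candidate with id -1, which is how a subhalo can still end up hostless.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

using HaloId = std::int64_t;
constexpr HaloId kNoHost = -1; // tracer is not in any FOF group

// tracer_hosts[i] is the FOF host of the i-th most bound tracer particle.
// Returns the candidate with the largest weighted score, or kNoHost when
// there are no tracers or the hostless tracers carry the most weight.
HaloId ChooseHostByMerit(const std::vector<HaloId> &tracer_hosts)
{
  std::unordered_map<HaloId, double> score;
  for (std::size_t rank = 0; rank < tracer_hosts.size(); ++rank)
  {
    // Assumed weighting: more bound tracers (lower rank) count for more.
    score[tracer_hosts[rank]] += 1.0 / (1.0 + static_cast<double>(rank));
  }

  HaloId best = kNoHost;
  double best_score = -1.0;
  for (const auto &entry : score)
  {
    const bool better = entry.second > best_score;
    // Break exact ties towards the smaller id so the result is deterministic.
    const bool tie = entry.second == best_score && entry.first < best;
    if (better || tie)
    {
      best = entry.first;
      best_score = entry.second;
    }
  }
  return best;
}
```

For example, with these weights the hosts `{5, -1, -1}` give FOF 5 a score of 1.0 against about 0.83 for hostless, so the subhalo is assigned to FOF 5, whereas `{-1, 5, 5}` leaves it hostless because the most bound tracer dominates.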