Open aphearin opened 7 years ago
Resolving this issue will also resolve https://github.com/bccp/nbodykit/issues/502
I think a plausible, easy way of doing this is to ensure each MPI rank holds a spatially localized domain -- then reuse the single-node code on each rank.
Here is an object that helps you to distribute objects to domains:
https://github.com/rainwoodman/pmesh/blob/master/pmesh/domain.py#L274
And we were using it here:
https://github.com/bccp/nbodykit/blob/master/nbodykit/base/decomposed.py#L3
and here
https://github.com/bccp/nbodykit/blob/master/nbodykit/algorithms/pair_counters/domain.py#L113
You can probably write a better version of this on your own, or jump-start your development with domain.py and _domain.pyx.
Those models that need particles may need to use the 'smoothing' argument of https://github.com/rainwoodman/pmesh/blob/master/pmesh/domain.py#L515
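To make the idea above concrete, here is a minimal single-process sketch of the spatial decomposition step: cut the periodic box into a regular grid of subvolumes, one per rank, and assign each particle to the rank owning its subvolume. The function name and signature are hypothetical, purely for illustration -- in practice you would use pmesh's domain machinery (linked above) rather than this toy.

```python
import numpy as np

def assign_to_domains(pos, boxsize, nranks_per_dim):
    """Toy stand-in for a spatial domain decomposition.

    The box [0, boxsize)^3 is cut into nranks_per_dim**3 equal
    subvolumes, one per MPI rank; each particle is assigned the
    rank id of the subvolume containing it.
    (Hypothetical helper, not the pmesh API.)
    """
    pos = np.asarray(pos)
    cell = boxsize / nranks_per_dim
    # Integer cell index along each axis; clip guards particles
    # sitting exactly on the upper box edge.
    idx = np.clip((pos // cell).astype(int), 0, nranks_per_dim - 1)
    # Flatten the 3d cell index into a single rank id.
    return (idx[:, 0] * nranks_per_dim + idx[:, 1]) * nranks_per_dim + idx[:, 2]
```

Each rank would then receive only its own particles (e.g. via an MPI alltoall exchange) and run the existing single-node code unchanged on that subset.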
@rainwoodman - thanks a lot for the pointers. A spatial domain decomposition is indeed what I thought best for this problem. The only difference is that I have been using a buffer region of width rmax (the largest pair-counting distance) around each domain. It looks like you handled this without that feature, but perhaps I read it too quickly?
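The buffer-region idea can be sketched in one dimension: each rank keeps the particles it owns, plus "ghost" copies of particles within rmax of its boundaries, so every pair closer than rmax involving an owned particle can be counted locally. This is a hypothetical illustration, not halotools or pmesh code; pmesh expresses the same thing through the `smoothing` argument of its decompose method.

```python
import numpy as np

def buffered_members(pos, lo, hi, rmax):
    """Masks for a 1d domain [lo, hi) and its rmax-wide buffer.

    `owned` marks particles the domain is responsible for;
    `buffered` additionally includes ghost particles within rmax
    of either boundary. (Hypothetical helper for illustration.)
    """
    x = np.asarray(pos)
    owned = (x >= lo) & (x < hi)
    buffered = (x >= lo - rmax) & (x < hi + rmax)
    return owned, buffered
```

Pair counting on a rank would then loop over `owned` particles against `buffered` ones, guaranteeing no pair shorter than rmax is missed at a domain edge.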
I actually don't really know what Nick did for this. It looked like he did the same, but he could have been assuming the full catalog fits into a single rank here as well. It is just a few lines of changes, though.
My bigger concern was the mock-population part. What's your plan for this? Are there models that don't work with this decomposition?
The mock-population part is truly trivially parallelizable - no models in the entire library would be impacted by the decomposition. However, the reason this feature actually requires a very significant rewrite is that, to fully take advantage of the parallelism, the summary-statistic kernels need to be computed on the subvolumes, and only the results reported to rank 0, which collects things like subvolume pair counts, sums them, and converts them into the tpcf.
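The collect-and-convert step described above can be sketched as follows: rank 0 sums the per-subvolume DD pair counts and turns them into a correlation function via the natural estimator xi = DD/RR - 1, with RR computed analytically for a periodic box. All names here are illustrative, not halotools/nbodykit API; in MPI terms the summation would come from something like mpi4py's `comm.reduce(local_dd, op=MPI.SUM, root=0)`.

```python
import numpy as np

def combine_subvolume_counts(per_rank_dd, n_tot, boxsize, rbins):
    """Combine per-subvolume DD pair counts into a tpcf estimate,
    as rank 0 would do after an MPI reduce.

    Uses the natural estimator xi = DD/RR - 1, with RR evaluated
    analytically for n_tot points in a periodic box of side boxsize.
    (Hypothetical helper for illustration.)
    """
    dd = np.sum(per_rank_dd, axis=0).astype(float)  # global DD per bin
    vol = boxsize ** 3
    # Volume of each spherical shell defined by the radial bin edges.
    shell_vol = 4.0 / 3.0 * np.pi * np.diff(np.asarray(rbins) ** 3)
    # Expected random pair counts per bin at the same number density.
    rr = n_tot * (n_tot - 1) * shell_vol / vol
    return dd / rr - 1.0
```

Because only these per-bin count arrays cross the network, the communication cost is negligible compared with the pair counting itself.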
Implement a halo catalog that can be distributed across nodes of a large cluster using MPI