gandalfcode / gandalf

GANDALF (Graphical Astrophysics code for N-body Dynamics And Lagrangian Fluids)
GNU General Public License v2.0
44 stars 12 forks source link

MPI load balancing is still done via brute force #77

Closed giovanni-rosotti closed 7 years ago

giovanni-rosotti commented 7 years ago

After the new domains have been decided, particles need to be re-assigned to the correct domain. This is currently computed by the function FindParticlesToTransfer in HydroTree, called by LoadBalancing in MpiControl. However this function does it via brute-force (i.e. loops over every particle and for each particle loops over all the domains). It seems to me that the function using the tree is already there (FindMpiTransferParticles) in HydroTree, but it is never called. The interface is the same so it really looks like ready to be swapped in. What is going on? Did we just forget to call it, or did it have a fundamental problem?

giovanni-rosotti commented 7 years ago

The tree version of the routine looks fine to me and in the branch load_balancing I've switched to it. It's mostly fine, but there is an horrible edge case which breaks it. It happens when there is a particle exactly at a corner between domains, so the particle belongs to different domains. Previously we had a break in the loop over domains, so the particle was going to be sent only to one domain. But now we can no longer do that - how do we deal with this case? Should we set a rule that the particle in this case belongs to, say, the domain on the left? On a related note, how does the tree deal with this?

rbooth200 commented 7 years ago

I believe it's up to the tree to decide how it deals with these edge cases. They should be very rare in the Oct Tree, but with the KDTree the cell is split based on the number of particles in each cell, which means that if multiple particles fall on the boundary you can end up with some on one side and some the other...

I'm happy with any solution that breaks the degeneracy. Realistically this should be using a criterion like left < x <= right, or left <= x < right. I don't care which.

However, we might have to be careful now with the edges of the simbox. For open boundaries this should be well defined and for reflecting we should hopefully not have particles exactly on the boundary as this might cause problems. For periodic boundaries, I think we need to make sure that the box wrapping is done in exactly the same way

giovanni-rosotti commented 7 years ago

I agree, but I think I've stumbled upon a more foundamental problem with the simbox, namely that the MPI domains are not enforced to extend until the end of the simbox (see #117, but I din't realise it affects also periodic boundaries). There is a range between the left edge of the simbox and the leftmost particle which is not owned by any domain... I think the simulation will still run correctly, but clearly we do not want this.

giovanni-rosotti commented 7 years ago

So after having thought about it my understanding is that the KDTree doesn't care about these problems, because as you say it splits only based on the number of particles. The BoxOverlap function returns true even if the boxes just share an edge (i.e. the overlapping area is zero), so I think we're already conservative, even for the periodic case (which uses BoxOverlap under the hood). The OctTree doesn't work anyway...

I've modified ParticleInBox and I think it's now time that you give a look at it - I'll open a pull request. There is already a separate issue for the problems with simbox so we don't need to fix that here.

rbooth200 commented 7 years ago

Fixed in #121