pgrete commented 6 months ago

When and where should I call GetNeighborBlockIndex in downstream codes?

The function internally updates the blockIndex_(n) so I'm wondering if the Parthenon comm machinery assumes that this index is always correct or if it's updated automatically before doing comm.

In the former case, would it be worth to hide that machinery from downstream codes?

ping @brryan @Yurlungur @AstroBarker

brryan commented 6 months ago

GetNeighborBlockIndex needs to be called in downstream codes: 1) Before particle communication, because like you say it updates blockIndex_(n) which is used to determine if, and if so to which neighbor meshblock, particles are sent. 2) Before iteratively updating particles, because you aren't allowed to move particles beyond a neighbor mesh block. For example, with a loop

par_for() {
  while (particle_t(n) < t + dt) {
    particle_x(n) += dx_cell;
    particle_t(n) += dx_cell / speed;

    bool on_current_mesh_block = true;
    swarm_d.GetNeighborMeshBlock(x, y, z, on_current_mesh_block);
    if (!on_current_mesh_block) {
      // Stop transporting until this particle is communicated in a subsequent kernel
      break;
    }
  }
}

We need to check each iterative update of the particle position to see if we have left the meshblock, otherwise the particle could move many meshblocks before it gets to t + dt and particles only support communication to neighbors. Similar issue for entering a physical boundary condition. Since we already need to do this here, blockIndex_ is not separately updated before communication.

I'm not very happy with the need for downstream codes to explicitly update an internal variable blockIndex_ this way, but I don't have a better solution. One could maybe somehow protect particle x, y, z variables and provide a function UpdatePosition(dx, dy, dz) that internally also updates blockIndex_ but this might be effectively doing the same thing. I'm happy to discuss if you have thoughts on improving this.

pgrete commented 6 months ago

I see. Thanks for the info.

In that case it might be worth to expose a task that does specifically that (so that the interface is clear) or a combined task that first updates the info and then sends (similar to the higher level tasks for the other boundary comm).

Right now (from existing code) I got the impression that is has to be called at the end of certain kernels and only inferred the relevance to communication.

brryan commented 5 months ago

For my recent PR #1026 I am moving boundary communication for swarms towards the patterns we are using for fields. It seems likely we will end up with a AddSwarmBoundaryExchangeTasks that could default to checking particles for which meshblock they currently belong to (active or ghost neighbors) which would remove the need for checking on_current_mesh_block for applications that don't move particles more than one cell between communication steps (unlike my example above).

AddSwarmBoundaryExchangeTasks could also optionally not perform that check to avoid redundant work for applications that already use GetNeighborMeshBlock for checking whether to stop iterative particle updates between communication.

I'm hesitant to remove GetNeighborMeshBlock entirely because it seems reasonable to me for parthenon to provide the functionality to determine whether a particle is on a given meshblock. But my above suggestion would get to a point where we don't need that call for every kind of particle implementation.

Does that seem like a good solution to you (I think it's exactly your second suggestion but I wanted to confirm)?

pgrete commented 5 months ago

I'm happy confirm that this kind of solution is what I had in mind.

brryan commented 4 months ago

I thought about this and convinced myself it is fine to do for particle implementations where you aren't iterating over particle pushes during a single timestep, such that particles can't possibly move further than one neighboring meshblock before the send/receive communication is applied. If particles are moving just once per timestep it's fine to call GetNeighborBlockIndex under the hood, and if particles are moving many cells per timestep they will presumably have called GetNeighborBlockIndex many times already to check their positions, so one more call shouldn't be a big deal.

1026 addresses this.

No more required GetNeighborBlockIndex in particle leapfrog example

The extra kernel launch where GetNeighborBlockIndex is called under the hood -- hopefully this can be merged with the subsequent work once I calculate that on device with prefix sums.

parthenon-hpc-lab / parthenon

Use of `GetNeighborBlockIndex` in downstream codes #1013

1026 addresses this.