nest / nest-simulator

The NEST simulator
http://www.nest-simulator.org
GNU General Public License v2.0

0 GetStatus on a single rank desynchronizes GRNG #937

Closed heplesser closed 6 years ago

heplesser commented 6 years ago

Nilton Kamiji reported this first on NEST User on 24 March 2018.

To reproduce, run the script

Rank 0 eq { 0 GetStatus } if
1 Simulate

with at least two MPI processes. It will fail with

Apr 25 08:19:50 SimulationManager::prepare [Error]: 
    Global Random Number Generators are not synchronized prior to simulation.

It passes when run with a single MPI process.

If both ranks individually run 0 GetStatus, the script passes:

Rank 0 eq { 0 GetStatus } if
Rank 1 eq { 0 GetStatus } if
1 Simulate

The problem is most likely that the 0 GetStatus somewhere reads a number from the GRNG, although I cannot see any reason why it should do so.

Note: One should never perform operations on a single or a subset of ranks, since this will quite likely upset NEST's parallelization logic. But a pure read operation such as GetStatus should be safe.
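The suspected failure mode can be illustrated without NEST at all. Below is a pure-Python sketch (not NEST code; the seed and variable names are illustrative) of why a single rank consuming one extra number from its copy of an identically seeded global RNG desynchronizes the streams:

```python
import random

# Two ranks each hold an identically seeded copy of the "global" RNG.
grng_rank0 = random.Random(12345)
grng_rank1 = random.Random(12345)

# Suppose an operation performed on rank 0 alone consumes one number
# from its copy of the generator.
_ = grng_rank0.random()

# From this point on the two streams disagree, so any cross-rank
# decision based on the GRNG would diverge between the ranks.
print(grng_rank0.random() == grng_rank1.random())  # False: streams diverged
```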

hakonsbm commented 6 years ago

This is not a problem with 0 GetStatus reading a number from GRNG, but rather a parallel computing problem.

When a script is run with, for example, two MPI processes and 0 GetStatus is called only on process 0, that process will, while accumulating the status dictionary, want to update the delay extrema and thus has to communicate with the other processes. This communication is done in the following line:

https://github.com/nest/nest-simulator/blob/79bbbb08d0ef209f0c8295339abc3c137a321137/nestkernel/mpi_manager.cpp#L665

This causes it to deadlock. However, the other process, process 1, will go into Simulate, and there it too will update delay extrema, resolving this deadlock. Then process 1 will continue to the check for synchronized GRNGs, where it will try to gather random numbers from all processes with MPI_Allgather, done in the following line:

https://github.com/nest/nest-simulator/blob/79bbbb08d0ef209f0c8295339abc3c137a321137/nestkernel/mpi_manager.cpp#L721

This creates a new deadlock.

Process 0 has at this point returned from the GetStatus call, goes into Simulate, and again tries to update delay extrema. In doing so, it will also call MPI_Allgather. This is what process 1 is waiting for, but it will not receive the random number from process 0. Rather, it receives the minimum delay, or possibly even a garbage number, which of course does not equal its own random number. Therefore process 1 throws the error.
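The mismatch can be made concrete with a schematic pure-Python model of the two ranks' collective-call sequences (this is not NEST code; the tags are illustrative labels for the operations described above, not NEST identifiers):

```python
# Each rank's sequence of collective (Allgather-style) operations.
rank0_calls = [
    ("Allgather", "delay_extrema"),  # triggered inside 0 GetStatus
    ("Allgather", "delay_extrema"),  # extrema update at the start of Simulate
    ("Allgather", "grng_value"),     # GRNG synchrony check in Simulate
]
rank1_calls = [
    ("Allgather", "delay_extrema"),  # extrema update at the start of Simulate
    ("Allgather", "grng_value"),     # GRNG synchrony check in Simulate
]

# MPI matches collectives purely by order: the i-th collective on one rank
# pairs with the i-th on the other.  Walk the sequences in lockstep and
# report the first pair whose payloads disagree.
for i, (c0, c1) in enumerate(zip(rank0_calls, rank1_calls)):
    if c0 != c1:
        print(f"call {i}: rank 0 sends {c0[1]!r}, rank 1 expects {c1[1]!r}")
        break
```

The model also shows that rank 0's third collective has no partner at all on rank 1, which corresponds to the renewed deadlock once rank 1 has aborted with the synchrony error.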

heplesser commented 6 years ago

@hakonsbm Thank you for the detailed analysis!

I do not see any technical way in which we could guard against this type of problem: MPI requires by definition that all ranks stay in step, so that MPI communication operations on different ranks match each other. It is also impossible, in all generality, to detect that a user is performing a specific operation only on a single rank and prevent that directly. The check for GRNG synchrony is a relatively easy-to-implement and sensitive test for users trying to work around NEST's built-in parallelization.
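The idea behind the synchrony test can be sketched as follows (a schematic pure-Python version, not NEST's actual implementation; the function name and error text are illustrative): gather one number from each rank's copy of the global RNG, as an Allgather would, and require that all ranks produced the same value.

```python
import random

def check_grng_synchrony(grngs):
    """Schematic synchrony test: draw one value from each rank's copy of
    the global RNG (as an Allgather would collect them) and require that
    every rank produced the same value."""
    gathered = [g.random() for g in grngs]
    if any(v != gathered[0] for v in gathered):
        raise RuntimeError(
            "Global Random Number Generators are not synchronized "
            "prior to simulation.")

# Synchronized copies pass the check ...
check_grng_synchrony([random.Random(7), random.Random(7)])

# ... but one extra draw on a single "rank" makes it fail.
desynced = [random.Random(7), random.Random(7)]
desynced[0].random()  # rank 0 alone consumes a number
try:
    check_grng_synchrony(desynced)
except RuntimeError as e:
    print(e)
```

Note that the check itself consumes one number on every rank, so it leaves generators that were in step still in step.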

The documentation of Rank explicitly states that "[i]t is highly discouraged to use this function to write rank-dependent code in a simulation script as this can break NEST in funny ways of which dead-locks are the nicest."

I am therefore closing this issue as wontfix.