The simulations freeze when the number of neurons in a population is lower than the number of MPI processes

RCagnol commented 1 year ago

I noticed that in PyNN 0.10.1 simulations freeze when the number of MPI processes exceeds the number of neurons in a population. This did not happen in PyNN 0.10.0.

For example, the following code freezes when run with more than 2 MPI processes:

import pyNN.nest as sim

# Freezes when run with more MPI processes than n_neurons
n_neurons = 2
node_id = sim.setup(timestep=0.1, min_delay=0.1, max_delay=100)
neurons = sim.Population(n_neurons, sim.IF_cond_exp())
neurons.record(['spikes', 'v', 'gsyn_exc', 'gsyn_inh'])
sim.run(100.0)
# The freeze occurs while the recorded data are being collected
block = neurons.get_data(['spikes', 'v', 'gsyn_exc', 'gsyn_inh'], clear=True)

The MPI processes to which no neurons are assigned seem to reach the call to gather here: https://github.com/NeuralEnsemble/PyNN/blob/master/pyNN/recording/__init__.py#L73

The MPI processes to which some neurons are assigned, on the other hand, seem to freeze before that, when trying to access self._simulator.state.t in different parts of the _get_current_segment() method (https://github.com/NeuralEnsemble/PyNN/blob/master/pyNN/recording/__init__.py#L269).

I didn't manage to figure out exactly what is causing this, but, in the same method, moving the following lines:

t_start = self._recording_start_time
t_stop = self._simulator.state.t * pq.ms
sampling_period = self.sampling_interval * pq.ms
current_time = self._simulator.state.t * pq.ms

before the if signal_array.size > 0: condition (which is where they were in PyNN 0.10.0) seems to solve the issue.
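To illustrate why the position of those lines could matter, here is a minimal sketch in plain Python. It does not use PyNN or NEST: CountingState, segment_times_guarded and segment_times_hoisted are hypothetical stand-ins, and the three "ranks" are just loop iterations. It only shows that reading state.t inside the size check makes the processes disagree on whether the read happens at all, whereas reading it before the check does not.

```python
class CountingState:
    """Hypothetical stand-in for the simulator state; counts reads of `t`."""

    def __init__(self, t=100.0):
        self._t = t
        self.reads = 0

    @property
    def t(self):
        self.reads += 1
        return self._t


def segment_times_guarded(state, signal):
    """Read `state.t` only when there is data (ordering reported for 0.10.1)."""
    if len(signal) > 0:
        return state.t
    return None


def segment_times_hoisted(state, signal):
    """Read `state.t` first, then branch (ordering reported for 0.10.0)."""
    t_stop = state.t
    if len(signal) > 0:
        return t_stop
    return None


# Three pretend ranks; the third holds no neurons, so its signal is empty.
signals_per_rank = [[-65.0, -64.8], [-65.0, -64.9], []]

guarded_reads = []
hoisted_reads = []
for signal in signals_per_rank:
    guarded_state, hoisted_state = CountingState(), CountingState()
    segment_times_guarded(guarded_state, signal)
    segment_times_hoisted(hoisted_state, signal)
    guarded_reads.append(guarded_state.reads)
    hoisted_reads.append(hoisted_state.reads)

print("guarded:", guarded_reads)  # ranks disagree on whether `t` was read
print("hoisted:", hoisted_reads)  # every rank reads `t` exactly once
```

If reading the simulation time requires every process to take part, the guarded ordering would leave the processes that hold data waiting for the ones that skipped the read. That is an assumption, not something verified in the PyNN or NEST sources.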

apdavison commented 1 year ago

probably related to #657

RCagnol commented 1 year ago

The issue seems to be caused by the fact that only some of the MPI processes try to access self._simulator.state.t. Indeed, NEST appears to freeze when only some MPI processes access nest.biological_time.

For example, the following NEST code freezes when run with mpirun -n 2:

from mpi4py import MPI
mpi_comm = MPI.COMM_WORLD
import nest
if mpi_comm.rank == 0:
    nest_time = nest.biological_time
    print(nest_time)
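The suspected mechanism can be imitated without NEST or MPI. The sketch below is only an analogy: it assumes that reading the time behaves like a collective operation that every process must enter, and models that with a threading.Barrier shared by two threads standing in for two ranks. The names run and rank_main are made up for the illustration, and a timeout replaces the indefinite hang so that the script terminates.

```python
import threading

N_RANKS = 2  # mirrors `mpirun -n 2`


def run(ranks_that_read, timeout=0.5):
    """Report what happens when only `ranks_that_read` enter the collective."""
    barrier = threading.Barrier(N_RANKS)  # stands in for a collective call
    outcome = {}

    def rank_main(rank):
        if rank in ranks_that_read:
            try:
                # Returns only once all N_RANKS threads have arrived.
                barrier.wait(timeout=timeout)
                outcome[rank] = "ok"
            except threading.BrokenBarrierError:
                # Without the timeout this rank would wait forever.
                outcome[rank] = "stalled"
        else:
            outcome[rank] = "skipped"

    threads = [threading.Thread(target=rank_main, args=(r,))
               for r in range(N_RANKS)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return outcome


only_rank0 = run({0})     # like reading the time inside `if rank == 0:`
all_ranks = run({0, 1})   # like reading the time on every rank

print("only rank 0 reads:", only_rank0)
print("all ranks read:   ", all_ranks)
```

Under the same assumption, the corresponding change to the NEST snippet would be to read nest.biological_time on every rank and keep only the print inside the rank check. This has not been tested here.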

NeuralEnsemble / PyNN

The simulations freeze when the number of neurons in a population is lower than the number of MPI processes #767