dbbs-lab / bsb-core

The Brain Scaffold Builder
https://bsb.readthedocs.io
GNU General Public License v3.0

Stochastic duplication of VoxelIntersection data, on GitHub Actions #820

Closed Helveg closed 2 months ago

Helveg commented 5 months ago

I can't reproduce this locally, but this error started appearing during the TTY work:

======================================================================
FAIL: test_single_voxel (test_connectivity.TestVoxelIntersection)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/bsb-core/bsb-core/tests/test_connectivity.py", line 562, in test_single_voxel
    self.assertEqual(2, len(pre_locs), "expected 2 connections")
AssertionError: 2 != 4 : expected 2 connections

----------------------------------------------------------------------

Similarly, test_cells_placed sometimes places 39 cells instead of 40.

Helveg commented 5 months ago

Similar one, throwing it in here:

======================================================================
FAIL: test_multi_indegree (test_connectivity.TestFixedIndegree)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/bsb-core/bsb-core/tests/test_connectivity.py", line 694, in test_multi_indegree
    self.assertTrue(np.all(total == 50), "Not all cells have indegree 50")
AssertionError: False is not true : Not all cells have indegree 50

----------------------------------------------------------------------
Ran 374 tests in 30.225s

FAILED (failures=1, skipped=21)
ok
test_unequal_len (test_voxels.TestVoxelSet) ... ok
test_volume (test_voxels.TestVoxelSet) ... ok
test_weird_usage (test_voxels.TestVoxelSet) ... ok

======================================================================
FAIL: test_multi_indegree (test_connectivity.TestFixedIndegree)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/bsb-core/bsb-core/tests/test_connectivity.py", line 694, in test_multi_indegree
    self.assertTrue(np.all(total == 50), "Not all cells have indegree 50")
AssertionError: False is not true : Not all cells have indegree 50

----------------------------------------------------------------------
Ran 374 tests in 30.493s

FAILED (failures=1, skipped=21)
--------------------------------------------------------------------------

They all seem to involve some sort of floating-point error, perhaps?

drodarie commented 3 months ago

I tried forcing the dtype of the NumPy arrays to int in test_multi_indegree, but that did not seem to resolve the issue... I will have to dig into this further.
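To tell rounding apart from other causes, it may help to report which indegree values actually occur, rather than only asserting that they all equal 50. A minimal sketch, assuming `total` is the 1D per-cell indegree array from the test; the helper name `describe_indegree` and the example data are hypothetical and not part of bsb-core:

```python
import numpy as np


def describe_indegree(total, expected=50):
    """Return {indegree value: number of cells} for cells that deviate.

    `total` is assumed to be a 1D array of per-cell indegrees.
    """
    total = np.asarray(total)
    # Keep only the cells whose indegree differs from the expected value.
    deviating = total[total != expected]
    values, counts = np.unique(deviating, return_counts=True)
    return {v.item(): int(c) for v, c in zip(values, counts)}


# Made-up example data: two cells received their connections twice.
total = np.array([50, 50, 100, 50, 100])
print(describe_indegree(total))
```

If the deviating values turn out to be whole multiples of 50, that would point toward repeated work rather than a floating-point problem.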

drodarie commented 3 months ago

It seems that the issue only happens with MPI. I managed to reproduce it locally with the following command, although the failure is intermittent:

mpiexec -n 2 python -m unittest test_connectivity.TestFixedIndegree.test_multi_indegree

In this scenario, the jobs of multi_indegree appear to be duplicated for some reason. I tested this by printing logs within the connect function: it is called twice for each postsynaptic chunk for this strategy. Surprisingly, the jobs of the indegree strategy do not seem to be duplicated.

I also tried logging the jobs that are put in the pool for the multi_indegree strategy in InvertedRoI.queue, but doing so seems to make the issue disappear... Could it be a synchronization issue?
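Since printing logs seems to change the behaviour, one alternative is to count calls in memory and inspect the counts after the run. The sketch below is a generic decorator, not bsb-core code: `count_calls`, the stand-in `connect` function, and the chunk tuples are all hypothetical, and whether in-memory counting perturbs the timing less than logging is an assumption.

```python
from collections import Counter
from functools import wraps


def count_calls(key):
    """Decorator factory: count how often the wrapped function is called per key.

    `key` maps the call's arguments to a hashable value, e.g. the chunk.
    """
    def decorator(func):
        calls = Counter()

        @wraps(func)
        def wrapper(*args, **kwargs):
            calls[key(*args, **kwargs)] += 1
            return func(*args, **kwargs)

        # Expose the raw counts and a helper listing keys seen more than once.
        wrapper.calls = calls
        wrapper.duplicates = lambda: {k: n for k, n in calls.items() if n > 1}
        return wrapper
    return decorator


# Hypothetical stand-in for a strategy's connect function.
@count_calls(key=lambda pre, post: post)
def connect(pre, post):
    return (pre, post)


# Simulate one postsynaptic chunk being processed twice.
for chunk in [(0, 0, 0), (0, 0, 1), (0, 0, 0)]:
    connect("pre", chunk)

print(connect.duplicates())
```

Under MPI each rank keeps its own counter, so the per-rank counts would still have to be combined (for example with mpi4py's `comm.gather`) to see whether the same chunk was handled on more than one rank.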