espressomd / espresso

The ESPResSo package
https://espressomd.org
GNU General Public License v3.0

Virtual sites relative test case times out at random #4639

Open espresso-ci opened 1 year ago

espresso-ci commented 1 year ago

https://gitlab.icp.uni-stuttgart.de/espressomd/espresso/pipelines/19197

jngrad commented 1 year ago

The virtual_sites_relative.py test case times out at random. We recently increased the verbosity of the test to check which function was responsible for the timeout; however, the timeout does not always occur in the same function.

Here it happens after the test_pos_vel_forces teardown and before the test_vs_exceptions setup:

 82/186 Test  #94: virtual_sites_relative ........................................***Timeout 300.01 sec
test_aa_method_switching (__main__.VirtualSites) ... ok
test_lj (__main__.VirtualSites)
Run LJ fluid test for different cell systems. ... ok
test_pos_vel_forces (__main__.VirtualSites) ... ok

Here it happens right when test_lj starts, before the class can print the method docstring:

 81/185 Test  #94: virtual_sites_relative ........................................***Timeout 300.01 sec
test_aa_method_switching (__main__.VirtualSites) ... ok
test_lj (__main__.VirtualSites)

jngrad commented 1 year ago

New data point:

 82/185 Test  #94: virtual_sites_relative ........................................***Timeout 300.01 sec
test_aa_method_switching (__main__.VirtualSites) ... setUp(Fri Dec 23 20:01:26 2022) tearDown(Fri Dec 23 20:01:26 2022) ok
test_lj (__main__.VirtualSites)

The deadlock happens before entering the setUp method of the LJ test.
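
For readers who want to reproduce this kind of trace, the timestamps in the log above could be produced by something like the following. This is only a minimal sketch, not the actual ESPResSo test code: the class names `TimestampedTestCase` and `Example` and the helper `run_example` are hypothetical. It relies on the verbose unittest runner printing the test name to stderr before calling `setUp`, so markers written without a trailing newline end up on the same line.

```python
import sys
import time
import unittest


class TimestampedTestCase(unittest.TestCase):
    """Hypothetical base class that logs when setUp/tearDown are entered."""

    def setUp(self):
        # No trailing newline: the marker stays on the line that the verbose
        # runner started with "test_name (module.Class) ... ".
        sys.stderr.write(f"setUp({time.ctime()}) ")
        sys.stderr.flush()

    def tearDown(self):
        sys.stderr.write(f"tearDown({time.ctime()}) ")
        sys.stderr.flush()


class Example(TimestampedTestCase):
    """Hypothetical stand-in for a real test class."""

    def test_nothing(self):
        pass


def run_example():
    """Run the example with verbosity=2, as a verbose CI run would."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(Example)
    runner = unittest.TextTestRunner(verbosity=2)
    return runner.run(suite)
```

If a test name is printed with no `setUp(...)` marker after it, as for test_lj above, the process stalled after the runner announced the test but before the `setUp` body ran.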

Also, the bug seems more likely to happen when both the default and maxset CI jobs are running simultaneously on the coyote11 runner.
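
One possible way to narrow down where the process hangs is to arm a watchdog from the Python standard library's `faulthandler` module, which dumps the traceback of every thread if the process is still running after a given delay. This is a suggestion, not something that was tried in this thread. The helper names `arm_watchdog` and `disarm_watchdog` and the 240 s value are assumptions; the delay only needs to be shorter than the 300 s CTest timeout seen in the logs.

```python
import faulthandler
import sys

# Assumed value: shorter than the 300 s CTest timeout reported above.
WATCHDOG_SECONDS = 240


def arm_watchdog(timeout=WATCHDOG_SECONDS, file=None):
    """Dump all thread tracebacks if the process is still alive after `timeout` seconds."""
    target = file if file is not None else sys.stderr
    # exit=False: only report, and let CTest enforce the actual timeout.
    faulthandler.dump_traceback_later(timeout, exit=False, file=target)


def disarm_watchdog():
    """Cancel the pending dump, e.g. once the test module finished normally."""
    faulthandler.cancel_dump_traceback_later()
```

Calling `arm_watchdog()` at the top of the test script and `disarm_watchdog()` at the end would be one way to use it. Note that the dump only shows Python frames, so a hang inside compiled code would appear as the Python call that entered it.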