MDAnalysis / mdanalysis

MDAnalysis is a Python library to analyze molecular dynamics simulations.
https://mdanalysis.org
Other
1.32k stars 652 forks source link

HydrogenBondAnalysis log warn test fails with more than 4 workers #3543

Open jbarnoud opened 2 years ago

jbarnoud commented 2 years ago

Expected behavior

All tests pass on all supported platforms.

Actual behavior

The CI cron job for the ARM platform fails with the following error:

____________ TestHydrogenBondAnalysisIdeal.test_logging_step_not_1 _____________

[gw5] linux -- Python 3.8.3 /home/travis/virtualenv/python3.8.3/bin/python

self = <MDAnalysisTests.analysis.test_hydrogenbonds_analysis.TestHydrogenBondAnalysisIdeal object at 0xffff83564820>

universe = <Universe with 6 atoms>

caplog = <_pytest.logging.LogCaptureFixture object at 0xffff8347d160>

    def test_logging_step_not_1(self, universe, caplog):

        hbonds = HydrogenBondAnalysis(universe, **self.kwargs)

        # using step 2

        hbonds.run(step=2)

        caplog.set_level(logging.WARNING)

        hbonds.lifetime(tau_max=2, intermittency=1)

        warning = \

            "Autocorrelation: Hydrogen bonds were computed with step > 1."

>       assert any(warning in rec.getMessage() for rec in caplog.records)

E       assert False

E        +  where False = any(<generator object TestHydrogenBondAnalysisIdeal.test_logging_step_not_1.<locals>.<genexpr> at 0xffff85d8cf20>)

/home/travis/build/MDAnalysis/mdanalysis/testsuite/MDAnalysisTests/analysis/test_hydrogenbonds_analysis.py:251: AssertionError

See the full report on https://app.travis-ci.com/github/MDAnalysis/mdanalysis/jobs/560854076

Current version of MDAnalysis

https://app.travis-ci.com/github/MDAnalysis/mdanalysis/jobs/560854076

IAlibay commented 2 years ago

🤔 interesting - I know the armv8 runner tends to error out but it's the first time I've seen a proper failure

jbarnoud commented 2 years ago

It did error earlier today and stalled, making it impossible to know what test failed. I restarted it and it went through with the error reported above.

It seems that the job failed and stalled already last week. Before that it would sometimes stall but not fail.

IAlibay commented 2 years ago

Ah fair I did just restarted CI just to check how reproducible this is. The runner stalling is something that's a bit out of our hands, it's the only place we can get that runner, and it's not very stable :(

Is there a reason we use logging in the HBA code? Especially for that code path we don't even bother to throw a more traditional warning and instead opt for logging.warning.

IAlibay commented 2 years ago

Ok yeah - so now I know why this particular test is so familiar.

Turns out I also encountered this last week when doing something else but couldn't narrow down the issue (particularly since @fiona-naughton couldn't reproduce it if I remember correctly).

The issue comes when calling pytest with 8 threads (not tried more) at the top of MDAnalysisTests (in the same way that the arm64 runner does).

It's some kind of race condition, but I can't work out what at this moment, possibly the fixture scope? It might be releated to: https://github.com/pytest-dev/pytest-xdist/issues/402

IAlibay commented 2 years ago

I have updated the title of this issue to reflect the current situations - this is a blanket failure across multiple arch and OS types. Pretty much > 4 workers will trigger this failure.

Question - are we ok just switching out of logging.warn to just a direct warning instead here? We don't use logger stuff much in MDA and it seems somewhat out of place.

IAlibay commented 2 years ago

Weirdly enough the ARMv8 runner isn't failing anymore (as far as I can tell), but I can still replicate this issue locally. Would be good to know if anyone else gets this on a fresh install.