Problem
Our tests contain examples of parametrizing over different datasets by calling the data loader function directly in the pytest.mark.parametrize decorator, such as https://github.com/alchemistry/alchemlyb/blob/72622c99c16c74efd1f8517cb15ef21e16d6ebda/src/alchemlyb/tests/test_preprocessing.py#L66-L69
The data loader functions gmx_benzene_dHdl() and gmx_benzene_u_nk() are called at the time when pytest collects tests. This slows down the test setup phase substantially (collection is done in serial) and it also has the potential to fill up memory.
Solution
Make sure that the loader function is only evaluated inside the test.
Use pytest's getfixturevalue
Use request.getfixturevalue(fixture_name) to dynamically run the fixture function named fixture_name.
Should probably look like:
Note that in the example above, dataloader is the name of a pytest fixture and not an ordinary function (as currently implemented in our tests).
Current hacky solution in alchemlyb
Instead, only pass the function objects to a parametrized fixture and then evaluate them inside the parametrized fixture itself, as shown, for example, in https://github.com/alchemistry/alchemlyb/blob/72622c99c16c74efd1f8517cb15ef21e16d6ebda/src/alchemlyb/tests/test_fep_estimators.py#L151-L168
This approach ensures that data is loaded only when needed, and the loading can be done in parallel.
TODO