paulray opened this issue 3 years ago
Hey @paulray, I will see if I can make those a bit faster by the end of this week; sorry they are so slow. I think I just grabbed a par and tim file that were already accessible in the PINT repo and didn't look too much into the speed or number of TOAs.
I simultaneously filed bug #869 with a load of suggestions for speeding up the tests.
I can't speak to any of these specifically, but many of our tests load a full .tim file when five TOAs in a string would do, and that would be clearer because the TOAs would be visible right where the test is. Maybe I should see if I can supply better tools for this kind of thing?
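For instance, a minimal sketch of the in-string approach (the TOA lines here are made up, and I am assuming `get_TOAs` accepts a file-like object; if it does not, writing the string to `tmp_path` works just as well):

```python
from io import StringIO

from pint.toa import get_TOAs

# Five fabricated TOAs in Tempo2 format: name, frequency (MHz), MJD, error (us), site.
tim = """FORMAT 1
fake 1400 55000.0 1.0 gbt
fake 1400 55100.0 1.0 gbt
fake 1400 55200.0 1.0 gbt
fake 1400 55300.0 1.0 gbt
fake 1400 55400.0 1.0 gbt
"""


def test_loads_inline_toas():
    # Assumes get_TOAs takes a file-like object as its first argument.
    toas = get_TOAs(StringIO(tim), ephem="DE421")
    assert toas.ntoas == 5
```

The TOAs sit right next to the assertion, which is the clarity win described above.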
Astropy has a mechanism where you can flag tests as "slow" and then they won't be run unless requested. Perhaps we could establish a workflow of "grab a slow test, write a fast one that you think tests the same thing, then flag the original one as slow and keep it around just in case"?
I think the slow flag is a good idea. A perfect example is `test_event_optimize`, which is now the third-slowest call. It runs an MCMC, and there's really no way to avoid that if you want to test an MCMC piece of code. But it's reasonable not to run it every time.
Practically, I don't know how this would work with CI, though. Perhaps the slow tests get invoked manually, once a PR has "settled" and there are no new commits? And then locally we have a `make test` and a `make slow-test` or something. (Or maybe `make test` should run everything and `make fast-test` skips the slowies.)
That sounds like a great idea. Right now certain tests are skipped if the environment doesn't support them, so this could probably be easy to implement with an environment variable setting or something, e.g. `RUN_SLOW_TESTS=1`.
It is given as an example in the pytest documentation here: https://docs.pytest.org/en/stable/example/simple.html. You'd run it as `pytest -m slow`, and we could set this up as appropriate in the Makefile and/or CI.
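For reference, a minimal `conftest.py` sketch along the lines of that pytest docs example (illustrative only, not something already in PINT; the skip condition could just as easily key off the `RUN_SLOW_TESTS` environment variable mentioned above):

```python
# conftest.py
import pytest


def pytest_addoption(parser):
    parser.addoption(
        "--runslow", action="store_true", default=False, help="run tests marked slow"
    )


def pytest_configure(config):
    # Register the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "slow: mark test as slow to run")


def pytest_collection_modifyitems(config, items):
    if config.getoption("--runslow"):
        return
    skip_slow = pytest.mark.skip(reason="need --runslow option to run")
    for item in items:
        if "slow" in item.keywords:
            item.add_marker(skip_slow)
```

Individual tests then get decorated with `@pytest.mark.slow`; plain `pytest` skips them and `pytest --runslow` runs everything. (The `pytest -m slow` marker-expression route works too, in which case you only need the marker registration, not the skip hook.)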
Astropy was, when we were using Travis, set up to run the slow tests no more than once a day, as a cron job. There are options.
Additional suggestions from #869 (so I can close it):
Currently the PINT test suite takes many minutes to run. While it is possible to run individual test files with `pytest tests/test_thing.py` and individual tests with `pytest tests/test_thing.py::test_thing_works`, running all the tests takes so long that it disrupts development and discourages running the whole test suite.

There are tools for improving this:

- `hypothesis`
- `make_fake_toas`, which can be used to generate small test sets (see the sketch below)
- `pytest --durations=N` to identify slow tests
- `usepickle=True` so that loaded TOAs are cached
- `pytest.mark.parametrize` where this is excessive
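On the `make_fake_toas` point, a rough sketch (hedged: this helper has moved around between PINT versions; in recent versions I believe it lives in `pint.simulation` as `make_fake_toas_uniform`):

```python
import astropy.units as u

from pint.models import get_model
from pint.simulation import make_fake_toas_uniform  # exact name/location varies by PINT version


def small_test_setup(parfile):
    # Ten simulated TOAs spanning ~2 years is enough for many fitter tests
    # and is far cheaper than loading a multi-thousand-TOA .tim file.
    model = get_model(parfile)
    toas = make_fake_toas_uniform(55000, 55730, 10, model, error=1 * u.us)
    return model, toas
```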
RE fixtures: I think a huge amount of test time is being spent setting up a fit and then testing something with it (e.g., dmxparse). This could certainly be improved in many cases by using fake TOAs rather than large datasets. But grouping those tests where possible and having them use the same fit object would also make a huge improvement. Could this be done without losing clarity, i.e. by keeping the tests grouped into sensible files?
It can be done, with some care. The recommended way to handle this sort of thing is to create a function annotated as a fixture:
```python
import pytest

from pint.fitter import WLSFitter
from pint.models import get_model_and_toas


@pytest.fixture
def fitter_with_stuff():
    # Built fresh for each test that requests it (see the scope note below).
    m, t = get_model_and_toas(...)
    f = WLSFitter(t, m)
    f.fit_toas()
    return f
```
Then this is used by individual tests by passing its name as an argument:

```python
def test_stuff(fitter_with_stuff):
    assert fitter_with_stuff.stuff == "specific"
```
If you do it this way, `fitter_with_stuff` will be re-run for each test that uses it. If you use `@pytest.fixture(scope="module")`, the fixture will be run only once for the whole module. (Expect pain and suffering if any test modifies the returned object.)
That would definitely be a step in the right direction!
I agree RE pain and suffering about modifying the object. What about pickling the Fitter after `f.fit_toas()`, and having the fixture return a `fitter_pickle_file` for the test to then load and unpickle? And perhaps some support functions to handle that loading and unpickling for specific flavors of fitter: `load_cached_b1821_fitter`, `load_cached_j0030_wideband_fitter`, etc.
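A rough sketch of what that could look like (hypothetical, not existing PINT test code; the par/tim filenames and the final assertion are placeholders):

```python
import pickle

import pytest

from pint.fitter import WLSFitter
from pint.models import get_model_and_toas


@pytest.fixture(scope="module")
def fitter_pickle_file(tmp_path_factory):
    # Fit once per module, then hand tests a pickle file rather than the
    # (mutable) Fitter object itself.
    m, t = get_model_and_toas("B1821-24.par", "B1821-24.tim")  # placeholder files
    f = WLSFitter(t, m)
    f.fit_toas()
    path = tmp_path_factory.mktemp("fitters") / "b1821_fitter.pickle"
    path.write_bytes(pickle.dumps(f))
    return path


def test_something(fitter_pickle_file):
    # Each test unpickles its own copy, so modifications cannot leak between tests.
    f = pickle.loads(fitter_pickle_file.read_bytes())
    assert f.resids.chi2 > 0  # placeholder assertion
```

The named helpers (`load_cached_b1821_fitter`, etc.) could then be thin wrappers around the unpickling step.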
You can run the test suite as:

```
$ python -m cProfile -o profile $(which pytest)
```

and then the program `snakeviz` (`pip install snakeviz`) allows you to visualize the profiling results, so you can figure out where PINT is spending all its time.
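To spell out that second step (`profile` here is just the output filename chosen above):

```
$ pip install snakeviz
$ snakeviz profile
```

`snakeviz` opens the profile in a browser as an interactive view of where the time goes.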
When using `make test`, it now reports the slowest 20 tests. This is important since our CI resources are limited and it is taking a really long time for Travis to run, so we need to do some work to speed up the tests where possible.

@bshapiroalbert, can you see about the `test_ftest` ones? Those are the two slowest tests. If they can be reworked to use much smaller numbers of TOAs, that would be great.