rs-station / reciprocalspaceship

Tools for exploring reciprocal space
https://rs-station.github.io/reciprocalspaceship/
MIT License

Speed up `DataSet.expand_to_p1()` by removing for loop #152

Closed: JBGreisman closed 2 years ago

JBGreisman commented 2 years ago

The slowest tests in our CI suite are in `test_dataset_grid.py`, and their runtime is dominated by calls to `DataSet.expand_to_p1()`. Running pytest with `--durations=5` shows that the worst offenders in that file are:

6.24s call     test_dataset_grid.py::test_to_reciprocalgrid_complex[/github/reciprocalspaceship/tests/data/fmodel/4CY9.mtz-False-5]
6.02s call     test_dataset_grid.py::test_to_reciprocalgrid_complex[/github/reciprocalspaceship/tests/data/fmodel/4CY9.mtz-False-3]
5.98s call     test_dataset_grid.py::test_to_reciprocalgrid_complex[/github/reciprocalspaceship/tests/data/fmodel/4CY9.mtz-False-2.5]
5.80s call     test_dataset_grid.py::test_to_reciprocalgrid_float[/github/reciprocalspaceship/tests/data/fmodel/4CY9.mtz-False-5]
5.77s call     test_dataset_grid.py::test_to_reciprocalgrid_complex[/github/reciprocalspaceship/tests/data/fmodel/6I9P.mtz-False-5]

These particular tests involve high-symmetry spacegroups, which are the worst case for `expand_to_p1()` because its for loop runs once per symmetry operation.

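This PR removes that for loop and instead makes a single call to `DataSet.hkl_to_observed()` within `DataSet.expand_to_p1()`. This speeds up the tests by ~10x (for example, `test_to_reciprocalgrid_complex` on 4CY9.mtz with parameters False-5 drops from 6.24s to 0.64s). Here are the new slowest tests in that file: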

0.68s call     test_dataset_grid.py::test_to_reciprocalgrid_complex[/github/reciprocalspaceship/tests/data/fmodel/1OEL.mtz-True-5]
0.64s call     test_dataset_grid.py::test_to_reciprocalgrid_complex[/github/reciprocalspaceship/tests/data/fmodel/4CY9.mtz-False-5]
0.60s call     test_dataset_grid.py::test_to_reciprocalgrid_complex[/github/reciprocalspaceship/tests/data/fmodel/1OEL.mtz-False-5]
0.58s call     test_dataset_grid.py::test_to_reciprocalgrid_complex[/github/reciprocalspaceship/tests/data/fmodel/1OEL.mtz-False-2.5]
0.57s call     test_dataset_grid.py::test_to_reciprocalgrid_complex[/github/reciprocalspaceship/tests/data/fmodel/4I6Y.mtz-False-5]
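
For readers unfamiliar with why a per-operation loop gets expensive, here is a minimal NumPy sketch of the general idea: replacing a Python-level loop over symmetry operations with one batched array operation. It is not the code in this PR and does not use `DataSet.hkl_to_observed()`. The function names `expand_loop` and `expand_vectorized`, the hard-coded rotation matrices (point group 222), and the row-vector convention for transforming Miller indices are all assumptions made for illustration. Phase shifts and the removal of duplicate reflections are left out.

```python
import numpy as np

# Hypothetical stand-in for a spacegroup's symmetry operations: the rotation
# parts of point group 222 (identity plus three two-fold axes).
ROTATIONS = np.array(
    [
        [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
        [[-1, 0, 0], [0, -1, 0], [0, 0, 1]],
        [[-1, 0, 0], [0, 1, 0], [0, 0, -1]],
        [[1, 0, 0], [0, -1, 0], [0, 0, -1]],
    ]
)


def expand_loop(hkl, rotations):
    """Apply each operation in turn: one Python-level iteration per operation."""
    expanded = [hkl @ rot for rot in rotations]
    return np.concatenate(expanded, axis=0)


def expand_vectorized(hkl, rotations):
    """Apply all operations in a single einsum call.

    out[o, n, j] = sum_i hkl[n, i] * rotations[o, i, j]
    The result is flattened to shape (n_ops * n_reflections, 3).
    """
    out = np.einsum("ni,oij->onj", hkl, rotations)
    return out.reshape(-1, 3)


# Three made-up Miller indices (assumed convention: one row vector per reflection)
hkl = np.array([[1, 2, 3], [0, 1, 4], [2, 0, 5]])

loop_result = expand_loop(hkl, ROTATIONS)
vec_result = expand_vectorized(hkl, ROTATIONS)

# Both approaches produce the same rows in the same order
assert np.array_equal(loop_result, vec_result)
```

With only four operations the difference is negligible. The loop version pays its per-iteration overhead once for every symmetry operation, so the batched form matters more for high-symmetry spacegroups.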