MDAnalysis / mdanalysis

MDAnalysis is a Python library to analyze molecular dynamics simulations.
https://mdanalysis.org
Other
1.29k stars 647 forks source link

Benchmark for the RMS module fail #3520

Open jbarnoud opened 2 years ago

jbarnoud commented 2 years ago

Expected behavior

Benchmarks run without errors.

Actual behavior

The benchmarks for analysis.rms fail with the following exceptions:

For parameters: 100, [1.0, 0.5], False, False
Traceback (most recent call last):
File "/home/benchy/mambaforge/lib/python3.9/site-packages/asv/benchmark.py", line 1184, in main_run_server
 main_run(run_args)
File "/home/benchy/mambaforge/lib/python3.9/site-packages/asv/benchmark.py", line 1058, in main_run
 result = benchmark.do_run()
File "/home/benchy/mambaforge/lib/python3.9/site-packages/asv/benchmark.py", line 537, in do_run
 return self.run(*self._current_params)
File "/home/benchy/mambaforge/lib/python3.9/site-packages/asv/benchmark.py", line 627, in run
 samples, number = self.benchmark_timing(timer, min_repeat, max_repeat,
File "/home/benchy/mambaforge/lib/python3.9/site-packages/asv/benchmark.py", line 694, in benchmark_timing
 timing = timer.timeit(number)
File "/home/benchy/mambaforge/lib/python3.9/timeit.py", line 177, in timeit
 timing = self.inner(it, self.timer)
File "<timeit-src>", line 6, in inner
File "/home/benchy/mambaforge/lib/python3.9/site-packages/asv/benchmark.py", line 599, in <lambda>
 func = lambda: self.func(*param)
File "/home/benchy/mdanalysis/benchmarks/benchmarks/analysis/rms.py", line 46, in time_rmsd
 rms.rmsd(a=self.A,
File "/home/benchy/mambaforge/lib/python3.9/site-packages/MDAnalysis/analysis/rms.py", line 260, in rmsd
 raise ValueError('weights must have same length as a and b')
ValueError: weights must have same length as a and b

For parameters: 100, [1.0, 0.5], False, True
Traceback (most recent call last):
File "/home/benchy/mambaforge/lib/python3.9/site-packages/asv/benchmark.py", line 1184, in main_run_server
 main_run(run_args)
File "/home/benchy/mambaforge/lib/python3.9/site-packages/asv/benchmark.py", line 1058, in main_run
 result = benchmark.do_run()
File "/home/benchy/mambaforge/lib/python3.9/site-packages/asv/benchmark.py", line 537, in do_run
 return self.run(*self._current_params)
File "/home/benchy/mambaforge/lib/python3.9/site-packages/asv/benchmark.py", line 627, in run
 samples, number = self.benchmark_timing(timer, min_repeat, max_repeat,
File "/home/benchy/mambaforge/lib/python3.9/site-packages/asv/benchmark.py", line 694, in benchmark_timing
 timing = timer.timeit(number)
File "/home/benchy/mambaforge/lib/python3.9/timeit.py", line 177, in timeit
 timing = self.inner(it, self.timer)
File "<timeit-src>", line 6, in inner
File "/home/benchy/mambaforge/lib/python3.9/site-packages/asv/benchmark.py", line 599, in <lambda>
 func = lambda: self.func(*param)
File "/home/benchy/mdanalysis/benchmarks/benchmarks/analysis/rms.py", line 46, in time_rmsd
 rms.rmsd(a=self.A,
File "/home/benchy/mambaforge/lib/python3.9/site-packages/MDAnalysis/analysis/rms.py", line 255, in rmsd
 a = a - np.average(a, axis=0, weights=weights)
File "<__array_function__ internals>", line 180, in average
File "/home/benchy/mambaforge/lib/python3.9/site-packages/numpy/lib/function_base.py", line 515, in average
 raise ValueError(
ValueError: Length of weights not compatible with specified axis.
AnirG commented 2 years ago

Hey @jbarnoud! Weighted RMSD calculations require weight values for all atoms defined in the list, the way it is implemented. But in the benchmarking test, the parameter of the weights list hasn't been defined correctly corresponding to the number of atoms (contains only 2 values). wrong_weights_list Suggestions:

  1. Don't benchmark with weighted RMSD, as functionally it doesn't affect the benchmark much.
  2. Use a random list that adds up to one which contains num_atoms elements for corresponding num_atoms. Please guide me as I'm new to this stuff. I can take up this issue.
orbeckst commented 2 years ago

For https://github.com/MDAnalysis/mdanalysis/blob/55fa72f34f50294f088fe0a43b75eda792d3d366/benchmarks/benchmarks/analysis/rms.py#L17-L24 I suggest we change weights to be either True or None. We then let setup() https://github.com/MDAnalysis/mdanalysis/blob/55fa72f34f50294f088fe0a43b75eda792d3d366/benchmarks/benchmarks/analysis/rms.py#L26 either set self.weights to None or, for weights==True, to the weights computed from the masses according to the relative weight formula in the rmsd() docs

self.weights = self.A.masses() / np.sum(self.A.masses())

Then the actual call to https://github.com/MDAnalysis/mdanalysis/blob/55fa72f34f50294f088fe0a43b75eda792d3d366/benchmarks/benchmarks/analysis/rms.py#L46-L50 in time_rmsd() can be simply

        rms.rmsd(a=self.A,
                 b=self.B,
                 weights=self.weights,
                 center=center,
                 superposition=superposition)