openmm / pdbfixer

PDBFixer fixes problems in PDB files

Optimizations to _findNearestDistance() #268

Closed: peastman closed this 1 year ago

peastman commented 1 year ago

Fixes #264. This uses the _CellList class to find the neighbors of an atom in constant time. When working with large models, this produces a major speedup.
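For readers unfamiliar with the technique, here is a minimal sketch of a cell list. It is not the `_CellList` class from PDBFixer: the class name `SimpleCellList`, its constructor arguments, and the `neighbors` method are all hypothetical, and periodic boundaries are ignored. Atoms are binned into cubic cells whose edge equals the cutoff, so a neighbor query only has to scan the 27 surrounding cells instead of every atom. When atom density is bounded, that is roughly constant work per atom.

```python
import math
from collections import defaultdict
from itertools import product


class SimpleCellList:
    """Hypothetical, non-periodic cell list (not PDBFixer's _CellList)."""

    def __init__(self, positions, cell_size):
        self.positions = positions
        self.cell_size = cell_size
        # Map each integer cell coordinate to the indices of the atoms inside it.
        self.cells = defaultdict(list)
        for index, pos in enumerate(positions):
            self.cells[self._cell_of(pos)].append(index)

    def _cell_of(self, pos):
        # Integer cell coordinates of a 3D position.
        return tuple(int(math.floor(c / self.cell_size)) for c in pos)

    def neighbors(self, index):
        """Yield (other_index, distance) for atoms within cell_size of atom `index`."""
        pos = self.positions[index]
        cx, cy, cz = self._cell_of(pos)
        # Only the atom's own cell and the 26 adjacent cells can hold neighbors.
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            for other in self.cells.get((cx + dx, cy + dy, cz + dz), ()):
                if other == index:
                    continue
                distance = math.dist(pos, self.positions[other])
                if distance <= self.cell_size:
                    yield other, distance


# Usage: positions in nm, cutoff of 0.3 nm (illustrative values only).
positions = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (1.0, 1.0, 1.0)]
cell_list = SimpleCellList(positions, 0.3)
close_to_first = sorted(i for i, _ in cell_list.neighbors(0))
```

Building the list is a single pass over the atoms, which is why the cost of a full nearest-pair search grows roughly linearly with model size rather than quadratically.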

I also made one other change. The function stops looping once the nearest pair of atoms is at least 0.13 nm apart. Previously, it ignored pairs of atoms within the same residue, but it didn't ignore pairs in different residues that were bonded to each other (e.g. peptide bonds). Since bonded atoms may reasonably be close together, that could lead it to perform more iterations than necessary. It now ignores all bonded pairs when finding the nearest distance.
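To illustrate the bonded-pair exclusion, here is a small sketch. The function name `nearest_nonbonded_distance` and its arguments are hypothetical and do not match the signature of `_findNearestDistance()`. It uses a brute-force loop for clarity, not the cell list, and only shows how skipping bonded pairs changes the reported nearest distance.

```python
import math
from itertools import combinations


def nearest_nonbonded_distance(positions, bonds):
    """Return the smallest distance between any two atoms that are not bonded.

    positions: list of (x, y, z) tuples in nm.
    bonds: iterable of (i, j) atom-index pairs (hypothetical input format).
    """
    bonded = {frozenset(bond) for bond in bonds}
    best = math.inf
    for i, j in combinations(range(len(positions)), 2):
        if frozenset((i, j)) in bonded:
            continue  # Bonded atoms are expected to be close; skip them.
        best = min(best, math.dist(positions[i], positions[j]))
    return best


# Atoms 0 and 1 are bonded and only 0.1 nm apart.
positions = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.5, 0.0, 0.0)]
with_exclusion = nearest_nonbonded_distance(positions, bonds=[(0, 1)])
without_exclusion = nearest_nonbonded_distance(positions, bonds=[])

# The 0.13 nm threshold comes from the description above.
would_stop = with_exclusion >= 0.13
```

With the bond excluded, the nearest distance is about 0.4 nm, so a loop with a 0.13 nm stopping threshold would end. Without the exclusion, the bonded pair at 0.1 nm would keep it iterating.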

edwag commented 1 year ago

This code looks good to me, and the behaviour seems correct. It's a little awkward for me to run a controlled benchmark for my use case, but the new implementation is certainly much faster than the old one, by at least an order of magnitude.

I have a couple of thoughts on readability/style which I will leave as comments on the diffs.

peastman commented 1 year ago

Thanks! I pushed changes based on your suggestions.