SciPy-based kd-tree with periodic boundary conditions
Copyright 2012 Patrick Varilly. Released under the scipy license
When running and/or analyzing molecular simulations, finding the closest neighbors of a particles is a very common operation. Usually, Verlet lists and/or cell lists are used for this, but these usually work well when the maximum neighbor distance is fixed and not too large. kd-trees are the general answer to the problem of finding nearest neighbors, and SciPy implements kd-trees quite well. However, molecular simulations are usually run in periodic boxes, not open boxes, so the act of finding "close" neighbors must take the periodic boundary conditions into account.
This module implements two classes, PeriodicKDTree and PeriodicCKDTree, that work as drop-in replacements for scipy.spatial.KDTree and scipy.spatial.cKDTree, respectively (the latter of these is a Cython-optimized version of a kd-tree but with more limited functionality).
For the moment, only the query and query_ball_point methods are implemented. These should see you through most situations in analyzing molecular simulations.
The uwham package relies on NumPy and SciPy to do its work. Make sure these are installed (for Mac OS X users, see note below). Then run the following command inside the uwham directory:
python setup.py install --user
The query_ball_point method in PeriodicCKDTree relies on the extension to cKDTree I recently wrote for SciPy. See https://github.com/scipy/scipy/pull/262
from periodic_kdtree import PeriodicCKDTree import numpy as np
bounds = np.array([30.0, 30.0, -1]) # xy periodic, open along z
n = 10000 data = 30.0 * np.random.randn(n, 3)
T = PeriodicCKDTree(bounds, data)
d, i = T.query([45.0, 10.0, 10.0], k=4)
neighbors = T.query_ball_point([45.0, 10.0, 10.0], r=3.0)
See test_periodic_kdtree.py, benchmark.py and nonperiodic_benchmark.py (based off of Anne M. Archibald's benchmarks)
Simple queries seem to be substantially slower for now, but ball lookups aren't that unacceptably slower.
Sample periodic benchmarks (time in seconds) ++++++++++++++++++++++++++++++++++++++++++++
dimension 3, 10000 points PeriodicKDTree constructed: 0.0793428 PeriodicCKDTree constructed: 0.00343394 PeriodicKDTree 1000 lookups: 14.8779 PeriodicCKDTree 1000 lookups: 1.73712 flat PeriodicCKDTree 1000 lookups: 14.2604 PeriodicKDTree 1000 ball lookups: 39.7997 PeriodicCKDTree 1000 ball lookups: 0.222894 flat PeriodicCKDTree 1000 ball lookups: 1.23763 Ball lookups agree? True
Sample nonperiodic benchmarks +++++++++++++++++++++++++++++ dimension 3, 10000 points KDTree constructed: 0.0820239 cKDTree constructed: 0.00158691 KDTree 1000 lookups: 0.561787 cKDTree 1000 lookups: 0.00447702 flat cKDTree 1000 lookups: 0.335597 KDTree 1000 ball lookups: 5.60061 cKDTree 1000 ball lookups: 0.0196991 flat cKDTree 1000 ball lookups: 0.336687 Ball lookups agree? True
In Mac OS X, the usual automatic downloading of dependencies during a python setup.py install isn't able to successfully install NumPy and SciPy (in my machine, it's the lack of a Fortran compiler, but the SciPy docs point to other potential problems). So you have to download and install NumPy and SciPy manually.
The good news: binaries are easily available The bad news: they only work with the version of Python from http://www.python.org, not the version that ships with OS X!
So you have to download and install three things:
This is far less painful than it sounds. In my own case, the disk images that I ended up downloading were:
CAREFUL: for some of these packages, the "link to the latest version" that SourceForge suggests may be incorrect! Do look at the full list of downloads available and pick the one that is most appropriate to your own setup
Once you've installed python2.7, numpy and scipy, you can run the command
python setup.py install --user