Open GoogleCodeExporter opened 9 years ago
I'll have a go at this. The github link looks good, but the benchmarks for
periodic compared to regular seem slower than they should be, so I might look
at extending the current solution instead.
With the above github link, it says it's licensed under a scipy license. Am I
allowed to copy that solution into MDA and reference it?
Original comment by richardjgowers
on 3 Feb 2014 at 9:24
The scipy license http://www.scipy.org/scipylib/license.html is a BSD-3-clause,
which is compatible with GPL2
http://www.gnu.org/licenses/license-list.html#GPLCompatibleLicenses . We just
have to properly attribute an include the licence file.
Original comment by orbeckst
on 3 Feb 2014 at 9:38
Original comment by orbeckst
on 4 Feb 2014 at 12:15
I've got an experimental branch up for this, feature-periodicKDTree
It's a fairly crude solution, if a box is also given to the function, it copies
the coordinates 1/2 box length in all directions, so you make a KDTree for a
box 8x the size. (Looking any further than 1/2 box length is pointless as you
will have a nearer image in the opposite direction). I keep track of which
coordinates I copied where for later...
You do the regular KD search on the augmented list of coordinates, then when it
get the indices back, if any are higher than the number of coordinates (because
they come from a "false" coordinate"), it translates these back to their proper
index.
Because I'm just making an oversized KDTree, the scaling isn't amazing, it
takes me about 12x as long to build a tree with full periodic boundaries. From
timing I did myself, distance_array is quicker unless you start doing lots of
searches. I was using a system of about 10k particles, maybe it will be better
for larger systems.
To improve on the performance, I added a width keyword. This lets you choose
how much periodic space to consider, in each dimension. (width = np.array([+x,
-x, +y, -y, +z, -z]) )
Eg if you were doing something with a bilayer you could choose to only extend
the KDTree into the Z coordinate, meaning that you would only then have ~2x the
amount of coordinates rather than 8x.
The width keyword works according to distance too, so you can specify that you
want 5 Angstroms in the +Z and -Z direction. Again, I think this would be good
for larger systems, and when you know the length of your search radius when
setting up the Tree.
If someone could do some performance testing with an obnoxiously large system,
I'd be interested to see if this is useful!
I did also find this solution which uses the same base KDTree as MDA.
http://prody.csb.pitt.edu/_modules/prody/kdtree/kdtree.html#KDTree
This method seems to not duplicate the coordinates, but instead run each search
27 times (moving reference coords each time), and pick the minimum result. I'd
think this would be much faster to build, but then possibly slower to use.
Original comment by richardjgowers
on 8 Feb 2014 at 2:44
The test trajectory MDAnalysis.tests.datafiles.GRO and
MDAnalysis.tests.datafiles.XTC contain 47681 atoms. Try those for a start.
An alternative to KDTree would be to just do a grid search using minimum image.
There's actually some code in MDAnalysis.analysis.gnm.generate_grid() that
demonstrates the idea of grid searching, which could be used as a starting
point. Grid search seems conceptually simpler than KDTrees.
Original comment by orbeckst
on 10 Feb 2014 at 4:48
Original issue reported on code.google.com by
orbeckst
on 1 May 2013 at 12:31