pele-python / pele

Python energy landscape explorer
Other
95 stars 41 forks source link

refactor CellIter and CellListPotential #98

Closed js850 closed 9 years ago

js850 commented 10 years ago

changes to CellIter

  1. add a lot of documentation to CellIter
  2. remove unused functions and member variables
  3. coords is no longer passed to the constructor of CellIter. I had to update all the dependencies because of this. This means that things like natoms can only be defined the first time reset(coords) is called. This is probably the first time get_energy() is called.

I also refactored CellListsPotential

  1. Add a new class PairIteratorPotential which is simply a generalization of CellListsPotential
  2. CellListsPotential is now just a wrapper for PairIteratorPotential

In the future we should reformat some of the other potentials like NeighborListPotential to use PairIteratorPotential

js850 commented 10 years ago

@smcantab @kjs73 I added a CellLists potential for LJCut and added a test in python which just compares it to the simple version of LJCut. It was failing sporadically, so I found a specific example where it fails. Is this a bug in CellIter?

smcantab commented 10 years ago

I start to think so, I have some evidence that something is wrong. Julian and I are having issues with the wca potential too.

Stefano

On Fri, Nov 21, 2014 at 5:48 PM, Jacob Stevenson notifications@github.com wrote:

@smcantab https://github.com/smcantab @kjs73 https://github.com/kjs73 I added a CellLists potential for LJCut and added a test in python which just compares it to the simple version of LJCut. It was failing sporadically, so I found a specific example where it fails. Is this a bug in CellIter?

— Reply to this email directly or view it on GitHub https://github.com/pele-python/pele/pull/98#issuecomment-64009465.

smcantab commented 10 years ago

@kjs73 @js850

Ok there is definitely a bug in the cell lists implementation. I have noticed the following:

this is both the case for Jake's LJ potential example and HS_WCA_cell_lists test (here it does not only fail after the minimisation but even the energies of the initial coordinates differ between the cell_lists and the simple potential).

js850 commented 10 years ago

I'm looking at build_cell_neighbor_lists, which uses the function areneighbors (confusing name by the way), which in turn uses the function get_minimum_corner_distance2

Isn't this wrong? Don't we want the maximum corner distance?

https://github.com/js850/pele/blob/cell_lists/source/pele/neighbor_iterator.h#L361

js850 commented 10 years ago

FYI, if I put

bool areneighbors(size_t i, size_t j) { return true; }

everything passes the test. So it's seems like it's that function which breaks things.

js850 commented 10 years ago

I fixed the bug in CellIter. get_minimum_corner_distance2 was not implemented correctly. I haven't done extensive testing, but it passes the existing tests that were failing.

js850 commented 10 years ago

I did some more checking, and added some more tests. I'm pretty sure CellIter is working now.

js850 commented 10 years ago

However, I think the function build_atom_neighbors_list() can be implemented much more efficiently.

https://github.com/js850/pele/blob/cell_lists/source/pele/neighbor_iterator.h#L499

For large number of cells it becomes extremely slow. Even for ncellx_scale=4 it is noticably slow. For ncellx_scale=10 I ran out of patience before it finished.

smcantab commented 10 years ago

Awesome! Thanks a lot Jake! I'll take a look at that function tomorrow or Monday, I think that may be the reason why when profiling the cellists alone take 20% of the time. I'll try to make it run faster, if you didn't do it already at this point. On 22 Nov 2014 18:12, "Jacob Stevenson" notifications@github.com wrote:

However, I think the function build_atom_neighbors_list() can be implemented much more efficiently.

https://github.com/js850/pele/blob/cell_lists/source/pele/neighbor_iterator.h#L499

For large number of cells it becomes extremely slow. Even for ncellx_scale=4 it is noticably slow. For ncellx_scale=10 I ran out of patience before it finished.

— Reply to this email directly or view it on GitHub https://github.com/pele-python/pele/pull/98#issuecomment-64089589.

js850 commented 10 years ago

I re-wrote CellIter::build_atom_neighbors_list. The new algorithm should be a lot simpler and avoid looping through duplicate pairs. I haven't tested whether it is actually faster though. It is still pretty slow when the number of cells becomes large.

smcantab commented 10 years ago

Just a guess but could that be because we double count the cell list pairs? On 23 Nov 2014 10:22, "Jacob Stevenson" notifications@github.com wrote:

I re-wrote CellIter::build_atom_neighbors_list. The new algorithm should be a lot simpler and avoid looping through duplicate pairs. I haven't tested whether it is actually faster though. It is still pretty slow when the number of cells becomes large.

— Reply to this email directly or view it on GitHub https://github.com/pele-python/pele/pull/98#issuecomment-64113173.

smcantab commented 10 years ago

By which I mean, not the atom pairs but the cells pairs On 23 Nov 2014 10:53, "Stefano Martiniani" stefano.martiniani@gmail.com wrote:

Just a guess but could that be because we double count the cell list pairs? On 23 Nov 2014 10:22, "Jacob Stevenson" notifications@github.com wrote:

I re-wrote CellIter::build_atom_neighbors_list. The new algorithm should be a lot simpler and avoid looping through duplicate pairs. I haven't tested whether it is actually faster though. It is still pretty slow when the number of cells becomes large.

— Reply to this email directly or view it on GitHub https://github.com/pele-python/pele/pull/98#issuecomment-64113173.

js850 commented 10 years ago

Just a guess but could that be because we double count the cell list pairs?

I don't think that's happening.

I haven't been able to profile because I don't have it set up at home. But I suspect build_atom_neighbors_list is running fast (which is important) and build_cell_neighbors_list is slow (which doesn't really matter because it's run only once).

coveralls commented 10 years ago

Coverage Status

Coverage remained the same when pulling f815f51d0c1d530e6f753113178484c5b4147730 on js850:cell_lists into ff09c7a5c9aa23fcf4d5e2c7c087eb237fa6a9a7 on pele-python:master.

js850 commented 10 years ago

p.s. I added a bunch of warning print statements if the parameters of CellIter seem inefficient.

kjs73 commented 9 years ago

Hey, that is excellent! The fixed corner distance finding looks good to me. As soon as this is integrated, I will try to add also the following branch which has some additional tests and input checks: https://github.com/kjs73/pele/tree/hs_wca_examples

js850 commented 9 years ago

I did some profiling for CellLists with a LJ system with 1600 aoms, rcut=2.5, and boxlength=17. For reasonable parameter regimes build_atom_neighbors_list was taking about 12% of the time. This seems reasonable to me.

However, if the number of cells becomes very large, e.g. ncellx_scale=4., then build_cell_neighbors_list just takes forever. For practical purposes this doesn't matter so much because it's only called once. But I'm sure a more intelligent algorithm could be found.

I would like to merge this this afternoon. Please check it and make sure it doesn't break any of your potentials

js850 commented 9 years ago

an idea for speed increases. When profiling you find that get_rij() takes about 50% of the computing time. The wiki page for cell lists suggests for periodic wrapping that you store an offset vector associated with each pair of cells.

http://en.wikipedia.org/wiki/Cell_lists

If this works then you could just use simple cartesian distances after applying the offset. It might be much faster.

Also, I haven't read it in detail, but this paper http://www.sciencedirect.com/science/article/pii/S0010465598002033 might have a better algorithm for build_cell_neighbors_list

kjs73 commented 9 years ago

To me, this looks fine! It seems that the HS-WCA potential does not break, because it ignores the fact that we pass to it reference_coords which do not exist anymore. For example, with this branch, the following script gives the same result with and without cell lists, which was not the case before: playground/hs_wca_cell_lists/hs_wca_cell_lists.py