pele-python / pele

Python energy landscape explorer
Other
95 stars 41 forks source link

Implement proper iterators over atom pair in cell lists #101

Closed js850 closed 9 years ago

js850 commented 9 years ago

I replaced the array of atom pairs with an iterator over atom pairs.

To do this using generators would be trivial (more or less)

for icell, jcell in cell_neighbors:
    for iatom in cell_iterator(icell):
        for jatom in cell_iterator(jcell):
            yield iatom, jatom

Unfortunately c++ doesn't have co-routines, so I basically had to invert the triple loop. It turned out to be ugly, but it works.

upsides

  1. no need to build and store the full vector of atom pairs (less memory)
  2. iterators are in-theory easier to work with. And more flexible in the long run.

downsides

  1. it's currently a bit slower (44 seconds vs 41 seconds for a minimization of 1600 LJ atoms). However, it's far from optimized. I may be able to improve it.
  2. the code is ugly as hell, confusing, and hard to debug.

What do you think, should we merge this PR?

kjs73 commented 9 years ago

Cool, it is a very good idea to make this thing into an iterator! I feel that mostly it is good to accept within reason longer CPU time for saving memory, because it is in general easier to get more CPU time than more (and fast) memory.

To say anything competent I would need to look at the code much more carefully. Just two fast thoughts.

  1. As far as C++ is concerned, we can hope that the Boost.Coroutines make it into the standard. Would that solve the issue here?
  2. For beauty and confusion level of the code, would it help to encapsulate/structure it more?
js850 commented 9 years ago

As far as C++ is concerned, we can hope that the Boost.Coroutines make it into the standard.

yes, that would be great.

For beauty and confusion level of the code, would it help to encapsulate/structure it more?

This is by no means the most elegant way of doing it. It's just what I came up with, and I don't have much experience converting multiple loops into an iterator. I'd be happy for suggestions.

If I remember correctly, in boost graph library (e.g. breadth first search), they don't use iterators, they use loops with callback functions. The callback functions are implemented as templates to avoid speed losses. That might be another idea.

kjs73 commented 9 years ago

Looks good to me.

js850 commented 9 years ago

I improved the speed of the iterators by removing some of the unnecessary if (done()) calls. It is now effectively the same speed as the previous implementation.

js850 commented 9 years ago

@ruehle : I added a pure c++ benchmarking script in cpp_tests/source/benchmarks/. The benchmark is compiled along with the c++ tests, but is run as a separate executable. Currently the script simply does a minimization of a large LJ system.

coveralls commented 9 years ago

Coverage Status

Changes Unknown when pulling 04db08e127527441e507704ec55bbb841e68deac on js850:cell_lists into \ on pele-python:master**.

js850 commented 9 years ago

I abandoned the iterator concept and now use the visitor design pattern. It is faster and simpler. #108 now supersedes this PR