Fix performance issues when dealing with large xtc trajectories

At the lab, we noticed that PB assignment (with PBassign) took a very long time on large xtc trajectories (several GB). In our benchmarks, PBassign needed ~0.4-0.45s to process one frame, which is very slow. For example, a 25k-frame xtc takes approximately 3h to process.

This PR aims to significantly decrease the processing time per frame. With this version, I was able to reach ~0.08-0.10s/frame, almost a 5-fold gain.

The main change was to get rid of several numpy calls. numpy functions carry a significant per-call overhead when dealing with small arrays like ours (as a reminder, we mainly deal with 3-item arrays). I found that built-ins (like map(), sum(), list comprehensions) are much faster than their numpy counterparts here. (Note: map() is wrapped in list() because py2 returns a list while py3 returns an iterator.) Some numpy functions are also faster than others: here, numpy.sqrt(n1.dot(n1)) is faster than numpy.linalg.norm(n1), whereas a built-in version would be slower.
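To make the comparison concrete, here is a minimal sketch of the kind of replacement described above. The function names (`norm_linalg`, `norm_dot`, `vector_sub`, `dot_builtin`) are hypothetical and are not the names used in PBxplore; the timings depend on the machine and the numpy version, so none are quoted here.

```python
import timeit

import numpy


def norm_linalg(vector):
    """Norm of a numpy vector using numpy.linalg.norm."""
    return numpy.linalg.norm(vector)


def norm_dot(vector):
    """Same norm, computed as sqrt(v . v) as mentioned in this PR."""
    return numpy.sqrt(vector.dot(vector))


def vector_sub(a, b):
    """Element-wise a - b with built-ins only.

    list() is needed because map() returns a list in py2
    but an iterator in py3.
    """
    return list(map(lambda x, y: x - y, a, b))


def dot_builtin(a, b):
    """Dot product of two short sequences with sum() and a generator."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    # Illustrative 3-item vector, the typical size handled here.
    n1 = numpy.array([3.0, 4.0, 0.0])
    for func in (norm_linalg, norm_dot):
        elapsed = timeit.timeit(lambda: func(n1), number=10000)
        print("{0}: {1:.4f}s for 10000 calls".format(func.__name__, elapsed))
```

Running the script prints the time for each variant on your own machine, which is the quickest way to check whether the same trade-off holds in your environment.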

Finally, I optimized the MDAnalysis parsing by creating only one selection. It was pointless to create a selection at every frame, since a selection is a view on the structure, not on the time.
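The idea of a single selection can be sketched as follows. This is an illustration, not the code of this PR: the function name `iter_selection_positions`, the default `"backbone"` selection string and the file names in the comments are assumptions, and it assumes an MDAnalysis version that provides `Universe.select_atoms`.

```python
def iter_selection_positions(universe, selection_string="backbone"):
    """Yield (frame index, coordinates) for every frame of a trajectory.

    `universe` is expected to behave like an MDAnalysis.Universe.
    """
    # Build the selection once: it refers to atoms of the structure,
    # so it stays valid while the trajectory moves from frame to frame.
    selection = universe.select_atoms(selection_string)
    for ts in universe.trajectory:
        # The coordinates read through the selection follow the current frame.
        yield ts.frame, selection.positions


# Hypothetical usage (file names are placeholders):
# import MDAnalysis
# universe = MDAnalysis.Universe("topology.gro", "trajectory.xtc")
# for frame, coordinates in iter_selection_positions(universe):
#     ...
```

The only difference with the slower pattern is that `select_atoms` is called before the loop instead of inside it.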

The main bottleneck now is on these 2 lines:

I think the first one can be optimized if we rethink a little how we parse xtc.

I attached 2 files (statsline_before.txt and statsline_after.txt) which are the results of line_profiler before and after the optimization.

statsline_before.txt statsline_after.txt


Fix performance issues when dealing with large xtc trajectories #114