Closed HubLot closed 8 years ago
The use of decorators is not so simple... But why not.
Yep, I know. This can be removed if you want, since, with the second part, the constructor of Atom() is called much less often than before. So the performance gain is minimal for an xtc trajectory; I don't know about multiple PDBs. In fact, I came up with the second optimization after the first one... :)
This is awesome! The code is much clearer.
@HubLot Do not touch anything. This is really good! I will eventually learn decorators ;-)
Hi,
I continue my quest to optimize PBxplore. In this PR, I focused mainly on reading xtc trajectories through MDAnalysis and I was able to gain a ~15% speed-up.
First, I introduced the use of classmethod for the Atom class. I noticed we usually create an empty Atom() and then populate it with values depending on the input (PDB, PDBx, XTC). Creating an empty object and then modifying its attributes is not really efficient. @classmethod allows us to define a function that acts as a constructor, so creating an object and setting its attributes is now done in one step. For example:
atom = Atom(); atom.read_PDB(line)
is now
atom = Atom.read_PDB(line)
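To illustrate the pattern, here is a minimal sketch of a classmethod used as an alternative constructor. It is not the actual PBxplore code: the attribute names (ident, name, resname, chain, resid, coords) and the body of read_PDB are assumptions made for the example; only the call form Atom.read_PDB(line) comes from the description above. The column slices follow the fixed-column layout of PDB ATOM records.

```python
class Atom:
    """Simplified stand-in for the Atom class (hypothetical attributes)."""

    def __init__(self, ident=0, name=None, resname=None, chain=None,
                 resid=0, coords=None):
        self.ident = ident
        self.name = name
        self.resname = resname
        self.chain = chain
        self.resid = resid
        self.coords = coords if coords is not None else [0.0, 0.0, 0.0]

    @classmethod
    def read_PDB(cls, line):
        """Alternative constructor: parse one ATOM record and return a
        fully populated Atom in a single step."""
        return cls(
            ident=int(line[6:11]),        # columns 7-11: atom serial number
            name=line[12:16].strip(),     # columns 13-16: atom name
            resname=line[17:20].strip(),  # columns 18-20: residue name
            chain=line[21:22].strip(),    # column 22: chain identifier
            resid=int(line[22:26]),       # columns 23-26: residue number
            coords=[float(line[30:38]),   # columns 31-38: x
                    float(line[38:46]),   # columns 39-46: y
                    float(line[46:54])],  # columns 47-54: z
        )


# Build a fixed-column ATOM record for the demonstration (made-up values).
line = "ATOM  {:5d} {:<4s} {:>3s} {:1s}{:4d}    {:8.3f}{:8.3f}{:8.3f}".format(
    1, "N", "MET", "A", 1, 27.340, 24.430, 2.614
)

# One call both creates the object and sets its attributes.
atom = Atom.read_PDB(line)
```

The regular constructor stays available, so code that still builds an Atom by hand keeps working.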
I also removed some Atom() attributes which were never used (occupancy, temperature, etc.). This does not change the output of the format() function. I also realized that: is faster than
Second, I focused on the chains_from_trajectory() function. I noticed that, at each frame, we created a new chain object (called structure) filled with new atoms, while in fact only the coordinates of these atoms change during a trajectory. It was a bit overkill :) Now, only one structure object is created at the beginning. During the loop over the trajectory, only the coordinates of the atoms are updated. To do that, I introduced some new functions inside Atom() (a setter to modify the coordinates) and Chain() (a function which modifies the coordinates of all atoms). With these optimizations, the chains_from_trajectory() function cannot get faster without touching MDAnalysis. Creating the universe object and looping over the trajectory now take 95% of the time of this function, whereas it was ~50% before.
Finally, none of these optimizations change the public API.
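The in-place coordinate update in chains_from_trajectory() can be sketched roughly as follows. This is a simplified illustration, not the PBxplore implementation: the coords setter, Chain.set_coordinates() and chains_from_frames() are hypothetical names, and a plain list of frames stands in for the MDAnalysis trajectory.

```python
class Atom:
    """Simplified atom holding a name and mutable coordinates."""

    def __init__(self, name, coords):
        self.name = name
        self._coords = list(coords)

    @property
    def coords(self):
        return self._coords

    @coords.setter
    def coords(self, new_coords):
        # Hypothetical setter: replace the coordinates of an existing atom.
        if len(new_coords) != 3:
            raise ValueError("coordinates must have exactly 3 values")
        self._coords = list(new_coords)


class Chain:
    """Simplified chain: an ordered collection of atoms."""

    def __init__(self, atoms):
        self.atoms = list(atoms)

    def set_coordinates(self, positions):
        # Hypothetical helper: update every atom in place, one position each.
        if len(positions) != len(self.atoms):
            raise ValueError("need exactly one position per atom")
        for atom, xyz in zip(self.atoms, positions):
            atom.coords = xyz


def chains_from_frames(names, frames):
    """Yield the same Chain object for every frame, with refreshed coordinates."""
    # The structure is built once, before the loop over the frames.
    structure = Chain(Atom(name, (0.0, 0.0, 0.0)) for name in names)
    for positions in frames:
        structure.set_coordinates(positions)
        yield structure


# Two made-up frames for a two-atom chain.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)],
]

seen_ids = set()
last_x = []
for chain in chains_from_frames(["N", "CA"], frames):
    seen_ids.add(id(chain))  # the same object is reused for every frame
    last_x = [atom.coords[0] for atom in chain.atoms]
```

One consequence of this design in the sketch: since the same object is yielded at every iteration, a caller that wants to keep a given frame has to copy it before moving on to the next one.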
Let me know what you think!
P.S.: I fixed an issue with the doc build and removed some weird print() calls (my fault) in tests.