Closed franknoe closed 8 years ago
Nice improvement! I would also remove the obsolete code (maybe already in this PR)
I think it's good to have one version on which this test code in the description of this PR works. Once I remove the deprecated methods, this won't work anymore.
You're right.
Nice!
This PR mainly removes inefficiencies of transition matrix counting encountered in cases of many states or many trajectories. Additionally, some PEP coding violations were fixed and deprecated comments and TODO statements were removed.
In some cases (significantly more than 1000 states or more than 10000 trajectories), transition matrix counting could be very slow.
`msmtools.estimation.count_matrix` now uses the new counting method `msmtools.estimation.sparse.count_matrix.count_matrix_coo2_mult`. The effect on performance is illustrated by the following test code:
Results (time in milliseconds) on my machine (MacOS 10.10.1, Intel i7@1.7 GHz, 8 GB RAM):
The results show that, while the new method does not always win, it never leads to an explosion of the runtime, whereas the two previous methods could become unusably slow on large data. If there are no objections, I will remove the old counting methods in a subsequent PR.
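For readers unfamiliar with the general idea, here is a minimal sketch of counting transitions over many trajectories through a single sparse COO matrix. It is not the code of `count_matrix_coo2_mult`. The function name `count_transitions` and its parameters are hypothetical and chosen for illustration only. It assumes each trajectory is a sequence of non-negative integer state indices.

```python
import numpy as np
from scipy.sparse import coo_matrix


def count_transitions(dtrajs, lag=1, nstates=None):
    """Count transitions i -> j at the given lag (illustrative helper, not msmtools API)."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    rows, cols = [], []
    for dtraj in dtrajs:
        dtraj = np.asarray(dtraj, dtype=int)
        if dtraj.size > lag:  # trajectories shorter than the lag contribute nothing
            rows.append(dtraj[:-lag])  # start states
            cols.append(dtraj[lag:])   # end states, lag steps later
    if not rows:
        raise ValueError("no trajectory is longer than the lag time")
    # Concatenate all index pairs once instead of building one matrix per trajectory.
    rows = np.concatenate(rows)
    cols = np.concatenate(cols)
    if nstates is None:
        nstates = int(max(rows.max(), cols.max())) + 1
    data = np.ones(rows.size, dtype=int)
    C = coo_matrix((data, (rows, cols)), shape=(nstates, nstates))
    # Converting COO to CSR sums the entries that share the same (row, col) index.
    return C.tocsr()


if __name__ == "__main__":
    C = count_transitions([[0, 1, 1, 0], [1, 0, 0]], lag=1)
    print(C.toarray())
```

Collecting all index pairs first and building one matrix keeps the work proportional to the total number of transitions, independent of how many trajectories they come from.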