markovmodel / PyEMMA

🚂 Python API for Emma's Markov Model Algorithms 🚂
http://pyemma.org
GNU Lesser General Public License v3.0

kmeans warning #439

Closed franknoe closed 9 years ago

franknoe commented 9 years ago

Only raise this warning when memory is actually tight (check available memory first). There's no problem in the given example (11 MB), because we often create much bigger arrays without asking:

2015-07-22 11:07:18,815 coordinates.clustering.KmeansClustering[3] WARNING  K-means implementation is currently memory inefficient. This calculation needs 11 megabytes of main memory. If you get a memory error, try using a larger stride.
marscher commented 9 years ago

To check whether system memory is low, we need a library like psutil to do this in a platform-independent way. Do you want to add it as a dependency in order to trigger a more meaningful message? Minibatch k-means will fix this issue anyway in 1.3.
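A minimal sketch of the kind of check being discussed, assuming psutil is available. The helper names `available_memory_bytes` and `memory_is_tight` and the `safety_fraction` parameter are hypothetical and not part of PyEMMA. If psutil is missing, the sketch treats the available memory as unknown and does not flag anything.

```python
try:
    import psutil  # optional third-party dependency
except ImportError:
    psutil = None


def available_memory_bytes():
    """Return the available physical memory in bytes, or None if psutil is missing."""
    if psutil is None:
        return None
    return psutil.virtual_memory().available


def memory_is_tight(required_bytes, available_bytes=None, safety_fraction=0.5):
    """Return True if the request exceeds safety_fraction of the available memory.

    available_bytes may be passed explicitly (handy for testing);
    otherwise it is queried through psutil.
    """
    if available_bytes is None:
        available_bytes = available_memory_bytes()
    if available_bytes is None:
        # Available memory is unknown, so do not flag the request.
        return False
    return required_bytes > safety_fraction * available_bytes
```

With 8 GB available, the 11 MB request from the example would not be flagged, while a 6 GB request would be.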

marscher commented 9 years ago

I'd suggest just increasing the threshold and otherwise leaving it as it is.

franknoe commented 9 years ago

Minibatch k-means is still a different algorithm, with different convergence properties. We can't just replace k-means with minibatch k-means; we can only offer it in addition. (For the API, we can discuss whether to do this with a flag or with a new API function such as cluster_kmeans_mini.) Therefore, it won't fix the k-means memory issue.

I think psutil would be a valuable dependency, but it is also possible that we can solve memory bottlenecks by caching. Let us simply set a cutoff for now: if we allocate less than 100 MB, I wouldn't even raise a warning. If you are running out of memory at that point, your machine is probably already dying. Plus, we frequently allocate bigger things without asking, e.g. TICA matrices.
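The cutoff could look roughly like the sketch below. It is not the actual PyEMMA code: the names `estimate_bytes` and `maybe_warn_kmeans_memory` are hypothetical, and it assumes the data is held as one dense `n_frames` x `n_dims` array of 4-byte floats.

```python
import warnings

# 100 MB cutoff as proposed above (counted here as 100 * 1024**2 bytes).
WARN_THRESHOLD_BYTES = 100 * 1024 ** 2


def estimate_bytes(n_frames, n_dims, itemsize=4):
    """Estimate the size of a dense n_frames x n_dims array (itemsize=4 assumes float32)."""
    return n_frames * n_dims * itemsize


def maybe_warn_kmeans_memory(n_frames, n_dims, itemsize=4,
                             threshold=WARN_THRESHOLD_BYTES):
    """Warn only if the estimated allocation reaches the threshold.

    Returns True if a warning was issued, False otherwise.
    """
    required = estimate_bytes(n_frames, n_dims, itemsize)
    if required < threshold:
        return False  # small allocation: stay silent
    warnings.warn(
        "K-means needs about %d megabytes of main memory. "
        "If you get a memory error, try using a larger stride."
        % (required // 1024 ** 2)
    )
    return True
```

Small jobs like the 11 MB example stay silent under this scheme, and only allocations at or above the cutoff produce the message.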


marscher commented 9 years ago

done