derekgreene / dynamic-nmf

Dynamic Topic Modeling via Non-negative Matrix Factorization
Apache License 2.0

Execution time #10

Open michalovadek opened 4 years ago

michalovadek commented 4 years ago

I was wondering what kinds of runtime you have encountered when applying this topic model in practice (leaving aside the question of choosing K). In my limited experience, the scikit-learn NMF decomposition has been extremely fast for small corpora (a matter of seconds), but it slows down drastically at higher K and with larger matrices. I currently have a model running with K=20 on a sparse matrix with 4.3 million cells, and it has been going for hours. This is significantly slower than standard LDA.

The scikit-learn documentation mentions polynomial time complexity, which would explain the large changes in execution time I experienced. I would like to understand whether this is an issue for others as well.
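
For anyone who wants to compare numbers, here is a minimal timing sketch for scikit-learn's `NMF` at a few values of K. The matrix is a small random sparse stand-in, not a real corpus, and the shape, density, K values, and the `time_nmf` helper are all assumptions made for illustration. It is not how dynamic-nmf itself invokes the solver.

```python
import time

from scipy import sparse
from sklearn.decomposition import NMF


def time_nmf(X, k, max_iter=50, seed=0):
    """Fit NMF with k components; return (seconds elapsed, iterations used)."""
    model = NMF(n_components=k, init="nndsvd", max_iter=max_iter, random_state=seed)
    start = time.perf_counter()
    model.fit(X)
    return time.perf_counter() - start, model.n_iter_


# Hypothetical stand-in for a document-term matrix: 300 x 200, about 2% non-zero.
X = sparse.random(300, 200, density=0.02, format="csr", random_state=0)

timings = {}
for k in (2, 5, 10):
    timings[k] = time_nmf(X, k)
    seconds, n_iter = timings[k]
    print(f"K={k:>2}  {seconds:.3f}s  {n_iter} iterations")
```

Reporting the iteration count alongside the wall time helps separate "each iteration is slow" from "the solver needs many iterations to converge". If the latter dominates, `max_iter` and `tol` are the parameters to look at first.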

BradKML commented 3 years ago

Take a look at these repos if speed is a concern: