You can now use dimensionality reduction with GMM. This is useful because GMM is very fast, even for large datasets, but doesn't handle high-dimensional embeddings well: the number of covariance parameters per component grows quadratically with embedding dimensionality.
Usage:
You can pass any scikit-learn TransformerMixin to GMM as the dimensionality_reduction parameter.
from turftopic import GMM
from sklearn.decomposition import PCA

# Fit a 10-topic model, reducing embeddings to 20 dimensions first
gmm = GMM(10, dimensionality_reduction=PCA(20))
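To see what the reducer buys you, here is a sklearn-only sketch of the idea; scikit-learn's GaussianMixture stands in for turftopic's model, and random vectors stand in for document embeddings (both are assumptions for illustration, not turftopic internals):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Stand-in for document embeddings: 1000 "documents" x 384 dims (assumed size)
embeddings = rng.normal(size=(1000, 384))

# Reduce to 20 dimensions before fitting, as PCA(20) would inside GMM
reduced = PCA(20).fit_transform(embeddings)

# A 10-component full-covariance mixture on the reduced space: each covariance
# now has 20*21/2 = 210 free parameters instead of 384*385/2 = 73,920
gm = GaussianMixture(n_components=10, random_state=0).fit(reduced)
print(reduced.shape)  # (1000, 20)
```

The fitted component means then live in the reduced 20-dimensional space rather than the full embedding space.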
Performance:
I did some experiments on my machine: with PCA(20) and 20 topics I get virtually the same results as with the full model on 20 Newsgroups, but the model runs in under half a minute instead of three minutes, which I think is impressive.