ldeecke / gmm-torch

Gaussian mixture models in PyTorch.
MIT License

GMM learning in batches on GPU? #2

Closed: songkuan79 closed this issue 4 years ago

songkuan79 commented 5 years ago

Hi there,

I have always wanted to fit a GMM to a petabyte-scale dataset. sklearn surely cannot do that. Would it be possible to run this in PyTorch on a GPU, maybe with gradient-based learning on batches?

Please let me know your thoughts. Thanks.

ldeecke commented 5 years ago

Look into nn.DataParallel(model). The PyTorch website has a tutorial: https://pytorch.org/tutorials/beginner/blitz/data_parallel_tutorial.html

Scaling this up could be tricky, but a partial fit might well be sufficient; cf. https://stackoverflow.com/questions/29095769/sklearn-gmm-on-large-datasets
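To make the "gradient learning on batches" idea concrete, here is a minimal sketch of a diagonal-covariance GMM fitted by mini-batch gradient descent on the negative log-likelihood. It is not part of gmm-torch: the names `DiagonalGMM` and `fit_in_batches`, the Adam optimizer, and the toy data are all assumptions made for illustration. On a machine with several GPUs, the module could presumably be wrapped in nn.DataParallel(model) before training.

```python
import math

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


class DiagonalGMM(nn.Module):
    """Hypothetical diagonal-covariance GMM (not the gmm-torch API)."""

    def __init__(self, n_components, n_features):
        super().__init__()
        # Unconstrained parameters: softmax(logits) gives the mixture weights,
        # exp(log_vars) gives the per-dimension variances.
        self.logits = nn.Parameter(torch.zeros(n_components))
        self.means = nn.Parameter(torch.randn(n_components, n_features))
        self.log_vars = nn.Parameter(torch.zeros(n_components, n_features))

    def forward(self, x):
        # x: (batch, d) -> per-sample log-likelihood of shape (batch,)
        diff = x.unsqueeze(1) - self.means  # (batch, k, d)
        log_prob = -0.5 * (
            (diff ** 2 / self.log_vars.exp()).sum(-1)
            + self.log_vars.sum(-1)
            + x.shape[1] * math.log(2 * math.pi)
        )  # (batch, k)
        log_weights = torch.log_softmax(self.logits, dim=0)
        return torch.logsumexp(log_weights + log_prob, dim=1)


def fit_in_batches(model, loader, epochs=20, lr=0.05, device="cpu"):
    """Minimize the mean negative log-likelihood one mini-batch at a time."""
    model = model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for (batch,) in loader:
            optimizer.zero_grad()
            loss = -model(batch.to(device)).mean()
            loss.backward()
            optimizer.step()
    return model


# Toy data: two well-separated 2-D clusters (illustrative only).
torch.manual_seed(0)
data = torch.cat([torch.randn(500, 2) - 3.0, torch.randn(500, 2) + 3.0])
loader = DataLoader(TensorDataset(data), batch_size=100, shuffle=True)

model = DiagonalGMM(n_components=2, n_features=2)
with torch.no_grad():
    initial_nll = -model(data).mean().item()

fit_in_batches(model, loader)  # pass device="cuda" to train on a GPU

with torch.no_grad():
    final_nll = -model(data).mean().item()
```

Because only one batch is resident on the device at a time, the dataset itself never has to fit in GPU memory; the `DataLoader` could stream from disk instead of an in-memory tensor. Note that this swaps EM for plain gradient descent, so convergence behaviour will differ from an EM-based fit.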

Vichoko commented 4 years ago

With the current state of this library, is it possible to train in batches? I couldn't find a partial_fit method.