yoyololicon / pytorch-NMF

A pytorch package for non-negative matrix factorization.
https://pytorch-nmf.readthedocs.io/
MIT License
223 stars, 24 forks

Batched NMF example on large matrix #27

Open yoyololicon opened 1 year ago

yoyololicon commented 1 year ago

Reference:

austinv11 commented 1 year ago

Hello,

Has there been any update on this? I am interested in applying this to large datasets and would love to be able to run my data in batches, but I am not entirely sure how to best do it with your package.

yoyololicon commented 1 year ago

Hi @austinv11, nice to know you're interested in this.

I plan to implement something similar to this paper to perform NMF on data that is too large to fit entirely in memory. The mini-batch update scheme doesn't fit the current interface of torchnmf.nmf.BaseComponent, so it needs a new class to handle it. However, I'm busy with other projects and won't be able to work on this soon.
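To illustrate what a mini-batch update scheme can look like, here is a minimal sketch in plain PyTorch. It is not the method from the referenced paper and does not use the torchnmf API. It assumes a Frobenius-loss factorization X ≈ H W with multiplicative updates, where each row batch gets its own temporary H and only small summary matrices are accumulated to update the shared W once per epoch. The function name `minibatch_nmf` and all of its parameters are hypothetical.

```python
import torch


def minibatch_nmf(batches, n_features, rank, n_epochs=20, h_iters=10,
                  eps=1e-8, seed=0):
    """Hypothetical helper: learn a shared non-negative W (rank x n_features)
    from row batches of X, so the full X never has to sit in memory.

    `batches` must be re-iterable (e.g. a list of tensors or a DataLoader),
    and each batch must be a non-negative (batch_rows x n_features) tensor.
    """
    gen = torch.Generator().manual_seed(seed)
    W = torch.rand(rank, n_features, generator=gen)

    for _ in range(n_epochs):
        # Summary statistics accumulated over one pass through the data.
        A = torch.zeros(rank, n_features)  # sum of H_b^T X_b
        B = torch.zeros(rank, rank)        # sum of H_b^T H_b

        for X_b in batches:
            # Per-batch activations, re-estimated with W held fixed.
            H_b = torch.rand(X_b.shape[0], rank, generator=gen)
            XWt = X_b @ W.T
            WWt = W @ W.T
            for _ in range(h_iters):
                H_b = H_b * XWt / (H_b @ WWt + eps)

            A += H_b.T @ X_b
            B += H_b.T @ H_b

        # Multiplicative update of W using only the accumulated statistics.
        W = W * A / (B @ W + eps)

    return W


# Usage: split a matrix into row chunks and fit on them.
X = torch.rand(60, 3) @ torch.rand(3, 8)
W = minibatch_nmf(list(X.split(20)), n_features=8, rank=3, n_epochs=5)
```

Because H is thrown away after each batch, the memory cost is bounded by one batch plus the two small accumulators. This is also why the scheme does not map onto an interface that expects H and W to be stored together as parameters.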

KendallPark commented 4 months ago

@yoyololicon I see this has been moved to "In Progress." How is the implementation going? I am interested in applying NMF to very large datasets as well. None of the PyTorch NMF implementations have this feature. Your repo is the best coded and documented of the options out there.

It would be great to have it work with PyTorch DataLoaders and Datasets. If not, I might be able to contribute that aspect.
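As a rough sketch of the DataLoader idea, the snippet below streams row blocks of a non-negative matrix through a standard `Dataset`/`DataLoader` pair. The class name `RowBlockDataset` is hypothetical, and the in-memory tensor is only a stand-in for disk-backed storage. Nothing here is part of torchnmf.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class RowBlockDataset(Dataset):
    """Hypothetical dataset that serves contiguous row blocks of a matrix.

    The tensor passed in stands in for data that would normally be read
    lazily from disk, one block at a time.
    """

    def __init__(self, matrix, block_size):
        self.matrix = matrix
        self.block_size = block_size

    def __len__(self):
        # Number of blocks, rounding up so the last partial block is kept.
        n_rows = self.matrix.shape[0]
        return (n_rows + self.block_size - 1) // self.block_size

    def __getitem__(self, idx):
        start = idx * self.block_size
        return self.matrix[start:start + self.block_size]


X = torch.rand(100, 16)  # non-negative toy data
dataset = RowBlockDataset(X, block_size=32)

# batch_size=None disables automatic batching, so each item the loader
# yields is exactly one row block as returned by __getitem__.
loader = DataLoader(dataset, batch_size=None, shuffle=True)

rows_seen = 0
for block in loader:
    # A mini-batch NMF update would consume `block` here.
    rows_seen += block.shape[0]
```

Serving whole row blocks rather than single rows keeps each item a 2-D matrix, which is the shape a per-batch factor update needs, and the loader can be iterated again for each epoch.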

yoyololicon commented 4 months ago

Hi @KendallPark, thanks for asking.

Glad to know you're also interested in this. My collaborator and I are working on this feature, and we hope to make it available this summer.

> It would be great to have it work with PyTorch DataLoaders and Datasets. If not, I might be able to contribute that aspect.

I'm not sure how the feature would work with PyTorch DataLoaders... (I feel the use case here is different from regular deep learning training.) Could you elaborate on how you would want to use them?