Open Daisy-GENG opened 2 years ago
Issue #7 still referred to PyCave version 2. In PyCave v3, you don't need to call gmm.model_.reset_parameters: the model_ attribute will only be available once fit has returned without error.
I believe that this should be the line that causes your error.
So is there a similar way to implement batch training in PyCave version 3 using a dataloader? My whole dataset is large, so I cannot load all the data into memory at once.
Thank you so much!
Best regards, Daisy
Ah, sorry! Yes, you can simply set the batch size when initializing the GMM. In your case, you might, for example, use:
gmm = GM(..., batch_size=8192)
This will automatically take care of loading data in batches, both for initialization and GMM training. Note that you might be better off with init_strategy='kmeans++' since kmeans is quite costly to run. You'll need PyCave 3.1.3 for that, though (there was a bug in kmeans++ initialization before).
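To illustrate why fitting in batches can give the same result as fitting on the full data, here is a minimal sketch in plain Python. It is not PyCave's code, and PyCave's internals may well differ: the helper names iter_batches and mean_and_variance are made up for this example, and the data is a toy 1-D list. The idea is only that statistics such as a mean and a variance can be accumulated one batch at a time, so a whole pass over the data never has to be processed in a single step.

```python
from typing import Iterable, Iterator, List, Tuple


def iter_batches(data: List[float], batch_size: int) -> Iterator[List[float]]:
    """Yield consecutive slices of at most batch_size items (hypothetical helper)."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]


def mean_and_variance(batches: Iterable[List[float]]) -> Tuple[float, float]:
    """Accumulate count, sum and sum of squares one batch at a time."""
    count = 0
    total = 0.0
    total_sq = 0.0
    for batch in batches:
        count += len(batch)
        total += sum(batch)
        total_sq += sum(x * x for x in batch)
    mean = total / count
    # Population variance from the accumulated statistics: E[x^2] - E[x]^2.
    variance = total_sq / count - mean * mean
    return mean, variance


# Toy data standing in for a large dataset; 8192 mirrors the batch size above.
data = [float(i % 7) for i in range(20_000)]
mean, variance = mean_and_variance(iter_batches(data, batch_size=8192))
```

Here the list itself still lives in memory; in practice the batches would come from a source that reads from disk, but the accumulation step stays the same.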
Hi,
I want to implement mini-batch training on a GMM as discussed in #7. However, I am a little bit confused by the code gmm.reset_parameters(torch.Tensor(fvectors[:500].astype(np.float32))). I am not sure whether it is related to my version of PyCave, or maybe my understanding of the code in #7 is wrong. My code doesn't work. My code is as follows:
And the error is:
Thank you so much!
Best regards, Daisy