Hi, thanks for your question. No, it won't give the same results. The reason is that the memoized computation uses "lagged" parameter values from slightly earlier iterations of the optimization, whereas the full gradient is recomputed across all observations at every iteration.

I did some informal experiments with the memoized approach and found it was not as efficient as the stochastic gradient approach, so if you have big data I would recommend that instead; just don't make the minibatch size too small or you might run into numerical instabilities. I think memoization is a really cool idea, and perhaps I just didn't implement it very well, so I wouldn't give up on it based on these limited results.
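To make the "lagged" point concrete, here is a minimal toy sketch in plain NumPy. It illustrates the general idea on a least-squares problem; it is not glmpca's actual code or API, and all names in it are made up for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares problem: minimize mean_i (x_i @ w - y_i)^2 over w.
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def grad_i(w, i):
    """Gradient of observation i's squared-error loss at parameters w."""
    return 2.0 * (X[i] @ w - y[i]) * X[i]

def full_gradient(steps=100, lr=0.05):
    """Analogous to minibatch='none': every observation's gradient is
    recomputed at the *current* parameters on every iteration."""
    w = np.zeros(d)
    for _ in range(steps):
        g = np.mean([grad_i(w, i) for i in range(n)], axis=0)
        w -= lr * g
    return w

def memoized_gradient(steps=100, lr=0.05, batch=20):
    """SAG-style memoization: keep a per-observation gradient cache and
    refresh only a minibatch of it each iteration, so the averaged
    'full' gradient mixes in stale gradients computed at older
    ('lagged') parameter values."""
    w = np.zeros(d)
    cache = np.stack([grad_i(w, i) for i in range(n)])  # initial cache at w = 0
    for _ in range(steps):
        idx = rng.choice(n, size=batch, replace=False)
        for i in idx:
            cache[i] = grad_i(w, i)   # refresh only these entries
        w -= lr * cache.mean(axis=0)  # the rest of the cache is stale
    return w

# With a finite iteration budget the two trajectories differ, so the
# fitted parameters will generally not match exactly.
print(np.max(np.abs(full_gradient() - memoized_gradient())))
```

Both variants head toward the same optimum, but their iterates differ along the way, which is why the two fits (and anything downstream, like loadings or UMAP plots) won't be identical.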
Perfect, thank you!
Hi Will,
Since the minibatch='memoized' option calculates the full gradient, should the results be identical to those from setting minibatch='none' on the same input data?
This was my assumption, but I found that running these two options on a small subset of our single-cell data, with all other parameters unchanged, produced different UMAP plots and loading matrices.
Thanks!