Hi, thanks for your question. No, it won't give the same results. The reason is that the memoized computation uses "lagged" parameter values from slightly earlier iterations of the optimization, whereas the full gradient is recomputed across all observations at every iteration.

I did some informal experiments with the memoized approach and found it was not as efficient as the stochastic gradient approach, so if you have big data I would recommend that instead; just don't make the minibatch size too small or you might run into numerical instabilities. I think memoization is a really cool idea, and perhaps I just didn't implement it very well, so I wouldn't give up on it based on these limited results.
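To make the "lagged" point concrete, here is a minimal toy sketch in plain NumPy. It illustrates the general idea on a least-squares problem; it is not glmpca's actual code or API, and all names in it are made up for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares problem: minimize mean_i (x_i @ w - y_i)^2 over w.
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def grad_i(w, i):
    """Gradient of observation i's squared-error loss at parameters w."""
    return 2.0 * (X[i] @ w - y[i]) * X[i]

def full_gradient(steps=100, lr=0.05):
    """Analogous to minibatch='none': every observation's gradient is
    recomputed at the *current* parameters on every iteration."""
    w = np.zeros(d)
    for _ in range(steps):
        g = np.mean([grad_i(w, i) for i in range(n)], axis=0)
        w -= lr * g
    return w

def memoized_gradient(steps=100, lr=0.05, batch=20):
    """SAG-style memoization: keep a per-observation gradient cache and
    refresh only a minibatch of it each iteration, so the averaged
    'full' gradient mixes in stale gradients computed at older
    ('lagged') parameter values."""
    w = np.zeros(d)
    cache = np.stack([grad_i(w, i) for i in range(n)])  # initial cache at w = 0
    for _ in range(steps):
        idx = rng.choice(n, size=batch, replace=False)
        for i in idx:
            cache[i] = grad_i(w, i)   # refresh only these entries
        w -= lr * cache.mean(axis=0)  # the rest of the cache is stale
    return w

# With a finite iteration budget the two trajectories differ, so the
# fitted parameters will generally not match exactly.
print(np.max(np.abs(full_gradient() - memoized_gradient())))
```

Both variants head toward the same optimum, but their iterates differ along the way, which is why the two fits (and anything downstream, like loadings or UMAP plots) won't be identical.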
Perfect, thank you!
Hi Will,
Since the minibatch='memoized' option calculates the full gradient, should the results be identical to those from setting minibatch='none' on the same input data?
This was my assumption, but I found that running these two options on a small subset of our single-cell data, with all other parameters unchanged, produced different UMAP plots and loading matrices.
Thanks!