-
**Is your feature request related to a problem? Please describe.**
For some model configurations (Poisson GLM with a soft-plus link and Ridge regularization), an optimal stepsize and batch size can be calcu…
-
Expand the tutorial on batching by comparing SGD and SVRG:
1. Show that, if the stepsize is chosen appropriately, the algorithm converges to the correct estimate.
2. Compare with SGD, which won't con…
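A minimal sketch of the kind of comparison the tutorial could show, on a least-squares problem (all names and constants here are illustrative, not part of the library): with the same constant stepsize, SVRG converges toward the empirical minimizer while plain SGD stalls at a noise floor.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

# empirical minimizer, for reference
w_star = np.linalg.lstsq(X, y, rcond=None)[0]

def grad_i(w, i):
    # gradient of the i-th squared-error term
    return X[i] * (X[i] @ w - y[i])

def sgd(eta, epochs):
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            w = w - eta * grad_i(w, i)
    return w

def svrg(eta, epochs):
    w = np.zeros(d)
    for _ in range(epochs):
        w_snap = w.copy()
        mu = X.T @ (X @ w_snap - y) / n  # full gradient at the snapshot
        for i in rng.permutation(n):
            # variance-reduced gradient estimate
            w = w - eta * (grad_i(w, i) - grad_i(w_snap, i) + mu)
    return w

eta = 0.02
err_sgd = np.linalg.norm(sgd(eta, 30) - w_star)
err_svrg = np.linalg.norm(svrg(eta, 30) - w_star)
# with a constant stepsize, SVRG ends up much closer to w_star than SGD
```

The point the tutorial would make: SGD with a constant stepsize oscillates in a neighborhood of the optimum whose size scales with the stepsize, whereas SVRG's control variate removes that variance and allows linear convergence at a fixed stepsize.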
-
This is an interesting stochastic optimizer with some nice theoretical guarantees for convex problems. It would be interesting to compare it to the others we have already implemented.
https://papers.nips.c…
-
We should add a dedicated doc page with a table listing all the solver options we allow, and their characteristics.
-
@epapoutsellis and I saw differences between this SVRG implementation and a test implementation of my own. Checking the code:
https://github.com/epapoutsellis/StochasticCIL/blob/f9f67d5adcac215eb51db4e0b9022ec…
-
The theory gives step sizes for which the algorithms are guaranteed to converge (see papers). We need to add an `eta="auto"` option.
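A sketch of what `eta="auto"` could compute for a quadratic/ridge-type objective (the helper name and the exact constant are assumptions, not the library's API): each per-sample gradient is L_i-smooth with L_i = ||x_i||² + λ, and the SVRG analysis prescribes stepsizes on the order of 1/(4·L_max).

```python
import numpy as np

def auto_stepsize(X, reg_strength=0.0):
    """Hypothetical helper behind eta="auto" for a ridge-type problem.

    Each per-sample gradient is L_i-smooth with L_i = ||x_i||^2 + reg,
    and SVRG theory guarantees convergence for stepsizes of order
    1 / (4 * L_max), where L_max is the largest per-sample constant.
    """
    L_max = np.max(np.sum(X**2, axis=1)) + reg_strength
    return 1.0 / (4.0 * L_max)

# row norms squared are 1 and 4, so L_max = 4 and eta = 1/16
eta = auto_stepsize(np.array([[1.0, 0.0], [0.0, 2.0]]))
```

For other losses the smoothness constant would need a different bound, so the real option would presumably dispatch on the model configuration.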
-
What's the speedup of distributed SVRG on word2vec? And can SVRG be used in CNNs? Thanks a lot!
-
Not sure whether we should include these or not. While nice for comprehensiveness, if our ai-SGD is ultimately superior to them, then it would only confuse users to include "inadmissible" estimators as…
-
I'm currently working on neural networks for my Master's thesis and I stumbled upon the optimizer described in this paper:
https://papers.nips.cc/paper/4937-accelerating-stochastic-gradient-descent-u…
-
It would be nice to have an example comparing the speed of convergence of SAGA/SVRG/SFW on problems attaining the same optimum.
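As a starting point for such an example, here is a minimal SAGA loop on a least-squares problem (a sketch with illustrative constants; the SVRG/SFW counterparts would slot into the same harness): SAGA stores one gradient per sample and recenters each stochastic step with the running average of that table.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 100, 3
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d)  # noiseless, so the least-squares optimum is exact
w_star = np.linalg.lstsq(X, y, rcond=None)[0]

def grad_i(w, i):
    # gradient of the i-th squared-error term
    return X[i] * (X[i] @ w - y[i])

def saga(eta, steps):
    w = np.zeros(d)
    table = np.stack([grad_i(w, i) for i in range(n)])  # stored per-sample gradients
    avg = table.mean(axis=0)
    for _ in range(steps):
        i = rng.integers(n)
        g = grad_i(w, i)
        w = w - eta * (g - table[i] + avg)  # variance-reduced step
        avg = avg + (g - table[i]) / n      # keep the table average in sync
        table[i] = g
    return w

err = np.linalg.norm(saga(eta=0.01, steps=10_000) - w_star)
```

Unlike SVRG, SAGA needs no full-gradient snapshots, at the cost of O(n·d) memory for the gradient table; plotting suboptimality versus gradient evaluations for all three solvers would make the trade-offs concrete.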