Stochastic Variance Reduction via Accelerated Dual Averaging for Finite-Sum Optimization

URL: http://arxiv.org/abs/2006.10281v1

In this paper, we introduce a simplified and unified method for finite-sum convex optimization, named \emph{Stochastic Variance Reduction via Accelerated Dual Averaging (SVR-ADA)}. In the nonstrongly convex and smooth setting, SVR-ADA can attain an $O\big(\frac{1}{n}\big)$-accurate solution in $n\log\log n$ number of stochastic gradient evaluations, where $n$ is the number of samples; meanwhile, SVR-ADA matches the lower bound of this setting up to a $\log\log n$ factor. In the strongly convex and smooth setting, SVR-ADA matches the lower bound in the regime $n\le O(\kappa)$ while it improves the rate in the regime $n\gg \kappa$ to $O(n\log\log n +\frac{n\log(1/(n\epsilon))}{\log(n/\kappa)})$, where $\kappa$ is the condition number. SVR-ADA improves complexity of the best known methods without use of any additional strategy such as optimal black-box reduction, and it leads to a unified convergence analysis and simplified algorithm for both the nonstrongly convex and strongly convex settings. Through experiments on real datasets, we also show the superior performance of SVR-ADA over existing methods for large-scale machine learning problems.

ikr7 / arxiv-survey

Stochastic Variance Reduction via Accelerated Dual Averaging for Finite-Sum Optimization #15619