Better Models for Stochastic Optimization #7
Robustness is Important:
- "Capacity and Trainability in Recurrent Neural Networks": with enough hyperparameter tuning, common RNN architectures achieve roughly the same per-task performance.
- Energy spent in training? The tuning behind results like this costs energy on the order of driving a Camry from SF to LA.
- And the tuning budget keeps growing: 10 → 100 → 1,000 → 10,000 runs, each scale-up multiplying that Camry trip.
Stochastic Gradient Methods:
- Goal: minimize a function F(x) = E[f(x; S)] given only stochastic gradients of the sampled losses.
- Classical update: x_{k+1} = x_k − α_k ∇f(x_k; S_k).
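A minimal sketch of that update in Python (the sampling scheme and the diminishing stepsize schedule are my assumptions, not from the talk):

```python
import numpy as np

def sgm(grad_f, x0, n_samples, steps, alpha0, seed=0):
    """Plain stochastic gradient method: x_{k+1} = x_k - alpha_k * g_k,
    where g_k = grad_f(x_k, s_k) is a stochastic gradient for a random
    sample index s_k, and alpha_k = alpha0 / sqrt(k + 1)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for k in range(steps):
        s = rng.integers(0, n_samples)
        x = x - (alpha0 / np.sqrt(k + 1)) * grad_f(x, s)
    return x
```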
Weakly Convex Functions:
- f is weakly convex if x ↦ f(x) + (λ/2)‖x‖² is convex for some λ ≥ 0: non-convex, but with enough structure to analyze.
Why do we use SGM?
- easy to analyze
- it is the default in packages
- it works (usually)
Linear Regression:
- [plots: iterations to convergence vs. initial stepsize — sharply U-shaped for SGM, nearly flat for the truncated and prox methods]
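One way to see why the prox method is so stepsize-insensitive: for a single least-squares loss, the proximal step has a closed form whose effective stepsize is bounded no matter how large α is. A sketch of the standard computation (function name mine):

```python
import numpy as np

def prox_step_lsq(a, b, x, alpha):
    """Exact proximal step for f(y) = 0.5 * (a @ y - b)**2:
        argmin_y 0.5 * (a @ y - b)**2 + (1 / (2 * alpha)) * ||y - x||**2.
    The solution damps the gradient step: the effective stepsize
    alpha / (1 + alpha * ||a||^2) never exceeds 1 / ||a||^2,
    so the step cannot overshoot even as alpha -> infinity."""
    r = a @ x - b                                  # current residual
    return x - (alpha / (1.0 + alpha * (a @ a))) * r * a
```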
Optimization Methods:
How to solve optimization problems: build a good but simple local model of f at the current iterate, then minimize that model, regularizing so the step stays where the model can be trusted.
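In symbols, the generic model-based update (standard form; f_{x_k} denotes the local model built at the iterate x_k):

$$x_{k+1} = \mathop{\mathrm{argmin}}_{y} \Big\{ f_{x_k}(y) + \frac{1}{2\alpha_k} \lVert y - x_k \rVert^2 \Big\}$$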
Newton's Method:
- Taylor (second-order) model: f_x(y) = f(x) + ∇f(x)ᵀ(y − x) + ½ (y − x)ᵀ ∇²f(x) (y − x); minimizing it yields the Newton step.
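A sketch of the resulting step (assumes ∇²f(x) is positive definite so the quadratic model has a unique minimizer):

```python
import numpy as np

def newton_step(grad, hess, x):
    """Minimize the second-order Taylor model f(x) + g.d + 0.5 * d'Hd
    over the step d, i.e. solve H d = -g, then move to x + d."""
    g, H = grad(x), hess(x)
    return x + np.linalg.solve(H, -g)
```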
Composite Optimization Problems:
- f(x) = h(c(x)) with h convex and c smooth; such f is generally non-convex but weakly convex.
Modeling Composite Problems:
- Now we make a convex model: linearize only the smooth inner map,
  f_x(y) = h(c(x) + ∇c(x)ᵀ(y − x)).
- Since h is convex and its argument is affine in y, the model is convex and agrees with f locally.
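Plugging this model into the generic regularized update gives the prox-linear step:

$$x_{k+1} = \mathop{\mathrm{argmin}}_{y} \Big\{ h\big(c(x_k) + \nabla c(x_k)^\top (y - x_k)\big) + \frac{1}{2\alpha_k} \lVert y - x_k \rVert^2 \Big\}$$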
Generic Optimization Methods:
- The aProx family for stochastic optimization: sample S_k, then minimize a model of the sampled loss plus proximal regularization,
  x_{k+1} = argmin_y { f_{x_k}(y; S_k) + (1/(2α_k)) ‖y − x_k‖² }.
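One cheap member of the family is the truncated model, which clips the linear model below at zero (sensible when losses are nonnegative); its aProx step has a closed Polyak-style form. A sketch with names of my own choosing:

```python
import numpy as np

def truncated_step(f, grad_f, x, s, alpha):
    """aProx step for the truncated model
        f_x(y; s) = max(f(x; s) + g . (y - x), 0),   g = grad_f(x, s),
    assuming f(., s) >= 0. Minimizing the model plus
    (1 / (2 * alpha)) * ||y - x||^2 gives a gradient step whose
    length is clipped at f(x; s) / ||g||^2."""
    fx, g = f(x, s), grad_f(x, s)
    step = min(alpha, fx / (np.dot(g, g) + 1e-12))
    return x - step * g
```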
Models in Stochastic Optimization:
Conditions on our models (convex case): f_x(·; s) should be convex in y and satisfy
- lower bound: f_x(y; s) ≤ f(y; s) for all y
- local correctness: f_x(x; s) = f(x; s), with the model's subgradients at x being subgradients of f.
(The plain linear model satisfies both when f(·; s) is convex; so does the truncated model when losses are nonnegative.)
Divergence of a gradient method:
- Even on simple problems, SGM diverges once the stepsize passes a threshold, while the model-based methods above stay stable across a huge range of stepsizes.
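An illustrative experiment (my construction, not the talk's): random least squares with a deliberately huge stepsize makes plain SGM blow up, while the truncated step stays stable:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 5))
x_true = rng.normal(size=5)
b = A @ x_true                      # noiseless targets, so the optimum is x_true

f = lambda x, i: 0.5 * (A[i] @ x - b[i]) ** 2
g = lambda x, i: (A[i] @ x - b[i]) * A[i]

alpha = 10.0                        # far above SGM's stability threshold here
x_sgm, x_tr = np.zeros(5), np.zeros(5)
for k in range(50):
    i = rng.integers(0, 100)
    x_sgm = x_sgm - alpha * g(x_sgm, i)
    gi = g(x_tr, i)
    x_tr = x_tr - min(alpha, f(x_tr, i) / (gi @ gi + 1e-12)) * gi

print("SGM error:      ", np.linalg.norm(x_sgm - x_true))   # astronomically large
print("truncated error:", np.linalg.norm(x_tr - x_true))    # small
```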
Conclusion:
- Blind application of SGD is not the right answer.
- Care and better modeling can yield improved performance and computational efficiency.