facebookresearch / dadaptation

D-Adaptation for SGD, Adam and AdaGrad
MIT License

First inequality on page 3 derivation? #18

Closed: mega-optimus closed this issue 1 year ago

mega-optimus commented 1 year ago

Hi!

I have a question about the first inequality on page 3, which is critical because it produces the lower bound on D.

It reads like this:

$$\sum_{k=0}^{n} d_k \bigl(f(x_k) - f_*\bigr) \le D\,\|s_{n+1}\| + \sum_{k=0}^{n} \frac{r_k}{2}\, d_k^2\, \|g_k\|^2 - \frac{r_{n+1}}{2}\, \|s_{n+1}\|^2$$

How is it derived? Or is there a reference that derives it?

Thank you very much!

adefazio commented 1 year ago

This bound is proved as Lemma 5 in Appendix A (page 19 of the PDF). If you have any questions about the proof, please feel free to ask.
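
For readers without the PDF to hand, here is a rough sketch of how a bound of this shape can be derived. It is not a transcription of Lemma 5, and it uses the question's notation, with r_k as the step-size sequence. Everything in the comment header is an assumption made for this sketch: dual-averaging iterates with s_0 = 0, non-negative d_k, a non-increasing and non-negative r_k, and D = ||x_0 - x_*||. The paper's own proof is authoritative.

```latex
% Assumptions for this sketch (not quoted from the paper):
%   s_0 = 0,  s_{k+1} = s_k + d_k g_k,  x_k = x_0 - r_k s_k,
%   g_k \in \partial f(x_k),  d_k \ge 0,  0 \le r_{k+1} \le r_k,
%   D = \|x_0 - x_*\|.
\begin{align*}
\sum_{k=0}^{n} d_k \bigl(f(x_k) - f_*\bigr)
  &\le \sum_{k=0}^{n} d_k \langle g_k,\, x_k - x_* \rangle
     && \text{(convexity, } d_k \ge 0\text{)} \\
  &= \langle s_{n+1},\, x_0 - x_* \rangle
     - \sum_{k=0}^{n} r_k \langle s_{k+1} - s_k,\, s_k \rangle
     && \text{(since } x_k - x_0 = -r_k s_k\text{)} \\
  &\le D\,\|s_{n+1}\|
     + \sum_{k=0}^{n} \frac{r_k}{2}
       \Bigl( d_k^2 \|g_k\|^2 + \|s_k\|^2 - \|s_{k+1}\|^2 \Bigr)
     && \text{(Cauchy--Schwarz; expand } \|s_{k+1}\|^2\text{)} \\
  &\le D\,\|s_{n+1}\|
     + \sum_{k=0}^{n} \frac{r_k}{2}\, d_k^2 \|g_k\|^2
     - \frac{r_{n+1}}{2}\, \|s_{n+1}\|^2
     && \text{(telescope; } s_0 = 0,\ r_k \text{ non-increasing)}
\end{align*}
```

The third line uses the identity -<s_{k+1} - s_k, s_k> = (1/2)(||s_{k+1} - s_k||^2 + ||s_k||^2 - ||s_{k+1}||^2) together with s_{k+1} - s_k = d_k g_k. In the last step, the leftover terms (r_k - r_{k-1})/2 * ||s_k||^2 are non-positive and can be dropped, and r_n is replaced by the smaller r_{n+1}.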

mega-optimus commented 1 year ago

> This bound is proved as Lemma 5 in the Appendix A, page 19 of the PDF. If you have any questions about the proof, please feel free to ask.

The proof is well written, thanks!