facebookresearch / dadaptation

D-Adaptation for SGD, Adam and AdaGrad
MIT License

Learning rates #29

Closed RealBeyondMaster closed 1 year ago

RealBeyondMaster commented 1 year ago

Hello, I am using the Kohya_ss scripts to train LoRAs and LoCons for Stable Diffusion, and I can't get DAdaptAdaGrad to work. With my settings it learns nothing. I also tried learning rates of 1, 0.3, etc. Nothing. What settings should I use for the learning rate, text encoder learning rate, and U-Net learning rate? Do you know of any that work for image training? Thank you.

adefazio commented 1 year ago

I would recommend using DAdaptAdam instead. AdaGrad is not effective on many problems, and the D-Adaptation variant of it generally adapts the learning rate much more slowly than DAdaptAdam does.
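For reference, a minimal sketch of how this might look in a Kohya_ss invocation. This is an illustrative config fragment, not a verified recipe: the `--optimizer_type` and learning-rate flags exist in the sd-scripts trainers, but the exact script name, paths, and `--optimizer_args` values here are assumptions you should adapt to your setup. D-Adaptation optimizers estimate the step size themselves, so the learning rates are typically left at 1.0:

```shell
# Hypothetical Kohya_ss / sd-scripts call (paths and dataset args elided).
# With D-Adaptation, lr acts as a multiplier on the estimated step size,
# so 1.0 is the usual starting point for all three rates.
accelerate launch train_network.py \
  --optimizer_type="DAdaptAdam" \
  --learning_rate=1.0 \
  --unet_lr=1.0 \
  --text_encoder_lr=1.0 \
  --lr_scheduler="constant" \
  --optimizer_args "decouple=True" "weight_decay=0.01"  # assumed extras; tune or drop
```

A constant scheduler is the common choice here, since a decaying schedule would fight the optimizer's own step-size adaptation.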

RealBeyondMaster commented 1 year ago

Thank you, I will retry it with DAdaptAdam.