kohya-ss / sd-scripts


New ScheduleFree support for Flux #1600

Closed sdbds closed 2 months ago

sdbds commented 2 months ago


What's new? Testing found that Flux training works well with AdamWScheduleFree.

1. Add the schedulefree package (with a version pin) to requirements.txt and add the existing ScheduleFree optimizers such as AdamWScheduleFree and SGDScheduleFree. They can be used directly, e.g. --optimizer_type=AdamWschedulefree

2. (Experimental) Add ScheduleFreeWrapper for wrapping any base torch.optim optimizer such as Adam, RAdam, SparseAdam, etc.

Use --optimizer_schedulefree_wrapper to enable the wrapper, and --schedulefree_wrapper_args to pass extra ScheduleFree arguments.
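For reference, direct use of these ScheduleFree optimizers looks roughly like this; a minimal sketch based on the upstream schedulefree README (the optimizer.train() / optimizer.eval() switching is that library's usage requirement, which the training script has to handle):

```python
import torch
import schedulefree  # pip install schedulefree

model = torch.nn.Linear(16, 16)
optimizer = schedulefree.AdamWScheduleFree(model.parameters(), lr=1e-3)

# ScheduleFree optimizers replace the LR scheduler; they must be switched
# to train mode before optimization steps...
optimizer.train()
for _ in range(10):
    optimizer.zero_grad()
    loss = model(torch.randn(4, 16)).pow(2).mean()
    loss.backward()
    optimizer.step()

# ...and to eval mode before evaluation or saving, so the averaged
# (schedule-free) weights are the ones that get used.
optimizer.eval()
```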

Supported base optimizers (from torch.optim):

from .adadelta import Adadelta
from .adagrad import Adagrad
from .adam import Adam
from .adamax import Adamax
from .adamw import AdamW
from .asgd import ASGD
from .lbfgs import LBFGS
from .nadam import NAdam
from .optimizer import Optimizer
from .radam import RAdam
from .rmsprop import RMSprop
from .rprop import Rprop
from .sgd import SGD
from .sparse_adam import SparseAdam

example: --optimizer_type="RAdam" --optimizer_schedulefree_wrapper --schedulefree_wrapper_args momentum=0.9 weight_decay_at_y=0.1
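The example above corresponds roughly to the following upstream API; this is a sketch assuming the schedulefree README, so the exact way the PR forwards the --schedulefree_wrapper_args key=value pairs may differ:

```python
import torch
import schedulefree

model = torch.nn.Linear(16, 16)

# Any of the torch.optim optimizers listed above can serve as the base optimizer.
base_optimizer = torch.optim.RAdam(model.parameters(), lr=1e-3)

# Extra wrapper arguments mirror --schedulefree_wrapper_args momentum=0.9 weight_decay_at_y=0.1
optimizer = schedulefree.ScheduleFreeWrapper(
    base_optimizer, momentum=0.9, weight_decay_at_y=0.1
)

optimizer.train()
optimizer.zero_grad()
model(torch.randn(4, 16)).pow(2).mean().backward()
optimizer.step()
optimizer.eval()
```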

kohya-ss commented 2 months ago

Thank you. I'm sorry I couldn't merge the last PR. I'll work on this sooner this time.

kohya-ss commented 2 months ago

After merging, I noticed that the wrapper doesn't seem to work. I get the error `TypeError: ScheduleFreeWrapper is not an Optimizer`.

I'll try to fix it, but if you know anything about it, please let me know.
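For what it's worth, that message matches the isinstance check PyTorch's LR schedulers perform, so one possible cause (an assumption, not verified against this code) is that the installed schedulefree version's ScheduleFreeWrapper does not subclass torch.optim.Optimizer:

```python
import torch
import schedulefree

model = torch.nn.Linear(4, 4)
base = torch.optim.RAdam(model.parameters(), lr=1e-3)
optimizer = schedulefree.ScheduleFreeWrapper(base, momentum=0.9)

# If the wrapper is not an Optimizer subclass in the installed version,
# this prints False, and passing it to any torch.optim.lr_scheduler class
# raises: TypeError: ScheduleFreeWrapper is not an Optimizer
print(isinstance(optimizer, torch.optim.Optimizer))

# For example (commented out so the sketch runs on any version):
# torch.optim.lr_scheduler.ConstantLR(optimizer)
```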