Linaqruf / kohya-trainer

Adapted from https://note.com/kohya_ss/n/nbf7ce8d80f29 for easier cloning
Apache License 2.0

Prodigy optimizer_type #277

Open teebarjunk opened 1 year ago

teebarjunk commented 1 year ago

I see #242 says it's implemented, but GitHub search can't find any evidence of it.

Tried installing with `!pip install -U prodigyopt` and manually entering `Prodigy` as the `optimizer_type`, but it didn't work.
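
For anyone debugging the same thing, a quick sanity check that the package at least installed and is importable in the runtime:

```python
# Minimal check that prodigyopt is importable in the current runtime.
from prodigyopt import Prodigy
print(Prodigy)  # should print the optimizer class, not raise ImportError
```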

Anyone had success?

Reviem0 commented 1 year ago

Tried this as well, no success here.

MetroByte commented 1 year ago

If this is relevant for you: for the "kohya-trainer" Google Colab:

  1. Add a code block (at the top, for example) with the code `!pip install -U prodigyopt`.
  2. Proceed with the setup as usual up to 5.3.
  3. You have to change your settings a bit in 5.3-5.5:
    • In `optimizer_args`, set the following (you can change these if you know what you are doing): `["decouple=True", "weight_decay=0.01", "d_coef=2", "use_bias_correction=True", "safeguard_warmup=True"]`
    • Set `unet_lr` and `text_encoder_lr` to 1 (important: Prodigy expects a learning rate of 1).
  4. Then you have to change your config files in the LoRA/config folder, i.e. in config_file.toml (a sketch follows below).
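
The comment leaves the exact TOML edit implicit; as a minimal sketch, assuming section and key names of the kind the Colab's generated config typically uses (verify them against your own file), the relevant part would look something like:

```toml
# Sketch only: section/key names are assumed from typical kohya-trainer
# configs; verify against the config_file.toml the Colab generates.
[optimizer_arguments]
optimizer_type = "Prodigy"
learning_rate = 1.0   # Prodigy is designed to run at lr = 1
optimizer_args = [ "decouple=True", "weight_decay=0.01", "d_coef=2", "use_bias_correction=True", "safeguard_warmup=True" ]
```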

Then you have to change kohya-trainer/train_network.py a bit:

  1. Near the top of the file (around line 12-13), add `from prodigyopt import Prodigy`.
  2. Find the line containing the text `optimizer_name, optimizer_args, optimizer = …` (around line 224-227) and comment it out with `#`.
  3. Add the following three lines below the commented-out line:

```python
optimizer_name = "Prodigy"
optimizer_args = "decouple=True, weight_decay=0.01, d_coef=2, use_bias_correction=True, safeguard_warmup=True"
optimizer = Prodigy(trainable_params, decouple=True, weight_decay=0.01, d_coef=2, use_bias_correction=True, safeguard_warmup=True)
```

(the first two are only used for metadata printing, though)
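
If you'd rather not repeat the settings, a small hypothetical refactor (not part of the original patch) keeps them in one dict, so the printed metadata always matches what the optimizer actually receives:

```python
# Hypothetical variant: define the Prodigy kwargs once and reuse them for
# both the metadata string and the constructor, so they can't drift apart.
prodigy_kwargs = dict(
    decouple=True,
    weight_decay=0.01,
    d_coef=2,
    use_bias_correction=True,
    safeguard_warmup=True,
)
optimizer_name = "Prodigy"
optimizer_args = ", ".join(f"{k}={v}" for k, v in prodigy_kwargs.items())
optimizer = Prodigy(trainable_params, **prodigy_kwargs)
```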

Then training should run fine (I hope)

If you re-run steps 5.2-5.4, the config file will be replaced, so it's better to edit the config files directly.

I tested it, but I get overfitting results very early; I don't know how to set Prodigy up correctly…

teebarjunk commented 1 year ago

> I tested it, but I get overfitting results very early; I don't know how to set Prodigy up correctly…

This guide says to use more epochs than repeats, and its author reports good results.

teebarjunk commented 1 year ago

To whom it may concern: Prodigy works nicely and fast, but it can easily overcook. So epochs > repeats seems best.

In section 5.2:

  • Set `dataset_repeats` to 2.

In section 5.4:

  • Set `num_epochs` to something higher, like 100.
  • Set `save_n_epochs_type_value` to 10.
  • Set `optimizer_args` to `["decouple=True", "weight_decay=0.01", "d_coef=2", "use_bias_correction=True", "safeguard_warmup=False", "betas=0.9,0.999"]`
  • In the code for 5.4, look for `sample_every_n_epochs` and set it to something like 10, so you don't waste time generating previews every single epoch.
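
For reference, these settings correspond to constructing the optimizer directly, roughly as below. This is a sketch: `trainable_params` stands for whatever parameter groups the training script builds, and the `betas=0.9,0.999` string becomes a tuple in prodigyopt's actual API:

```python
from prodigyopt import Prodigy

# Sketch of the equivalent direct construction with the settings above.
optimizer = Prodigy(
    trainable_params,          # parameter list/groups built by the training script
    lr=1.0,                    # Prodigy is meant to run at lr = 1
    betas=(0.9, 0.999),        # the "betas=0.9,0.999" string, as a tuple
    weight_decay=0.01,
    d_coef=2,
    decouple=True,
    use_bias_correction=True,
    safeguard_warmup=False,    # note: False here, unlike the earlier comment
)
```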

I also found this guide, which is worth keeping an eye on.