bmaltais / kohya_ss

Apache License 2.0

[BUG] Can't Resume Training if "DoRA Weight Decompose" is checked #2705

Open Mast3rBlast3r opened 1 month ago

Mast3rBlast3r commented 1 month ago

I tested the LyCORIS/LoCon preset for SDXL and it works fine, but if the "DoRA Weight Decompose" box is checked (with or without extra algorithms), training will not resume. I tried the default preset with 'Gradient checkpointing' and 'Memory efficient attention' checked, with Mixed precision and Save precision both set to bf16, and again with both set to fp16, and nothing helped. I can't get training to continue, and I don't know what the problem is. Has anyone been able to resume training with the "DoRA Weight Decompose" box checked? [Screenshots attached: train1, train2]

v0xie commented 1 month ago

I'm pretty sure you can change line 4209 of sd-scripts/library/train_util.py to accelerator.load_state(args.resume, strict=False) and it'll work.
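For anyone applying this by hand, here is a minimal sketch of what the changed call looks like in context. The helper name `resume_training_state` is hypothetical and is not the actual function in train_util.py, and the line number may differ between sd-scripts versions. The sketch assumes that Accelerate's `load_state` forwards extra keyword arguments such as `strict` to the underlying model state loading, which is what the suggested edit relies on.

```python
def resume_training_state(accelerator, resume_dir, strict=False):
    """Hypothetical helper: restore a saved training state before resuming.

    `accelerator` is expected to be an accelerate.Accelerator instance.
    `resume_dir` is the saved state directory (args.resume in the thread).
    """
    if not resume_dir:
        # Nothing to resume from, so training starts fresh.
        return False

    # Assumption: extra keyword arguments are passed through to the model
    # loading call. With strict=False, keys that are missing from or
    # unexpected in the saved state are tolerated instead of raising.
    accelerator.load_state(resume_dir, strict=strict)
    return True
```

Note that non-strict loading skips mismatched keys rather than reporting them, so it is worth checking that the resumed run behaves as expected (for example, that the loss continues from roughly where it left off).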

Mast3rBlast3r commented 1 month ago

I changed that line in sd-scripts/library/train_util.py and tested it. It works now, problem solved. Thank you.