kohya-ss / sd-scripts

Apache License 2.0
Error when trying to use --blockwise_fused_optimizers with SD3.5 L finetune training #1755

Closed. Ice-YY closed this issue 1 week ago.

Ice-YY commented 3 weeks ago

I encountered an error when using the --blockwise_fused_optimizers option, whereas --fused_backward_pass ran without any issues. What could be the cause? Thank you. Here is the error message:

INFO     using 41 optimizers for blockwise fused optimizers     sd3_train.py:493
override steps. steps for 10 epochs is / 指定エポックまでのステップ数: 280
Traceback (most recent call last):
  File "/AI/AI_paint/kohya_ss/sd-scripts/sd3_train.py", line 1217, in <module>
    train(args)
  File "/AI/AI_paint/kohya_ss/sd-scripts/sd3_train.py", line 746, in train
    block_type, block_idx = block_types_and_indices[opt_idx]
IndexError: list index out of range
Traceback (most recent call last):
  File "/AI/AI_paint/kohya_ss/venv/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/AI/AI_paint/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 48, in main
    args.func(args)
  File "/AI/AI_paint/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1106, in launch_command
    simple_launcher(args)
  File "/AI/AI_paint/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 704, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
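
Note: the IndexError means that opt_idx ran past the end of block_types_and_indices, i.e. there were more optimizers than metadata entries. The sketch below only illustrates that failure mode. It is not the code from sd3_train.py: label_optimizers is a hypothetical helper, and the count of 38 metadata entries is invented for the example (only the 41 comes from the log).

```python
# Hypothetical illustration; NOT the actual sd3_train.py code.
def label_optimizers(num_optimizers, block_types_and_indices):
    """Look up (block_type, block_idx) for each optimizer index."""
    labels = []
    for opt_idx in range(num_optimizers):
        # Raises IndexError as soon as opt_idx >= len(block_types_and_indices).
        block_type, block_idx = block_types_and_indices[opt_idx]
        labels.append(f"{block_type}-{block_idx}")
    return labels


# 41 optimizers (from the log) but, as an assumption, only 38 metadata entries.
block_types_and_indices = [("joint", i) for i in range(38)]

try:
    label_optimizers(41, block_types_and_indices)
except IndexError as exc:
    error_message = str(exc)
    print(f"IndexError: {error_message}")
```

If the two lists are built in different places, checking that their lengths match before the loop turns this kind of crash into a clearer error.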

kohya-ss commented 2 weeks ago

--blockwise_fused_optimizers is not tested yet. We recommend --fused_backward_pass for now.

kohya-ss commented 1 week ago

--blockwise_fused_optimizers works on the sd3 branch, but if --full_bf16 is enabled, we recommend using --fused_backward_pass due to stochastic rounding.
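
Note for readers unfamiliar with the term: stochastic rounding matters when weights are stored in bf16, because a small update can be lost entirely when the result is rounded back to bf16. The sketch below is a generic, pure-Python illustration of the idea and is not taken from sd-scripts. All function names are hypothetical, and plain truncation stands in for deterministic rounding.

```python
import random
import struct


def float_to_bits(x: float) -> int:
    """Return the float32 bit pattern of x as an unsigned 32-bit int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(bits: int) -> float:
    """Interpret an unsigned 32-bit int as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def truncate_to_bf16(x: float) -> float:
    # bfloat16 keeps only the top 16 bits of a float32.
    return bits_to_float(float_to_bits(x) & 0xFFFF0000)


def stochastic_round_to_bf16(x: float, rng: random.Random) -> float:
    # Add uniform noise to the 16 bits about to be dropped, then truncate.
    # The value rounds up with probability proportional to the dropped part.
    noisy = float_to_bits(x) + rng.randrange(1 << 16)
    return bits_to_float(noisy & 0xFFFF0000)


def apply_updates(round_fn, steps: int = 1000, update: float = 1e-3) -> float:
    """Repeatedly add a small update to a bf16-stored weight starting at 1.0."""
    weight = 1.0
    for _ in range(steps):
        weight = round_fn(weight + update)
    return weight


rng = random.Random(0)
truncated = apply_updates(truncate_to_bf16)
stochastic = apply_updates(lambda x: stochastic_round_to_bf16(x, rng))
print(truncated)   # 1.0: each update is below the bf16 spacing near 1.0 and is lost
print(stochastic)  # near 2.0 in expectation, because the rounding is unbiased
```

The deterministic version never moves the weight, while the stochastic version tracks the true sum of 1000 updates of 0.001 on average.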