Open wuliang19869312 opened 1 month ago
Strange, because I always see the --highvram parameter in your script call. Is that correct if you chose the 12G option? Maybe it is, but I doubt it.
The option I selected was 12G. I changed the parameters a little myself to make it work, but I just found out that the whole thing no longer runs:
Traceback (most recent call last):
File "D:\fluxgym\app.py", line 17, in
import train_network
ModuleNotFoundError: No module named 'train_network'
same error here.
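For anyone hitting the same ModuleNotFoundError, here is a minimal diagnostic sketch. It assumes that train_network.py is expected to live in an sd-scripts folder inside the fluxgym directory (the later log in this thread shows D:\fluxgym\sd-scripts\flux_train_network.py, so that layout seems plausible). The function name find_training_module and its defaults are hypothetical, not part of fluxgym.

```python
from pathlib import Path
from typing import Optional


def find_training_module(app_dir: str,
                         module: str = "train_network",
                         subdir: str = "sd-scripts") -> Optional[Path]:
    """Return the path to <app_dir>/<subdir>/<module>.py, or None if it is missing.

    Assumption: the app imports the module from a checkout of sd-scripts
    placed directly inside the app directory.
    """
    candidate = Path(app_dir) / subdir / f"{module}.py"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    # Path taken from the traceback above; adjust to your own install.
    found = find_training_module(r"D:\fluxgym")
    if found is None:
        print("train_network.py not found; the sd-scripts folder may be missing or incomplete")
    else:
        print(f"found {found}")
```

If the file is reported missing, the sd-scripts checkout is the first thing to look at before touching any training parameters.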
Did you fix it on your end? Yesterday it was fine, but today this problem is happening again.
[2024-09-19 00:07:03] [INFO] D:\fluxgym\datasets\fuji
[2024-09-19 00:07:03] [INFO] contains 15 image files
[2024-09-19 00:07:03] [INFO] Traceback (most recent call last):
[2024-09-19 00:07:03] [INFO] File "D:\fluxgym\sd-scripts\flux_train_network.py", line 519, in
@byteconcepts fluxgym passes the --highvram parameter no matter which VRAM size is selected. That doesn't make much sense, given that the flag explicitly disables the optimization for low-VRAM environments, according to https://github.com/kohya-ss/sd-scripts/releases :
An option --highvram to disable the optimization for environments with little VRAM is added to the training scripts. If you specify it when there is enough VRAM, the operation will be faster.
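To illustrate what I mean, here is a rough sketch of passing the flag only when the selected VRAM size is large. This is not fluxgym's actual code: the helper names and the 20 GB threshold are assumptions made up for illustration, and the base arguments are just a few of the ones visible in the logged command.

```python
from typing import List


def memory_flags(vram_gb: int, highvram_threshold_gb: int = 20) -> List[str]:
    """Return memory-related flags for the selected VRAM size (hypothetical helper).

    Assumption: --highvram is only worth passing when there is plenty of VRAM,
    since it disables the low-VRAM optimization. The threshold is a guess.
    """
    flags: List[str] = []
    if vram_gb >= highvram_threshold_gb:
        flags.append("--highvram")
    return flags


def build_command(vram_gb: int) -> List[str]:
    """Assemble a (shortened) training command for the given VRAM size."""
    base = ["sd-scripts/flux_train_network.py", "--fp8_base", "--gradient_checkpointing"]
    return base + memory_flags(vram_gb)


if __name__ == "__main__":
    print(build_command(12))  # no --highvram for the 12G option
    print(build_command(24))  # --highvram included
```

With something like this, the 12G option would leave the low-VRAM optimization enabled instead of switching it off.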
[2024-09-16 12:53:56] [INFO] torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 80.00 MiB. GPU 0 has a total capacity of 11.99 GiB of which 6.73 GiB is free. Of the allocated memory 4.00 GiB is allocated by PyTorch, and 14.50 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
[2024-09-16 12:53:57] [INFO] Traceback (most recent call last):
[2024-09-16 12:53:57] [INFO] File "", line 198, in _run_module_as_main
[2024-09-16 12:53:57] [INFO] File "", line 88, in _run_code
[2024-09-16 12:53:57] [INFO] File "D:\fluxgym\env\Scripts\accelerate.exe\__main__.py", line 7, in
[2024-09-16 12:53:57] [INFO] File "D:\fluxgym\env\Lib\site-packages\accelerate\commands\accelerate_cli.py", line 48, in main
[2024-09-16 12:53:57] [INFO] args.func(args)
[2024-09-16 12:53:57] [INFO] File "D:\fluxgym\env\Lib\site-packages\accelerate\commands\launch.py", line 1106, in launch_command
[2024-09-16 12:53:57] [INFO] simple_launcher(args)
[2024-09-16 12:53:57] [INFO] File "D:\fluxgym\env\Lib\site-packages\accelerate\commands\launch.py", line 704, in simple_launcher
[2024-09-16 12:53:57] [INFO] raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
[2024-09-16 12:53:57] [INFO] subprocess.CalledProcessError: Command '['D:\fluxgym\env\Scripts\python.exe', 'sd-scripts/flux_train_network.py', '--pretrained_model_name_or_path', 'D:\fluxgym\models\unet\flux1-dev.sft', '--clip_l', 'D:\fluxgym\models\clip\clip_l.safetensors', '--t5xxl', 'D:\fluxgym\models\clip\t5xxl_fp16.safetensors', '--ae', 'D:\fluxgym\models\vae\ae.sft', '--cache_latents_to_disk', '--save_model_as', 'safetensors', '--sdpa', '--persistent_data_loader_workers', '--max_data_loader_n_workers', '2', '--seed', '42', '--gradient_checkpointing', '--mixed_precision', 'bf16', '--save_precision', 'bf16', '--network_module', 'networks.lora_flux', '--network_dim', '4', '--optimizer_type', 'adafactor', '--optimizer_args', 'relative_step=False', 'scale_parameter=False', 'warmup_init=False', '--split_mode', '--network_args', 'train_blocks=single', '--lr_scheduler', 'constant_with_warmup', '--max_grad_norm', '0.0', '--learning_rate', '8e-4', '--cache_text_encoder_outputs', '--cache_text_encoder_outputs_to_disk', '--fp8_base', '--highvram', '--max_train_epochs', '8', '--save_every_n_epochs', '4', '--dataset_config', 'D:\fluxgym\dataset.toml', '--output_dir', 'D:\fluxgym\outputs', '--output_name', 'ds-lora', '--timestep_sampling', 'shift', '--discrete_flow_shift', '3.1582', '--model_prediction_type', 'raw', '--guidance_scale', '1', '--loss_type', 'l2']' returned non-zero exit status 1.
[2024-09-16 12:53:58] [ERROR] Command exited with code 1
[2024-09-16 12:53:58] [INFO] Runner: