Closed sashaok123 closed 8 months ago
Hmm... Looks like Kohya's latest code update might have an issue. Revert to the previous release until this is fixed. Sorry about that.
I tested the latest release and I have no problem training with it. I don't understand what is happening with your version. Have you tried reverting to the previous release to see if training works again?
The error seems to be raised by Pillow. If I were a betting man, I would say that D:\AI\LoRA\works\BetterCallSaul\img\50_BetterCallSaul\BetterCallSaul (1).jpg is corrupted.
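A quick way to look for other suspect files before starting a run is to compare each file's first bytes against well-known image signatures. The sketch below is a rough, stdlib-only pre-check and not what Pillow does internally: the function names are hypothetical, it assumes the folder only holds JPEG, PNG, GIF, BMP or WebP files, and it will not catch a file that is truncated after a valid header.

```python
from pathlib import Path

# Leading "magic bytes" of a few common image formats.
# Assumption: the training folder only contains these formats.
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
    "gif": b"GIF8",
    "bmp": b"BM",
}
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"}


def sniff_format(path):
    """Return a format name guessed from the file header, or None."""
    with open(path, "rb") as fh:
        head = fh.read(12)
    for name, magic in SIGNATURES.items():
        if head.startswith(magic):
            return name
    # WebP files start with "RIFF", four size bytes, then "WEBP".
    if head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "webp"
    return None


def find_suspect_images(folder):
    """List image-named files whose header matches no known signature."""
    suspects = []
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            if sniff_format(path) is None:
                suspects.append(path)
    return suspects
```

Calling `find_suspect_images` on the training image folder returns the paths worth opening by hand. A file that passes this check can still be rejected by Pillow, so treat an empty result as "nothing obviously wrong" rather than proof that every image is valid.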
Yeah, I updated and am getting great performance. Training at 512,640 with all default learning rates as a test, I get 5-6 it/sec even with a batch size of 1. The only annoying thing is the Triton error every time it saves after each epoch, but I can bear with that when it's 2x the speed. I did the upgrade and replaced the cuDNN libs with 8.9. I'm on a 4090.
Yes, the problem was in the images. I changed the folder, but now there is another error:
CUDA SETUP: Loading binary D:\AI\kohya_ss\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda116.dll...
use 8-bit AdamW optimizer | {}
running training / 学習開始
num train images * repeats / 学習画像の数×繰り返し回数: 2810
num reg images / 正則化画像の数: 0
num batches per epoch / 1epochのバッチ数: 2810
num epochs / epoch数: 1
batch size per device / バッチサイズ: 1
gradient accumulation steps / 勾配を合計するステップ数 = 1
total optimization steps / 学習ステップ数: 2810
steps: 0%| | 0/2810 [00:00<?, ?it/s]epoch 1/1
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
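The step counts in this thread's logs can be reproduced with simple arithmetic. The sketch below assumes the step count is just images × repeats, divided by the batch size and multiplied by the number of epochs; `estimate_steps` is a hypothetical helper, and it ignores any per-bucket rounding the trainer may apply, so read it as an estimate rather than the trainer's own formula.

```python
import math


def estimate_steps(num_images, repeats, batch_size, epochs=1, grad_accum=1):
    """Rough optimizer-step estimate; ignores per-bucket rounding."""
    images_per_epoch = num_images * repeats
    batches_per_epoch = math.ceil(images_per_epoch / batch_size)
    return math.ceil(batches_per_epoch / grad_accum) * epochs


# 281 images in a folder with 10 repeats, as reported in the logs here.
print(estimate_steps(281, 10, batch_size=1))  # 2810
print(estimate_steps(281, 10, batch_size=2))  # 1405
```

Both values match the logs quoted in this issue: 2810 total optimization steps at batch size 1, and `max_train_steps = 1405` at batch size 2.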
In general, problems arise when training SD 1.5, but there is no such error on SD 2.1.
Traceback (most recent call last):
File "D:\AI\kohya_ss\train_network.py", line 773, in <module>
The error you are encountering is an UnboundLocalError, which means that a local variable is being referenced before it has been assigned a value. In your case, the error occurs in the 'train_util.py' file at line 2799:
text_encoder = pipe.text_encoder
The variable 'pipe' is being referenced before it has been assigned a value. To fix this error, you need to ensure that 'pipe' is defined before it is used in the code. You may need to look through the 'train_util.py' file and make sure the 'pipe' variable is correctly defined and assigned a value before line 2799.
Additionally, it seems that the 'train_network.py' script is being called by an 'accelerate.exe' script, and the subprocess is returning a non-zero exit status. This indicates that there is an error occurring in the 'train_network.py' script execution. Fixing the UnboundLocalError might resolve this issue, but if it persists, you'll need to debug the 'train_network.py' script further.
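To make the failure mode concrete, here is a minimal sketch of how this kind of UnboundLocalError can arise. It assumes the assignment to `pipe` sits inside a `try` block whose `except` branch only prints a message, which would be consistent with the "model is not found" line appearing just before the traceback. All function names below are hypothetical and are not taken from `train_util.py`.

```python
def load_model_buggy(loader):
    """Swallow the loading error, then use a name that was never bound."""
    try:
        pipe = loader()
    except EnvironmentError as err:
        print(f"model is not found: {err}")
    # If loader() raised, `pipe` was never assigned, so this line
    # raises UnboundLocalError instead of the original error.
    return pipe


def load_model_safer(loader):
    """Re-raise with context so the real cause stays visible."""
    try:
        pipe = loader()
    except EnvironmentError as err:
        raise RuntimeError(f"could not load model: {err}") from err
    return pipe


def failing_loader():
    """Stand-in for a download or file read that fails."""
    raise EnvironmentError("download interrupted")
```

Calling `load_model_buggy(failing_loader)` prints the message and then fails with UnboundLocalError, while `load_model_safer(failing_loader)` stops at the loading failure itself. In other words, the UnboundLocalError is a symptom; the thing to fix is whatever made the load fail.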
Weird, I haven't come across any errors yet except for the missing Triton, which can be ignored. I'm doing all my training on 1.5 models.
And I train on two models at once, because people on Civitai prefer SD 1.5, and I prefer SD 2.1.
Validating that requirements are satisfied.
All requirements satisfied.
Load CSS...
Running on local URL: http://127.0.0.1:7860
To create a public link, set share=True in launch().
Loading config...
Folder 10_DeathStranding: 281 images found
Folder 10_DeathStranding: 2810 steps
max_train_steps = 1405
stop_text_encoder_training = 0
lr_warmup_steps = 0
accelerate launch --num_cpu_threads_per_process=2 "train_network.py" --enable_bucket --pretrained_model_name_or_path="D:/AI/stable-diffusion-webui-Torch2/models/Stable-diffusion/1.5_SD1.5_Base.safetensors" --train_data_dir="D:/AI/kohya_ss/works/DeathStranding/img" --resolution=512,512 --output_dir="D:/AI/kohya_ss/works/DeathStranding/model" --logging_dir="D:/AI/kohya_ss/works/DeathStranding/log" --network_alpha="256" --save_model_as=safetensors --network_module=networks.lora --text_encoder_lr=5e-5 --unet_lr=0.0001 --network_dim=256 --output_name="15DS" --lr_scheduler_num_cycles="1" --learning_rate="0.0001" --lr_scheduler="linear" --train_batch_size="2" --max_train_steps="1405" --save_every_n_epochs="1" --mixed_precision="fp16" --save_precision="fp16" --caption_extension=".txt" --cache_latents --optimizer_type="AdamW8bit" --clip_skip=2 --bucket_reso_steps=64 --xformers --bucket_no_upscale
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
prepare tokenizer
Downloading (…)olve/main/vocab.json: 100%|██████████████████████████████████████████| 961k/961k [00:00<00:00, 1.40MB/s]
Downloading (…)olve/main/merges.txt: 100%|██████████████████████████████████████████| 525k/525k [00:00<00:00, 1.01MB/s]
Downloading (…)cial_tokens_map.json: 100%|████████████████████████████████████████████████████| 389/389 [00:00<?, ?B/s]
Downloading (…)okenizer_config.json: 100%|█████████████████████████████████████████████| 905/905 [00:00<00:00, 904kB/s]
Use DreamBooth method.
prepare images.
found directory D:\AI\kohya_ss\works\DeathStranding\img\10_DeathStranding contains 281 image files
2810 train images with repeating.
0 reg images.
no regularization images / 正則化画像が見つかりませんでした
[Dataset 0]
batch_size: 2
resolution: (512, 512)
enable_bucket: True
min_bucket_reso: 256
max_bucket_reso: 1024
bucket_reso_steps: 64
bucket_no_upscale: True
[Subset 0 of Dataset 0]
image_dir: "D:\AI\kohya_ss\works\DeathStranding\img\10_DeathStranding"
image_count: 281
num_repeats: 10
shuffle_caption: False
keep_tokens: 0
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
token_warmup_min: 1,
token_warmup_step: 0,
is_reg: False
class_tokens: DeathStranding
caption_extension: .txt
[Dataset 0]
loading image sizes.
100%|██████████████████████████████████████████████████████████████████████████████| 281/281 [00:00<00:00, 9062.80it/s]
make buckets
min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます
number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)
bucket 0: resolution (384, 576), count: 20
bucket 1: resolution (512, 320), count: 20
bucket 2: resolution (512, 448), count: 10
bucket 3: resolution (512, 512), count: 10
bucket 4: resolution (576, 320), count: 10
bucket 5: resolution (576, 384), count: 40
bucket 6: resolution (640, 384), count: 2680
bucket 7: resolution (704, 320), count: 10
bucket 8: resolution (768, 320), count: 10
mean ar error (without repeats): 0.10572933173203725
prepare accelerator
Using accelerator 0.15.0 or above.
loading model for process 0/1
load Diffusers pretrained models
model is not found as a file or in Hugging Face, perhaps file name is wrong? / 指定したモデル名のファイル、またはHugging Faceのモデルが見つかりません。ファイル名が誤っているかもしれません: D:/AI/stable-diffusion-webui-Torch2/models/Stable-diffusion/1.5_SD1.5_Base.safetensors
Traceback (most recent call last):
File "D:\AI\kohya_ss\train_network.py", line 773, in <module>
accidentally closed
You could try to install torch 1.12.1 again and see if it makes a difference. Kohya does not support torch 2, and my adding support for it at installation time is experimental.
You could try to install torch 1.13.1 again and see if it makes a difference. Kohya does not support torch 2, and my adding support for it at installation time is experimental.
Yes, it really is. But learning on Torch 2 in SD 2.1 is much faster!
@sashaok123, sorry to pop in on a different topic, but it seems like you've had success training LoRAs on SD 2.1 while using FP16. I (and several others) have been getting Loss=nan during LoRA training when using FP16 (https://github.com/kohya-ss/sd-scripts/issues/385).
Can you please guide us on how to do it?
Turn off 8-bit Adam.
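For anyone launching from a saved command line, that change amounts to swapping the `--optimizer_type="AdamW8bit"` argument seen in the commands in this thread. The helper below is a hypothetical sketch that edits an argument list; it assumes `AdamW` is an accepted value for `--optimizer_type` in your version of the scripts, so check the script's help output before relying on it.

```python
def disable_8bit_adam(args, replacement="AdamW"):
    """Return a copy of args with the 8-bit AdamW optimizer swapped out.

    `replacement` is an assumption: confirm your training script accepts
    it as a value for --optimizer_type.
    """
    fixed = []
    for arg in args:
        if arg.startswith("--optimizer_type=") and "AdamW8bit" in arg:
            arg = f"--optimizer_type={replacement}"
        fixed.append(arg)
    return fixed


launch_args = ["--mixed_precision=fp16", "--optimizer_type=AdamW8bit", "--xformers"]
print(disable_8bit_adam(launch_args))
```

Other arguments pass through unchanged, so the rest of the training configuration stays as it was.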
model is not found as a file or in Hugging Face, perhaps file name is wrong? / 指定したモデル名のファイル、またはHugging Faceのモデルが見つかりません。ファイル名が誤っているかもしれません: D:/AI/stable-diffusion-webui-Torch2/models/Stable-diffusion/1.5_SD1.5_Base.safetensors
Traceback (most recent call last):
File "D:\AI\kohya_ss\train_network.py", line 773, in <module>
train(args)
File "D:\AI\kohya_ss\train_network.py", line 152, in train
text_encoder, vae, unet, _ = train_util.load_target_model(
File "D:\AI\kohya_ss\library\train_util.py", line 2799, in load_target_model
text_encoder = pipe.text_encoder
UnboundLocalError: local variable 'pipe' referenced before assignment
Traceback (most recent call last):
File "C:\Users\sasha\miniconda3\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\sasha\miniconda3\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "D:\AI\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
File "D:\AI\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
args.func(args)
File "D:\AI\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
simple_launcher(args)
File "D:\AI\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['D:\AI\kohya_ss\venv\Scripts\python.exe', 'train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=D:/AI/stable-diffusion-webui-Torch2/models/Stable-diffusion/1.5_SD1.5_Base.safetensors', '--train_data_dir=D:/AI/kohya_ss/works/DeathStranding/img', '--resolution=512,512', '--output_dir=D:/AI/kohya_ss/works/DeathStranding/model', '--logging_dir=D:/AI/kohya_ss/works/DeathStranding/log', '--network_alpha=256', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=256', '--output_name=15DS', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=linear', '--train_batch_size=2', '--max_train_steps=1405', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--caption_extension=.txt', '--cache_latents', '--optimizer_type=AdamW8bit', '--clip_skip=2', '--bucket_reso_steps=64', '--xformers', '--bucket_no_upscale']' returned non-zero exit status 1.
Looking at the error, it seems like it could not find the base model file. Is the path correct?
WORKAROUND: Instead of choosing the pretrained model from the dropdown in the UI, enter the file path explicitly.
The problem occurs at this line. It tries to define pipe, but it fails.
https://github.com/bmaltais/kohya_ss/blob/63657088f4c35a376dd8a936f53e9b9a3b4b1168/library/train_util.py#L2794
It seems that this line downloads 19 required files, but it somehow fails before the downloads complete.
loading model for process 0/1
load Diffusers pretrained models
text_encoder\model.safetensors not found
Fetching 19 files: 74%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████ | 14/19 [01:10<00:25, 5.05s/it]
model is not found as a file or in Hugging Face, perhaps file name is wrong? / 指定したモデル名のファイル、またはHugging Faceのモデルが見つかりません。ファイル名が誤っているかもしれません: runwayml/stable-diffusion-v1-5
Downloading (…)on_pytorch_model.bin: 50%|████████████████████████████████████████████████████████████████▉ | 1.73G/3.44G [01:09<01:09, 24.7MB/s]
Traceback (most recent call last):
File "C:\kohya_ss\train_network.py", line 773, in <module>
train(args)
File "C:\kohya_ss\train_network.py", line 152, in train
text_encoder, vae, unet, _ = train_util.load_target_model(
File "C:\kohya_ss\library\train_util.py", line 2799, in load_target_model
text_encoder = pipe.text_encoder
UnboundLocalError: local variable 'pipe' referenced before assignment
Specifying the pretrained model path explicitly bypasses this line, because there is an if statement like this:
https://github.com/bmaltais/kohya_ss/blob/63657088f4c35a376dd8a936f53e9b9a3b4b1168/library/train_util.py#L2786-L2787
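The linked lines are not reproduced here. As a rough illustration of the kind of branch being described, the sketch below assumes the decision is an `os.path.isfile` test on the supplied value; `pick_loading_path` and its return labels are hypothetical names, not the repository's code.

```python
import os


def pick_loading_path(name_or_path):
    """Decide how a model reference would be loaded.

    Assumption: an existing file is read as a local checkpoint, and
    anything else is treated as a Hugging Face model name to download.
    """
    if os.path.isfile(name_or_path):
        return "local-checkpoint"
    return "hub-download"


print(pick_loading_path("runwayml/stable-diffusion-v1-5"))  # hub-download
```

Under that assumption, a local path that is mistyped or points to a missing file also falls into the download branch, which would explain seeing "model is not found as a file or in Hugging Face" for a `.safetensors` path.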
To create a public link, set share=True in launch().
Loading config...
Folder 50_BetterCallSaul: 54 images found
Folder 50_BetterCallSaul: 2700 steps
max_train_steps = 2700
stop_text_encoder_training = 0
lr_warmup_steps = 0
accelerate launch --num_cpu_threads_per_process=2 "train_network.py" --v2 --v_parameterization --enable_bucket --pretrained_model_name_or_path="D:/AI/stable-diffusion-webui/models/Stable-diffusion/2.1_SD2.1_768.safetensors" --train_data_dir="D:/AI/LoRA/works/BetterCallSaul/img" --resolution=768,768 --output_dir="D:/AI/LoRA/works/BetterCallSaul/model" --logging_dir="D:/AI/LoRA/works/BetterCallSaul/log" --network_alpha="128" --save_model_as=safetensors --network_module=networks.lora --text_encoder_lr=5e-5 --unet_lr=0.0001 --network_dim=128 --output_name="21BCS" --lr_scheduler_num_cycles="1" --learning_rate="0.0001" --lr_scheduler="linear" --train_batch_size="1" --max_train_steps="2700" --save_every_n_epochs="1" --mixed_precision="fp16" --save_precision="fp16" --caption_extension=".txt" --cache_latents --optimizer_type="AdamW8bit" --clip_skip=2 --bucket_reso_steps=64 --xformers --bucket_no_upscale
v2 with clip_skip will be unexpected / v2でclip_skipを使用することは想定されていません
prepare tokenizer
Use DreamBooth method.
prepare images.
found directory D:\AI\LoRA\works\BetterCallSaul\img\50_BetterCallSaul contains 54 image files
2700 train images with repeating.
0 reg images.
no regularization images / 正則化画像が見つかりませんでした
[Dataset 0]
batch_size: 1
resolution: (768, 768)
enable_bucket: True
min_bucket_reso: 256
max_bucket_reso: 1024
bucket_reso_steps: 64
bucket_no_upscale: True
[Subset 0 of Dataset 0]
image_dir: "D:\AI\LoRA\works\BetterCallSaul\img\50_BetterCallSaul"
image_count: 54
num_repeats: 50
shuffle_caption: False
keep_tokens: 0
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
token_warmup_min: 1,
token_warmup_step: 0,
is_reg: False
class_tokens: BetterCallSaul
caption_extension: .txt
[Dataset 0]
loading image sizes.
0%| | 0/54 [00:00<?, ?it/s]
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ D:\AI\kohya_ss\train_network.py:773 in <module> │
│ │
│ 770 │ args = parser.parse_args() │
│ 771 │ args = train_util.read_config_from_file(args, parser) │
│ 772 │ │
│ ❱ 773 │ train(args) │
│ 774 │
│ │
│ D:\AI\kohya_ss\train_network.py:117 in train │
│ │
│ 114 │ │ │ } │
│ 115 │ │
│ 116 │ blueprint = blueprint_generator.generate(user_config, args, tokenizer=tokenizer) │
│ ❱ 117 │ train_dataset_group = config_util.generate_dataset_group_by_blueprint(blueprint.data │
│ 118 │ │
│ 119 │ current_epoch = Value("i", 0) │
│ 120 │ current_step = Value("i", 0) │
│ │
│ D:\AI\kohya_ss\library\config_util.py:436 in generate_dataset_group_by_blueprint │
│ │
│ 433 seed = random.randint(0, 2**31) # actual seed is seed + epoch_no │
│ 434 for i, dataset in enumerate(datasets): │
│ 435 │ print(f"[Dataset {i}]") │
│ ❱ 436 │ dataset.make_buckets() │
│ 437 │ dataset.set_seed(seed) │
│ 438 │
│ 439 return DatasetGroup(datasets) │
│ │
│ D:\AI\kohya_ss\library\train_util.py:597 in make_buckets │
│ │
│ 594 │ │ print("loading image sizes.") │
│ 595 │ │ for info in tqdm(self.image_data.values()): │
│ 596 │ │ │ if info.image_size is None: │
│ ❱ 597 │ │ │ │ info.image_size = self.get_image_size(info.absolute_path) │
│ 598 │ │ │
│ 599 │ │ if self.enable_bucket: │
│ 600 │ │ │ print("make buckets") │
│ │
│ D:\AI\kohya_ss\library\train_util.py:823 in get_image_size │
│ │
│ 820 │ │ │ │ │ │ info.latents_flipped = latent │
│ 821 │ │
│ 822 │ def get_image_size(self, image_path): │
│ ❱ 823 │ │ image = Image.open(image_path) │
│ 824 │ │ return image.size │
│ 825 │ │
│ 826 │ def load_image_with_face_info(self, subset: BaseSubset, image_path: str): │
│ │
│ C:\Users\sasha\miniconda3\lib\site-packages\PIL\Image.py:3298 in open │
│ │
│ 3295 │ for message in accept_warnings: │
│ 3296 │ │ warnings.warn(message) │
│ 3297 │ msg = "cannot identify image file %r" % (filename if filename else fp) │
│ ❱ 3298 │ raise UnidentifiedImageError(msg) │
│ 3299 │
│ 3300 │
│ 3301 # │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
UnidentifiedImageError: cannot identify image file
'D:\AI\LoRA\works\BetterCallSaul\img\50_BetterCallSaul\BetterCallSaul (1).jpg'
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Users\sasha\miniconda3\lib\runpy.py:196 in _run_module_as_main │
│ │
│ 193 │ main_globals = sys.modules["__main__"].__dict__ │
│ 194 │ if alter_argv: │
│ 195 │ │ sys.argv[0] = mod_spec.origin │
│ ❱ 196 │ return _run_code(code, main_globals, None, │
│ 197 │ │ │ │ │ "__main__", mod_spec) │
│ 198 │
│ 199 def run_module(mod_name, init_globals=None, │
│ │
│ C:\Users\sasha\miniconda3\lib\runpy.py:86 in _run_code │
│ │
│ 83 │ │ │ │ │ loader = loader, │
│ 84 │ │ │ │ │ package = pkg_name, │
│ 85 │ │ │ │ │ spec = mod_spec) │
│ ❱ 86 │ exec(code, run_globals) │
│ 87 │ return run_globals │
│ 88 │
│ 89 def _run_module_code(code, init_globals=None, │
│ │
│ in <module>:7 │
│ │
│ 4 from accelerate.commands.accelerate_cli import main │
│ 5 if __name__ == '__main__': │
│ 6 │ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) │
│ ❱ 7 │ sys.exit(main()) │
│ 8 │
│ │
│ C:\Users\sasha\miniconda3\lib\site-packages\accelerate\commands\accelerate_cli.py:45 in main │
│ │
│ 42 │ │ exit(1) │
│ 43 │ │
│ 44 │ # Run │
│ ❱ 45 │ args.func(args) │
│ 46 │
│ 47 │
│ 48 if __name__ == "__main__": │
│ │
│ C:\Users\sasha\miniconda3\lib\site-packages\accelerate\commands\launch.py:1104 in launch_command │
│ │
│ 1101 │ elif defaults is not None and defaults.compute_environment == ComputeEnvironment.AMA │
│ 1102 │ │ sagemaker_launcher(defaults, args) │
│ 1103 │ else: │
│ ❱ 1104 │ │ simple_launcher(args) │
│ 1105 │
│ 1106 │
│ 1107 def main(): │
│ │
│ C:\Users\sasha\miniconda3\lib\site-packages\accelerate\commands\launch.py:567 in simple_launcher │
│ │
│ 564 │ process = subprocess.Popen(cmd, env=current_env) │
│ 565 │ process.wait() │
│ 566 │ if process.returncode != 0: │
│ ❱ 567 │ │ raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) │
│ 568 │
│ 569 │
│ 570 def multi_gpu_launcher(args): │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['C:\Users\sasha\miniconda3\python.exe', 'train_network.py', '--v2',
'--v_parameterization', '--enable_bucket',
'--pretrained_model_name_or_path=D:/AI/stable-diffusion-webui/models/Stable-diffusion/2.1_SD2.1_768.safetensors',
'--train_data_dir=D:/AI/LoRA/works/BetterCallSaul/img', '--resolution=768,768',
'--output_dir=D:/AI/LoRA/works/BetterCallSaul/model', '--logging_dir=D:/AI/LoRA/works/BetterCallSaul/log',
'--network_alpha=128', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5',
'--unet_lr=0.0001', '--network_dim=128', '--output_name=21BCS', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001',
'--lr_scheduler=linear', '--train_batch_size=1', '--max_train_steps=2700', '--save_every_n_epochs=1',
'--mixed_precision=fp16', '--save_precision=fp16', '--caption_extension=.txt', '--cache_latents',
'--optimizer_type=AdamW8bit', '--clip_skip=2', '--bucket_reso_steps=64', '--xformers', '--bucket_no_upscale']' returned
non-zero exit status 1.
File config: 2.1_LoRA.txt