bmaltais / kohya_ss

Apache License 2.0

returned non-zero exit status 1 #1727

Closed BlinkerHigh closed 8 months ago

BlinkerHigh commented 10 months ago

I keep getting the same error: "raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd) subprocess.CalledProcessError: Command '['d:\kohya_ss\venv\Scripts\python.exe', './sdxl_train_network.py', '--enable_bucket', '--min_bucket_reso=256', '--max_bucket_reso=2048', '--pretrained_model_name_or_path=stabilityai/stable-diffusion-xl-base-1.0', '--train_data_dir=D:/Art/_ML datasets/Style/Destination\img', '--resolution=1024,1024', '--output_dir=D:/Art/_ML datasets/Style/Destination\model', '--logging_dir=D:/Art/_ML datasets/Style/Destination\log', '--network_alpha=1', '--training_comment=trigger: STYLE', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=0.00012', '--unet_lr=0.0012', '--network_dim=128', '--gradient_accumulation_steps=2', '--output_name=STYLE', '--lr_scheduler_num_cycles=1', '--no_half_vae', '--learning_rate=0.0012', '--lr_scheduler=constant', '--train_batch_size=1', '--max_train_steps=58200', '--save_every_n_epochs=1', '--mixed_precision=bf16', '--save_precision=bf16', '--caption_extension=.txt', '--cache_latents', '--cache_latents_to_disk', '--optimizer_type=AdamW', '--optimizer_args', 'scale_parameter=False', 'relative_step=False', 'warmup_init=False', '--max_data_loader_n_workers=0', '--bucket_reso_steps=64', '--gradient_checkpointing', '--xformers', '--bucket_no_upscale', '--noise_offset=0.0']' returned non-zero exit status 1."

Any idea how to solve this issue?
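A note on reading this message: `subprocess.CalledProcessError` is raised by the parent process when a child process exits with a non-zero status. It names the command and the exit status but not the reason, so the actual cause is usually printed earlier in the console output. Below is a minimal sketch of that behavior. `CHILD_CODE` and `run_and_report` are hypothetical names used for illustration only and are not part of kohya_ss; the child command is a stand-in for a training script that crashes.

```python
import subprocess
import sys

# Hypothetical stand-in for a training script that fails:
# it writes its real error to stderr and exits with status 1.
CHILD_CODE = "import sys; sys.stderr.write('AssertionError: real cause\\n'); sys.exit(1)"


def run_and_report(cmd):
    """Run cmd and return (exit_status, stderr_text) instead of raising."""
    try:
        subprocess.run(cmd, check=True, capture_output=True, text=True)
    except subprocess.CalledProcessError as err:
        # The exception only carries the command and the exit status;
        # the child's own message lives in the captured stderr.
        return err.returncode, err.stderr
    return 0, ""


status, stderr_text = run_and_report([sys.executable, "-c", CHILD_CODE])
print(status)
print(stderr_text.strip())
```

In practice this means the lines printed before the `CalledProcessError` traceback are the ones worth posting when asking for help.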

Feeling-z commented 10 months ago

+1, same error. Help! Thanks!

Feeling-z commented 10 months ago

I already resolved it. Someone commented elsewhere that running Setup can resolve the bitsandbytes issues. You can give it a try: see #1692.

BlinkerHigh commented 10 months ago

What do you mean by running setup?

BlinkerHigh commented 10 months ago

I just reinstalled Kohya SS and I still get the same error...

Feeling-z commented 10 months ago

  1. Start PowerShell with administrator rights, run Set-ExecutionPolicy RemoteSigned, then close the window.

  2. Start PowerShell as a regular (non-administrator) user and change to the kohya_ss folder (cd X:\X\kohya_ss); you do not need to switch drives if the folder is on drive C.
     a. Create the venv virtual environment: python -m venv venv
     b. Activate the virtual environment: venv\Scripts\activate.ps1

  3. Activate with activate.bat: .\venv\Scripts\activate.bat

  4. Uninstall bitsandbytes: pip uninstall bitsandbytes, then answer Y to confirm.

  5. Run the following command to install bitsandbytes: python -m pip install bitsandbytes --prefer-binary --extra-index-url=https://jllllll.github.io/bitsandbytes-windows-webui
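Steps 4 and 5 above can also be scripted. This is a rough sketch only, assuming it is run with the Python interpreter of the already activated venv; `bitsandbytes_reinstall_commands` and `reinstall_bitsandbytes` are hypothetical helper names, and the `-y` flag is an addition that makes pip skip the Y confirmation prompt. By default it only prints the commands.

```python
import subprocess
import sys

# Index URL taken from step 5 above.
BNB_INDEX_URL = "https://jllllll.github.io/bitsandbytes-windows-webui"


def bitsandbytes_reinstall_commands(python_exe=sys.executable):
    """Return the pip commands for steps 4 and 5 as argument lists."""
    uninstall = [python_exe, "-m", "pip", "uninstall", "-y", "bitsandbytes"]
    install = [
        python_exe, "-m", "pip", "install", "bitsandbytes",
        "--prefer-binary",
        f"--extra-index-url={BNB_INDEX_URL}",
    ]
    return [uninstall, install]


def reinstall_bitsandbytes(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in bitsandbytes_reinstall_commands():
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


reinstall_bitsandbytes(dry_run=True)
```

Calling `reinstall_bitsandbytes(dry_run=False)` would actually run pip, so it is worth checking the printed interpreter path points into the kohya_ss venv first.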

CRCODE22 commented 10 months ago

5. bitsandbytes: python -m pip install bitsandbytes --prefer-binary --extra-index-url=https://jllllll.github.io/bitsandbytes-windows-webui

That does not work. I am still getting the following error: E:\kohya_ss\venv\lib\site-packages\torch\utils\checkpoint.py:31: UserWarning: None of the inputs have requires_grad=True. Gradients will be None warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")

Feeling-z commented 10 months ago

https://github.com/jllllll/bitsandbytes-windows-webui

Feeling-z commented 10 months ago

image

Rojinski commented 8 months ago

Hello,

I've tried every solution, but nothing works and I still have the same problem. Whether I use Textual Inversion or LoRA, training fails. I really need help from someone... :) Thanks in advance.

`11:17:43-666063 INFO Version: v22.6.0
11:17:43-670830 INFO nVidia toolkit detected
11:17:44-971307 INFO Torch 2.1.2+cu118
11:17:44-988147 INFO Torch backend: nVidia CUDA 11.8 cuDNN 8700
11:17:44-989112 INFO Torch detected GPU: NVIDIA GeForce RTX 3070 Ti VRAM 8192 Arch (8, 6) Cores 48
11:17:44-990149 INFO Verifying modules installation status from requirements_windows_torch2.txt...
11:17:44-992149 INFO Installing package: torch==2.1.2+cu118 torchvision==0.16.2+cu118 torchaudio==2.1.2+cu118 --index-url https://download.pytorch.org/whl/cu118
11:17:48-096708 INFO Installing package: xformers==0.0.23.post1+cu118 --index-url https://download.pytorch.org/whl/cu118
11:17:49-891744 INFO Verifying modules installation status from requirements.txt...
11:17:49-894655 WARNING Package wrong version: gradio 3.36.1 required 3.44.0
11:17:49-895655 INFO Installing package: gradio==3.44.0
11:17:59-816991 INFO headless: False
11:17:59-818991 INFO Load CSS...
Running on local URL: http://127.0.0.1:7860
To create a public link, set share=True in launch().
11:21:51-279574 INFO Start training TI...
11:21:51-296412 INFO Valid image folder names found in: E:/Images/AI/Training/Characters/Roji_Clover/images
11:21:51-297413 INFO Folder 25_Roji_Clover: 3750 steps
11:21:51-298412 INFO max_train_steps = 18750
11:21:51-299412 INFO stop_text_encoder_training = 0
11:21:51-300413 INFO lr_warmup_steps = 1875
11:21:51-301412 INFO Saving training config to E:/Images/AI/Training/Characters/Roji_Clover/model\RojiClover_20240207-112151.json...
11:21:51-551664 INFO accelerate launch --num_cpu_threads_per_process=2 "./train_textual_inversion.py" --enable_bucket --min_bucket_reso=256 --max_bucket_reso=2048 --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" --train_data_dir="E:/Images/AI/Training/Characters/Roji_Clover/images" --resolution="768,5768" --output_dir="E:/Images/AI/Training/Characters/Roji_Clover/model" --logging_dir="E:/Images/AI/Training/Characters/Roji_Clover/logs" --save_model_as=safetensors --output_name="RojiClover" --lr_scheduler_num_cycles="5" --max_data_loader_n_workers="0" --learning_rate="1e-05" --lr_scheduler="cosine" --lr_warmup_steps="1875" --train_batch_size="1" --max_train_steps="18750" --save_every_n_epochs="1" --mixed_precision="fp16" --save_precision="fp16" --cache_latents --optimizer_type="AdamW8bit" --max_data_loader_n_workers="0" --bucket_reso_steps=64 --xformers --bucket_no_upscale --noise_offset=0.0 --token_string="woman" --init_word="" --num_vectors_per_token=14
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
prepare tokenizer
prepare accelerator
loading model for process 0/1
load Diffusers pretrained models: runwayml/stable-diffusion-v1-5
Loading pipeline components...: 100%|████████████████████████████████████████████████████| 5/5 [00:00<00:00, 12.70it/s]
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing safety_checker=None. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
UNet2DConditionModel: 64, 8, 768, False, False
U-Net converted to original U-Net
Traceback (most recent call last):
  File "D:\Programmes 2\Logiciels\kohya_ss-master\train_textual_inversion.py", line 797, in <module>
    trainer.train(args)
  File "D:\Programmes 2\Logiciels\kohya_ss-master\train_textual_inversion.py", line 225, in train
    num_added_tokens == args.num_vectors_per_token
AssertionError: tokenizer has same word to token string. please use another one / 指定したargs.token_stringは既に存在します。別の単語を使ってください: tokenizer 1, woman
Traceback (most recent call last):
  File "C:\Users\thier\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\thier\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "D:\Programmes 2\Logiciels\kohya_ss-master\venv\Scripts\accelerate.exe__main__.py", line 7, in <module>
  File "D:\Programmes 2\Logiciels\kohya_ss-master\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main
    args.func(args)
  File "D:\Programmes 2\Logiciels\kohya_ss-master\venv\lib\site-packages\accelerate\commands\launch.py", line 1017, in launch_command
    simple_launcher(args)
  File "D:\Programmes 2\Logiciels\kohya_ss-master\venv\lib\site-packages\accelerate\commands\launch.py", line 637, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['D:\Programmes 2\Logiciels\kohya_ss-master\venv\Scripts\python.exe', './train_textual_inversion.py', '--enable_bucket', '--min_bucket_reso=256', '--max_bucket_reso=2048', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--train_data_dir=E:/Images/AI/Training/Characters/Roji_Clover/images', '--resolution=768,5768', '--output_dir=E:/Images/AI/Training/Characters/Roji_Clover/model', '--logging_dir=E:/Images/AI/Training/Characters/Roji_Clover/logs', '--save_model_as=safetensors', '--output_name=RojiClover', '--lr_scheduler_num_cycles=5', '--max_data_loader_n_workers=0', '--learning_rate=1e-05', '--lr_scheduler=cosine', '--lr_warmup_steps=1875', '--train_batch_size=1', '--max_train_steps=18750', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--cache_latents', '--optimizer_type=AdamW8bit', '--max_data_loader_n_workers=0', '--bucket_reso_steps=64', '--xformers', '--bucket_no_upscale', '--noise_offset=0.0', '--token_string=woman', '--init_word=', '--num_vectors_per_token=14']' returned non-zero exit status 1.`
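In this log the first traceback, not the final `CalledProcessError`, carries the actual failure: an `AssertionError` saying the token string `woman` already exists in the tokenizer and asking for another one, which suggests trying a different `--token_string`. A small sketch for pulling that first exception line out of a saved console log follows. `first_exception_line` is a hypothetical helper, and `SAMPLE_LOG` is an abridged copy of the log above with paths shortened; the sketch assumes the log has one entry per line.

```python
def first_exception_line(log_text):
    """Return the exception line that ends the first traceback, or None."""
    in_traceback = False
    for line in log_text.splitlines():
        if line.startswith("Traceback (most recent call last):"):
            in_traceback = True
            continue
        # Stack frames are indented; the first non-indented, non-empty
        # line after the header is the exception type and message.
        if in_traceback and line.strip() and not line[0].isspace():
            return line
    return None


# Abridged from the log above (paths shortened for illustration).
SAMPLE_LOG = """\
U-Net converted to original U-Net
Traceback (most recent call last):
  File "train_textual_inversion.py", line 225, in train
    num_added_tokens == args.num_vectors_per_token
AssertionError: tokenizer has same word to token string. please use another one: tokenizer 1, woman
Traceback (most recent call last):
  File "launch.py", line 637, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '[...]' returned non-zero exit status 1.
"""

root_cause = first_exception_line(SAMPLE_LOG)
print(root_cause)
```

The helper deliberately stops at the first traceback, because the later one comes from the launcher and only repeats the exit status.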