bmaltais / kohya_ss

Apache License 2.0

LoRA doesn't work with Adam8bit [solved] #1778

Closed BIoHazaaard closed 8 months ago

BIoHazaaard commented 10 months ago

```
01:53:46-720084 INFO     Save...
01:53:48-266961 INFO     Save...
01:53:48-439073 INFO     Save...
01:53:48-644371 INFO     Save...
01:53:48-816073 INFO     Save...
01:54:04-568074 INFO     Start training Dreambooth...
01:54:04-570075 INFO     Valid image folder names found in: D:\Photo
01:54:04-571074 INFO     Folder 100_Mepya : steps 1500
01:54:04-571074 INFO     max_train_steps (1500 / 2 / 1 * 1 * 1) = 750
01:54:04-572074 INFO     stop_text_encoder_training = 0
01:54:04-573074 INFO     lr_warmup_steps = 75
01:54:04-573074 INFO     Saving training config to D:\Photo\last_20231218-015404.json...
01:54:04-574074 INFO     accelerate launch --num_cpu_threads_per_process=2 "./train_db.py" --enable_bucket --min_bucket_reso=256 --max_bucket_reso=2048 --pretrained_model_name_or_path="D:/SD/stable-diffusion-webui/models/Stable-diffusion/majicmixFantasy_v30Vae.safetensors" --train_data_dir="D:\Photo" --resolution="512,512" --output_dir="D:\Photo" --save_model_as=safetensors --output_name="last" --lr_scheduler_num_cycles="1" --max_data_loader_n_workers="0" --learning_rate_te="0.0001" --learning_rate="0.0001" --lr_scheduler="cosine" --lr_warmup_steps="75" --train_batch_size="2" --max_train_steps="750" --save_every_n_epochs="1" --mixed_precision="bf16" --save_precision="bf16" --seed="1" --cache_latents --optimizer_type="AdamW8bit" --max_data_loader_n_workers="0" --bucket_reso_steps=64 --xformers --bucket_no_upscale --noise_offset=0.0
The following values were not passed to `accelerate launch` and had defaults used instead:
        `--num_processes` was set to a value of `1`
        `--num_machines` was set to a value of `1`
        `--mixed_precision` was set to a value of `'no'`
        `--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
prepare tokenizer
ignore directory without repeats: sample
prepare images.
found directory D:\Photo\100_Mepya contains 15 image files
No caption file found for 15 images. Training will continue without captions for these images. If class token exists, it will be used.
D:\Photo\100_Mepya\2.png
D:\Photo\100_Mepya\DSC_6063.png
D:\Photo\100_Mepya\DSC_6073.png
D:\Photo\100_Mepya\DSC_6075.png
D:\Photo\100_Mepya\DSC_6107.png
D:\Photo\100_Mepya\DSC_6138.png... and 10 more
1500 train images with repeating.
0 reg images.
no regularization images
[Dataset 0]
  batch_size: 2
  resolution: (512, 512)
  enable_bucket: True
  min_bucket_reso: 256
  max_bucket_reso: 2048
  bucket_reso_steps: 64
  bucket_no_upscale: True

  [Subset 0 of Dataset 0]
    image_dir: "D:\Photo\100_Mepya"
    image_count: 15
    num_repeats: 100
    shuffle_caption: False
    keep_tokens: 0
    caption_dropout_rate: 0.0
    caption_dropout_every_n_epoches: 0
    caption_tag_dropout_rate: 0.0
    caption_prefix: None
    caption_suffix: None
    color_aug: False
    flip_aug: False
    face_crop_aug_range: None
    random_crop: False
    token_warmup_min: 1,
    token_warmup_step: 0,
    is_reg: False
    class_tokens: Mepya
    caption_extension: .caption

[Dataset 0]
loading image sizes.
100%|████████████████| 15/15 [00:00<00:00, 3000.22it/s]
make buckets
min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically
number of images (including repeats)
bucket 0: resolution (512, 512), count: 1500
mean ar error (without repeats): 0.0
prepare accelerator
loading model for process 0/1
load StableDiffusion checkpoint: D:/SD/stable-diffusion-webui/models/Stable-diffusion/majicmixFantasy_v30Vae.safetensors
UNet2DConditionModel: 64, 8, 768, False, False
loading u-net: <All keys matched successfully>
loading vae: <All keys matched successfully>
loading text encoder: <All keys matched successfully>
Enable xformers for U-Net
[Dataset 0]
caching latents.
checking cache validity...
100%|████████████████| 15/15 [00:00<?, ?it/s]
caching latents...
100%|████████████████| 15/15 [00:01<00:00, 11.23it/s]
prepare optimizer, data loader etc.
Traceback (most recent call last):
  File "D:\SD\lora\kohya_ss\library\train_util.py", line 3444, in get_optimizer
    import bitsandbytes as bnb
  File "D:\SD\lora\kohya_ss\venv\lib\site-packages\bitsandbytes\__init__.py", line 6, in <module>
    from . import cuda_setup, utils, research
  File "D:\SD\lora\kohya_ss\venv\lib\site-packages\bitsandbytes\research\__init__.py", line 1, in <module>
    from . import nn
  File "D:\SD\lora\kohya_ss\venv\lib\site-packages\bitsandbytes\research\nn\__init__.py", line 1, in <module>
    from .modules import LinearFP8Mixed, LinearFP8Global
  File "D:\SD\lora\kohya_ss\venv\lib\site-packages\bitsandbytes\research\nn\modules.py", line 8, in <module>
    from bitsandbytes.optim import GlobalOptimManager
  File "D:\SD\lora\kohya_ss\venv\lib\site-packages\bitsandbytes\optim\__init__.py", line 6, in <module>
    from bitsandbytes.cextension import COMPILED_WITH_CUDA
  File "D:\SD\lora\kohya_ss\venv\lib\site-packages\bitsandbytes\cextension.py", line 5, in <module>
    from .cuda_setup.main import evaluate_cuda_setup
  File "D:\SD\lora\kohya_ss\venv\lib\site-packages\bitsandbytes\cuda_setup\main.py", line 21, in <module>
    from .paths import determine_cuda_runtime_lib_path
ModuleNotFoundError: No module named 'bitsandbytes.cuda_setup.paths'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\SD\lora\kohya_ss\train_db.py", line 495, in <module>
    train(args)
  File "D:\SD\lora\kohya_ss\train_db.py", line 181, in train
    _, _, optimizer = train_util.get_optimizer(args, trainable_params)
  File "D:\SD\lora\kohya_ss\library\train_util.py", line 3446, in get_optimizer
    raise ImportError("No bitsandbytes / bitsandbytesがインストールされていないようです")
ImportError: No bitsandbytes / bitsandbytesがインストールされていないようです
Traceback (most recent call last):
  File "C:\Users\Ultra\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Ultra\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "D:\SD\lora\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "D:\SD\lora\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main
    args.func(args)
  File "D:\SD\lora\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 986, in launch_command
    simple_launcher(args)
  File "D:\SD\lora\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 628, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['D:\SD\lora\kohya_ss\venv\Scripts\python.exe', './train_db.py', '--enable_bucket', '--min_bucket_reso=256', '--max_bucket_reso=2048', '--pretrained_model_name_or_path=D:/SD/stable-diffusion-webui/models/Stable-diffusion/majicmixFantasy_v30Vae.safetensors', '--train_data_dir=D:\Photo', '--resolution=512,512', '--output_dir=D:\Photo', '--save_model_as=safetensors', '--output_name=last', '--lr_scheduler_num_cycles=1', '--max_data_loader_n_workers=0', '--learning_rate_te=0.0001', '--learning_rate=0.0001', '--lr_scheduler=cosine', '--lr_warmup_steps=75', '--train_batch_size=2', '--max_train_steps=750', '--save_every_n_epochs=1', '--mixed_precision=bf16', '--save_precision=bf16', '--seed=1', '--cache_latents', '--optimizer_type=AdamW8bit', '--max_data_loader_n_workers=0', '--bucket_reso_steps=64', '--xformers', '--bucket_no_upscale', '--noise_offset=0.0']' returned non-zero exit status 1.
```
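Incidentally, the step counts in the log are internally consistent. A quick sketch of the arithmetic (variable names are mine, and the 10% warmup ratio is inferred from the numbers, not stated in the thread):

```python
# Reproduce the step arithmetic from the training log above.
# The formula mirrors the log line:
#   max_train_steps (1500 / 2 / 1 * 1 * 1) = 750
num_images = 15        # image files found in 100_Mepya
num_repeats = 100      # from the "100_" folder-name prefix
batch_size = 2         # --train_batch_size
grad_accum_steps = 1   # gradient accumulation (default)

train_images = num_images * num_repeats              # 1500
max_train_steps = train_images // batch_size // grad_accum_steps
print(max_train_steps)               # 750

# lr_warmup_steps = 75 matches a 10% warmup ratio (inferred)
print(int(max_train_steps * 0.10))   # 75
```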

Syscore64 commented 10 months ago

ModuleNotFoundError: No module named 'bitsandbytes.cuda_setup.paths'

Install bitsandbytes in your venv, or run setup.bat.
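For reference, a minimal sketch of that manual install, assuming the venv layout shown in the traceback (note that, as later comments in this thread discuss, a plain `pip install bitsandbytes` on Windows at the time could still yield a build without working CUDA support, so `setup.bat` or a Windows-specific wheel may be the safer route):

```powershell
# Run from the kohya_ss folder; paths follow the logs above (sketch only).
./venv/Scripts/activate.ps1
pip install bitsandbytes
```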

brianiup commented 10 months ago

> ModuleNotFoundError: No module named 'bitsandbytes.cuda_setup.paths'
>
> Install bitsandbytes in your venv, or run setup.bat.

Doesn't help, see my similar issue: https://github.com/bmaltais/kohya_ss/issues/1766

Syscore64 commented 10 months ago

Strange - I use bitsandbytes-0.41.2.post2-py3-none-win_amd64 and it works fine on Windows 11 with Python 3.10 and an RTX 3070.

BIoHazaaard commented 10 months ago

> Strange - I use bitsandbytes-0.41.2.post2-py3-none-win_amd64 and it works fine on Windows 11 with Python 3.10 and an RTX 3070.

The problem was in PowerShell. A fresh Windows 11 install doesn't allow running unsigned scripts, so I changed the execution policy, reinstalled kohya_ss, and everything worked.
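For reference, a minimal sketch of the execution-policy change described above. The thread doesn't say which policy value was set; `RemoteSigned` is a common choice that still lets locally created scripts such as the venv's `activate.ps1` run:

```powershell
# Sketch only - the exact policy used in this thread is not stated.
# RemoteSigned allows local scripts while requiring signatures on downloaded ones.
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```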

BIoHazaaard commented 10 months ago

> > ModuleNotFoundError: No module named 'bitsandbytes.cuda_setup.paths'
> >
> > Install bitsandbytes in your venv, or run setup.bat.
>
> Doesn't help, see my similar issue: #1766

The problem was in PowerShell. A fresh Windows 11 install doesn't allow running unsigned scripts, so I changed the execution policy, reinstalled kohya_ss, and everything worked.

Firedan1176 commented 10 months ago

Try opening a new PowerShell window and:

1. Activate the venv: `./venv/Scripts/activate`
2. Make note of which version of bitsandbytes you have: `pip show bitsandbytes`
3. Uninstall bitsandbytes: `pip uninstall bitsandbytes`
4. Install bitsandbytes-windows-webui. I'd recommend installing the same version you took note of, instead of just using the latest: `python -m pip install bitsandbytes==<version> --prefer-binary --extra-index-url=https://jllllll.github.io/bitsandbytes-windows-webui`

karachay-b commented 10 months ago

What worked for me:

1. Activate the venv in PowerShell: `./venv/Scripts/activate.ps1`
2. `pip uninstall bitsandbytes`
3. `pip uninstall bitsandbytes-windows`
4. Download this wheel: bitsandbytes-0.41.2.post2-py3-none-win_amd64.whl
5. Place it in `./venv/Scripts/`
6. Finally, with the venv still active in PowerShell, run: `pip install bitsandbytes-0.41.2.post2-py3-none-win_amd64.whl`

TheZaind commented 9 months ago

Or, what worked for me on Windows: in `requirements_windows_torch2.txt`, change `bitsandbytes==0.41.1 # no_verify` to `bitsandbytes-windows`. You may also need to delete the bitsandbytes folder from the venv.
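As a sketch, the edit described above looks like this (line content as quoted in this thread, so the pinned version may differ in your checkout):

```diff
 # requirements_windows_torch2.txt
-bitsandbytes==0.41.1 # no_verify
+bitsandbytes-windows
```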