(Linux) (3090) AdamW8bit not working (Ubuntu 22.04.2 LTS x86_64)

BarfingLemurs commented 1 year ago

I am unable to run the AdamW8bit optimizer on Linux with my 3090. Training with AdamW works fine. I am on this commit of bmaltais/kohya_ss: https://github.com/bmaltais/kohya_ss/commit/9c8c480f8e654eeb5a7d92c13b4ce04333840b0c

I installed with the sudo ./setup.sh command.

I use the GUI add-on, for which another user has reported an issue, though with an older card: https://github.com/bmaltais/kohya_ss/issues/485

Thank you for any help provided!

Name: bitsandbytes
Version: 0.35.0
Summary: 8-bit optimizers and matrix multiplication routines.
Home-page: https://github.com/TimDettmers/bitsandbytes
Author: Tim Dettmers
Author-email: dettmers@cs.washington.edu
License: MIT
Location: /home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages
Requires: 
Required-by:

log:

Validating that requirements are satisfied.
All requirements satisfied.
Load CSS...
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Loading config...
Loading config...
Folder 100_game : 200 steps
Folder 100_o_raptor : 100 steps
max_train_steps = 150
stop_text_encoder_training = 0
lr_warmup_steps = 15
accelerate launch --num_cpu_threads_per_process=2 "train_db.py" --enable_bucket --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" --train_data_dir="/home/ubuntu/TRAINING/img" --resolution=768,768 --output_dir="/home/ubuntu/TRAINING/model" --logging_dir="/home/ubuntu/TRAINING/log" --save_model_as=safetensors --output_name="last" --max_data_loader_n_workers="0" --learning_rate="1e-5" --lr_scheduler="cosine" --lr_warmup_steps="15" --train_batch_size="2" --max_train_steps="150" --save_every_n_epochs="1" --mixed_precision="bf16" --save_precision="bf16" --cache_latents --optimizer_type="AdamW8bit" --max_data_loader_n_workers="0" --bucket_reso_steps=64 --mem_eff_attn --xformers --bucket_no_upscale 
2023-04-04 00:36:21.031906: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-04-04 00:36:21.162673: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-04-04 00:36:21.541030: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2023-04-04 00:36:21.541069: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2023-04-04 00:36:21.541075: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2023-04-04 00:36:22.518678: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-04-04 00:36:22.614710: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-04-04 00:36:22.938389: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2023-04-04 00:36:22.938426: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2023-04-04 00:36:22.938432: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
prepare tokenizer
prepare images.
found directory /home/ubuntu/TRAINING/img/100_game contains 2 image files
found directory /home/ubuntu/TRAINING/img/100_o_raptor contains 1 image files
300 train images with repeating.
0 reg images.
no regularization images / 正則化画像が見つかりませんでした
[Dataset 0]
  batch_size: 2
  resolution: (768, 768)
  enable_bucket: True
  min_bucket_reso: 256
  max_bucket_reso: 1024
  bucket_reso_steps: 64
  bucket_no_upscale: True

  [Subset 0 of Dataset 0]
    image_dir: "/home/ubuntu/TRAINING/img/100_game"
    image_count: 2
    num_repeats: 100
    shuffle_caption: False
    keep_tokens: 0
    caption_dropout_rate: 0.0
    caption_dropout_every_n_epoches: 0
    caption_tag_dropout_rate: 0.0
    color_aug: False
    flip_aug: False
    face_crop_aug_range: None
    random_crop: False
    token_warmup_min: 1,
    token_warmup_step: 0,
    is_reg: False
    class_tokens: game
    caption_extension: .caption

  [Subset 1 of Dataset 0]
    image_dir: "/home/ubuntu/TRAINING/img/100_o_raptor"
    image_count: 1
    num_repeats: 100
    shuffle_caption: False
    keep_tokens: 0
    caption_dropout_rate: 0.0
    caption_dropout_every_n_epoches: 0
    caption_tag_dropout_rate: 0.0
    color_aug: False
    flip_aug: False
    face_crop_aug_range: None
    random_crop: False
    token_warmup_min: 1,
    token_warmup_step: 0,
    is_reg: False
    class_tokens: o_raptor
    caption_extension: .caption

[Dataset 0]
loading image sizes.
100%|███████████████████████████████████████████| 3/3 [00:00<00:00, 1522.62it/s]
make buckets
min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます
number of images (including repeats) / 各bucketの画像枚数（繰り返し回数を含む）
bucket 0: resolution (448, 320), count: 200
bucket 1: resolution (448, 384), count: 100
mean ar error (without repeats): 0.0492189063259297
prepare accelerator
Using accelerator 0.15.0 or above.
load Diffusers pretrained models
Fetching 15 files: 100%|█████████████████████| 15/15 [00:00<00:00, 62851.71it/s]
/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/transformers/models/clip/feature_extraction_clip.py:28: FutureWarning: The class CLIPFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use CLIPImageProcessor instead.
  warnings.warn(
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
Replace CrossAttention.forward to use FlashAttention (not xformers)
[Dataset 0]
caching latents.
100%|█████████████████████████████████████████████| 3/3 [00:00<00:00,  6.41it/s]
prepare optimizer, data loader etc.

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
For effortless bug reporting copy-paste your error into this form: https://docs.google.com/forms/d/e/1FAIpQLScPB8emS3Thkp66nvqwmjTEgxp8Y9ufuWTzFyr9kJ5AoI47dQ/viewform?usp=sf_link
================================================================================
/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/cv2/../../lib64')}
  warn(
/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/paths.py:105: UserWarning: /home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/cv2/../../lib64: did not contain libcudart.so as expected! Searching further paths...
  warn(
/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('@/tmp/.ICE-unix/2160,unix/ubuntu-MS-7C56'), PosixPath('local/ubuntu-MS-7C56')}
  warn(
/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/org/gnome/Terminal/screen/0a7b15e0_4cb6_4d61_bb5c_164597cc5551')}
  warn(
/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('1'), PosixPath('0')}
  warn(
/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/etc/xdg/xdg-ubuntu')}
  warn(
/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('unix'), PosixPath('path=/run/user/1000/bus,guid=a76448840dcf1813fd67431e642b64ba')}
  warn(
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/cuda/lib64')}
  warn(
WARNING: No libcudart.so found! Install CUDA or the cudatoolkit package (anaconda)!
CUDA SETUP: Loading binary /home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so...
/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cextension.py:48: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers and GPU quantization are unavailable.
  warn(
use 8-bit AdamW optimizer | {}
running training / 学習開始
  num train images * repeats / 学習画像の数×繰り返し回数: 300
  num reg images / 正則化画像の数: 0
  num batches per epoch / 1epochのバッチ数: 150
  num epochs / epoch数: 1
  batch size per device / バッチサイズ: 2
  total train batch size (with parallel & distributed & accumulation) / 総バッチサイズ（並列学習、勾配合計含む）: 2
  gradient ccumulation steps / 勾配を合計するステップ数 = 1
  total optimization steps / 学習ステップ数: 150
steps:   0%|                                            | 0/150 [00:00<?, ?it/s]epoch 1/1
Traceback (most recent call last):
  File "/home/ubuntu/kohya_ss/train_db.py", line 429, in <module>
    train(args)
  File "/home/ubuntu/kohya_ss/train_db.py", line 317, in train
    optimizer.step()
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/accelerate/optimizer.py", line 134, in step
    self.scaler.step(self.optimizer, closure)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/torch/cuda/amp/grad_scaler.py", line 338, in step
    retval = self._maybe_opt_step(optimizer, optimizer_state, *args, **kwargs)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/torch/cuda/amp/grad_scaler.py", line 285, in _maybe_opt_step
    retval = optimizer.step(*args, **kwargs)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/torch/optim/lr_scheduler.py", line 65, in wrapper
    return wrapped(*args, **kwargs)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/torch/optim/optimizer.py", line 113, in wrapper
    return func(*args, **kwargs)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/optim/optimizer.py", line 265, in step
    self.update_step(group, p, gindex, pindex)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/optim/optimizer.py", line 506, in update_step
    F.optimizer_update_8bit_blockwise(
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/functional.py", line 858, in optimizer_update_8bit_blockwise
    str2optimizer8bit_blockwise[optimizer_name][0](
NameError: name 'str2optimizer8bit_blockwise' is not defined
steps:   0%|                                            | 0/150 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/ubuntu/kohya_ss/venv/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
    args.func(args)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1104, in launch_command
    simple_launcher(args)
  File "/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 567, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/ubuntu/kohya_ss/venv/bin/python', 'train_db.py', '--enable_bucket', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--train_data_dir=/home/ubuntu/TRAINING/img', '--resolution=768,768', '--output_dir=/home/ubuntu/TRAINING/model', '--logging_dir=/home/ubuntu/TRAINING/log', '--save_model_as=safetensors', '--output_name=last', '--max_data_loader_n_workers=0', '--learning_rate=1e-5', '--lr_scheduler=cosine', '--lr_warmup_steps=15', '--train_batch_size=2', '--max_train_steps=150', '--save_every_n_epochs=1', '--mixed_precision=bf16', '--save_precision=bf16', '--cache_latents', '--optimizer_type=AdamW8bit', '--max_data_loader_n_workers=0', '--bucket_reso_steps=64', '--mem_eff_attn', '--xformers', '--bucket_no_upscale']' returned non-zero exit status 1.
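
Reading the traceback together with the earlier warnings: the log says the CPU-only binary (libbitsandbytes_cpu.so) was loaded and that 8-bit optimizers are unavailable, and the run then stops with a NameError rather than a CUDA error. One plausible reading is that the lookup table named in the traceback is only defined when a GPU build is loaded. The toy sketch below reproduces that pattern. Every name in it is hypothetical, and it is not bitsandbytes code.

```python
# Toy illustration only: names are hypothetical, not taken from bitsandbytes.
GPU_BUILD_LOADED = False  # assumption: stands in for "the CPU-only binary was loaded"

if GPU_BUILD_LOADED:
    # The dispatch table exists only when the GPU build is available.
    kernel_table = {"adam": (lambda: "ran 8-bit GPU kernel",)}


def update_8bit(optimizer_name):
    # Looks up the global table at call time; fails if it was never defined.
    return kernel_table[optimizer_name][0]()


def try_update(optimizer_name):
    """Return the kernel result, or the NameError message if the table is missing."""
    try:
        return update_8bit(optimizer_name)
    except NameError as exc:
        return f"NameError: {exc}"


print(try_update("adam"))
```

If this reading is right, the NameError is a symptom and the missing libcudart.so reported earlier in the log is the thing to fix.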

BarfingLemurs commented 1 year ago

(If someone tells me AdamW8bit is bad anyway, there is no need to fix this; I just wanted to see whether it is any good.)

kohya-ss commented 1 year ago

CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/paths.py:27: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/cuda/lib64')}
  warn(
WARNING: No libcudart.so found! Install CUDA or the cudatoolkit package (anaconda)!
CUDA SETUP: Loading binary /home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so...
/home/ubuntu/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cextension.py:48: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers and GPU quantization are unavailable.
  warn(

I have not tested on Linux, but the error seems to be related to the CUDA installation: bitsandbytes could not find libcudart.so and fell back to its CPU-only binary. Please make sure that an appropriate version of CUDA is installed.
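
As a rough way to check this from inside the venv, the sketch below looks for libcudart.so in the directories listed in LD_LIBRARY_PATH plus /usr/local/cuda/lib64, the fallback location named in the log. The function name find_libcudart is made up for illustration, and this does not reproduce the search that bitsandbytes itself performs; it only shows whether the file is present in those places.

```python
import glob
import os


def find_libcudart(extra_dirs=()):
    """Return paths matching libcudart.so* in LD_LIBRARY_PATH and extra_dirs."""
    # Directories from the environment; empty entries are skipped.
    env_value = os.environ.get("LD_LIBRARY_PATH", "")
    dirs = [d for d in env_value.split(os.pathsep) if d]
    dirs.extend(extra_dirs)

    hits = []
    for directory in dirs:
        # Matches libcudart.so as well as versioned names such as libcudart.so.11.0
        hits.extend(sorted(glob.glob(os.path.join(directory, "libcudart.so*"))))
    return hits


if __name__ == "__main__":
    # /usr/local/cuda/lib64 is the fallback directory mentioned in the log above.
    found = find_libcudart(extra_dirs=["/usr/local/cuda/lib64"])
    if found:
        print("\n".join(found))
    else:
        print("libcudart.so not found in the searched directories")
```

If nothing is found, that is consistent with the warning in the log. If the library lives elsewhere, adding its directory to LD_LIBRARY_PATH before launching training may be enough for it to be found.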

ioritree commented 1 year ago

Same here, but on Windows: AdamW8bit does not work. CUDA version: 11.8.
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
WARNING: No libcudart.so found! Install CUDA or the cudatoolkit package (anaconda)!

kohya-ss commented 1 year ago

On Windows, some files in bitsandbytes must be replaced. Please follow the installation guide: https://github.com/kohya-ss/sd-scripts#windows-installation

BarfingLemurs commented 1 year ago

I will look for a solution, thanks

kohya-ss / sd-scripts

(Linux) (3090) AdamW8bit not working (Ubuntu 22.04.2 LTS x86_64) #375