Closed LearnDude closed 9 months ago
I'm having a similar issue. I have Automatic1111 running just fine on my RTX 4090. I have setup a separate venv to use with kohya-ss and when I try to start Training, it starts out fine and then when bitsandbytes kicks in, it fails to setup CUDA.
Here are my system details:
kohya_ss git:(master) ✗ neofetch
⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⢰⡆⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄ tokenwizard@OfficePC
⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⢠⣿⣿⡄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄
⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⢀⣾⣿⣿⣿⡀⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄ · Archcraft x86_64
⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⣼⣿⣿⣿⣿⣷⡀⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄ · 6.5.5-zen1-1-zen
⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⣼⣿⣿⣿⣿⣿⣿⣷⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄ · 4 days, 23 hours, 27 mins
⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⢼⣿⣿⣿⣿⣿⣿⣿⣿⣧⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄ · 1254 (pacman)
⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⣰⣤⣈⠻⢿⣿⣿⣿⣿⣿⣿⣧⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄ · zsh 5.9
⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⣰⣿⣿⣿⣿⣮⣿⣿⣿⣿⣿⣿⣿⣧⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄ · 3440x1440
⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⣰⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣧⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄ · Openbox
⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⣰⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣧⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄ · Adapta-Nokto
⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⣼⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣧⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄ · Arc-Dark [GTK2/3]
⠄⠄⠄⠄⠄⠄⠄⠄⠄⣼⣿⣿⣿⣿⣿⡿⣿⣿⡟⠄⠄⠸⣿⣿⡿⣿⣿⣿⣿⣿⣷⡀⠄⠄⠄⠄⠄⠄⠄⠄ · Arc-Circle [GTK2/3]
⠄⠄⠄⠄⠄⠄⠄⠄⣼⣿⣿⣿⣿⣿⡏⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠈⣿⣿⣿⣿⣿⣷⡀⠄⠄⠄⠄⠄⠄⠄ · xfce4-terminal
⠄⠄⠄⠄⠄⠄⢀⣼⣿⣿⣿⣿⣿⣿⡗⠄⠄⠄⢀⣠⣤⣀⠄⠄⠄⠸⣿⣿⣿⣿⣿⣿⣷⡀⠄⠄⠄⠄⠄⠄ · AMD Ryzen 7 3800X (16) @ 3.900GHz
⠄⠄⠄⠄⠄⢀⣾⣿⣿⣿⣿⣿⡏⠁⠄⠄⠄⢠⣿⣿⣿⣿⡇⠄⠄⠄⠄⢙⣿⣿⣻⠿⣿⣷⡀⠄⠄⠄⠄⠄ · NVIDIA GeForce RTX 4090
⠄⠄⠄⠄⢀⣾⣿⣿⣿⣿⣿⣿⣷⣤⡀⠄⠄⠄⠻⣿⣿⡿⠃⠄⠄⠄⢀⣼⣿⣿⣿⣿⣦⣌⠙⠄⠄⠄⠄⠄ · 18517MiB / 64226MiB
⠄⠄⠄⢠⣾⣿⣿⣿⣿⣿⣿⣿⣿⣿⠏⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⢿⣿⣿⣿⣿⣿⣿⣿⣿⣦⡀⠄⠄⠄
⠄⠄⢠⣿⣿⣿⣿⣿⣿⣿⡿⠟⠋⠁⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠙⠻⣿⣿⣿⣿⣿⣿⣿⣿⡄⠄⠄
⠄⣠⣿⣿⣿⣿⠿⠛⠋⠁⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠉⠙⠻⢿⣿⣿⣿⣿⣆⠄
⡰⠟⠛⠉⠁⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠄⠉⠙⠛⠿⢆
And here is the full output from kohya-ss from the start of the Training to the error:
13:06:24-525298 INFO Loading config...
13:06:26-206578 INFO Start training Dreambooth...
13:06:26-208214 INFO Valid image folder names found in: /home/tokenwizard/tmp/lmd/lora/img
13:06:26-209695 INFO Folder 100_lmd : steps 5000
13:06:26-210956 INFO max_train_steps (5000 / 4 / 1 * 1 * 1) = 1250
13:06:26-212455 INFO stop_text_encoder_training = 0
13:06:26-213773 INFO lr_warmup_steps = 0
13:06:26-215031 INFO Saving training config to /home/tokenwizard/tmp/lmd/lora/LMD_RealisticVision51_v1_20231101-130626.json...
13:06:26-216684 INFO accelerate launch --num_cpu_threads_per_process=2 "./train_db.py" --enable_bucket --min_bucket_reso=256 --max_bucket_reso=2048 --pretrained_model_name_or_path="/home/tokenwizard/stable-diffusion-webui/models/Stable-diffusion/1.0
- realisticVisionV51_v51VAE.safetensors" --train_data_dir="/home/tokenwizard/tmp/lmd/lora/img" --resolution="512,512" --output_dir="/home/tokenwizard/tmp/lmd/lora" --logging_dir="/home/tokenwizard/tmp/lmd/lora/logs"
--save_model_as=safetensors --output_name="LMD_RealisticVision51_v1" --lr_scheduler_num_cycles="1" --max_data_loader_n_workers="1" --learning_rate="0.0001" --lr_scheduler="constant" --train_batch_size="4"
--max_train_steps="1250" --save_every_n_epochs="1" --mixed_precision="bf16" --save_precision="bf16" --caption_extension=".txt" --cache_latents --optimizer_type="AdamW8bit" --max_data_loader_n_workers="1" --clip_skip=2
--keep_tokens="1" --bucket_reso_steps=64 --shuffle_caption --xformers --bucket_no_upscale --noise_offset=0.0
2023-11-01 13:06:29.500732: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-11-01 13:06:29.500772: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-11-01 13:06:29.500807: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-11-01 13:06:29.507459: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-11-01 13:06:30.150140: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
prepare tokenizer
prepare images.
found directory /home/tokenwizard/tmp/lmd/lora/img/100_lmd contains 50 image files
5000 train images with repeating.
0 reg images.
no regularization images / 正則化画像が見つかりませんでした
[Dataset 0]
batch_size: 4
resolution: (512, 512)
enable_bucket: True
min_bucket_reso: 256
max_bucket_reso: 2048
bucket_reso_steps: 64
bucket_no_upscale: True
[Subset 0 of Dataset 0]
image_dir: "/home/tokenwizard/tmp/lmd/lora/img/100_lmd"
image_count: 50
num_repeats: 100
shuffle_caption: True
keep_tokens: 1
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
caption_prefix: None
caption_suffix: None
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
token_warmup_min: 1,
token_warmup_step: 0,
is_reg: False
class_tokens: lmd
caption_extension: .txt
[Dataset 0]
loading image sizes.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:00<00:00, 10607.75it/s]
make buckets
min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます
number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)
bucket 0: resolution (384, 512), count: 800
bucket 1: resolution (448, 448), count: 600
bucket 2: resolution (448, 512), count: 400
bucket 3: resolution (512, 448), count: 100
bucket 4: resolution (512, 512), count: 3100
mean ar error (without repeats): 0.013517200506358285
prepare accelerator
loading model for process 0/1
load StableDiffusion checkpoint: /home/tokenwizard/stable-diffusion-webui/models/Stable-diffusion/1.0 - realisticVisionV51_v51VAE.safetensors
UNet2DConditionModel: 64, 8, 768, False, False
loading u-net: <All keys matched successfully>
loading vae: <All keys matched successfully>
loading text encoder: <All keys matched successfully>
Enable xformers for U-Net
[Dataset 0]
caching latents.
checking cache validity...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:00<00:00, 1109604.23it/s]
caching latents...
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:01<00:00, 27.35it/s]
prepare optimizer, data loader etc.
False
===================================BUG REPORT===================================
/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: Welcome to bitsandbytes. For bug reports, please run
python -m bitsandbytes
warn(msg)
================================================================================
The following directories listed in your path were found to be non-existent: {PosixPath('/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/cv2/../../lib64')}
/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: Found duplicate ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] files: {PosixPath('/opt/cuda/targets/x86_64-linux/lib/libcudart.so'), PosixPath('/opt/cuda/lib64/libcudart.so')}.. We select the PyTorch default libcudart.so, which is {torch.version.cuda},but this might missmatch with the CUDA version that is needed for bitsandbytes.To override this behavior set the BNB_CUDA_VERSION=<version string, e.g. 122> environmental variableFor example, if you want to use the CUDA version 122BNB_CUDA_VERSION=122 python ...OR set the environmental variable in your .bashrc: export BNB_CUDA_VERSION=122In the case of a manual override, make sure you set the LD_LIBRARY_PATH, e.g.export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.2
warn(msg)
/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: /home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/cv2/../../lib64::/opt/cuda/lib64:/opt/cuda/targets/x86_64-linux/lib did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
warn(msg)
The following directories listed in your path were found to be non-existent: {PosixPath('/org/freedesktop/DisplayManager/Session1')}
The following directories listed in your path were found to be non-existent: {PosixPath('/usr/${LIB}/libgtk3-nocsd.so.0')}
The following directories listed in your path were found to be non-existent: {PosixPath('/org/freedesktop/DisplayManager/Seat0')}
The following directories listed in your path were found to be non-existent: {PosixPath('https'), PosixPath('//debuginfod.archlinux.org ')}
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/cuda/lib64')}
DEBUG: Possible options found for libcudart.so: set()
CUDA SETUP: PyTorch settings found: CUDA_VERSION=118, Highest Compute Capability: 8.9.
CUDA SETUP: To manually override the PyTorch CUDA version please see:https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
CUDA SETUP: Loading binary /home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so...
libcusparse.so.11: cannot open shared object file: No such file or directory
CUDA SETUP: Problem: The main issue seems to be that the main CUDA runtime library was not detected.
CUDA SETUP: Solution 1: To solve the issue the libcudart.so location needs to be added to the LD_LIBRARY_PATH variable
CUDA SETUP: Solution 1a): Find the cuda runtime library via: find / -name libcudart.so 2>/dev/null
CUDA SETUP: Solution 1b): Once the library is found add it to the LD_LIBRARY_PATH: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:FOUND_PATH_FROM_1a
CUDA SETUP: Solution 1c): For a permanent solution add the export from 1b into your .bashrc file, located at ~/.bashrc
CUDA SETUP: Solution 2: If no library was found in step 1a) you need to install CUDA.
CUDA SETUP: Solution 2a): Download CUDA install script: wget https://github.com/TimDettmers/bitsandbytes/blob/main/cuda_install.sh
CUDA SETUP: Solution 2b): Install desired CUDA version to desired location. The syntax is bash cuda_install.sh CUDA_VERSION PATH_TO_INSTALL_INTO.
CUDA SETUP: Solution 2b): For example, "bash cuda_install.sh 113 ~/local/" will download CUDA 11.3 and install into the folder ~/local
Traceback (most recent call last):
File "/home/tokenwizard/kohya_ss/./train_db.py", line 488, in <module>
train(args)
File "/home/tokenwizard/kohya_ss/./train_db.py", line 171, in train
_, _, optimizer = train_util.get_optimizer(args, trainable_params)
File "/home/tokenwizard/kohya_ss/library/train_util.py", line 3419, in get_optimizer
import bitsandbytes as bnb
File "/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/__init__.py", line 6, in <module>
from . import cuda_setup, utils, research
File "/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/research/__init__.py", line 1, in <module>
from . import nn
File "/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/research/nn/__init__.py", line 1, in <module>
from .modules import LinearFP8Mixed, LinearFP8Global
File "/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/research/nn/modules.py", line 8, in <module>
from bitsandbytes.optim import GlobalOptimManager
File "/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/optim/__init__.py", line 6, in <module>
from bitsandbytes.cextension import COMPILED_WITH_CUDA
File "/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 20, in <module>
raise RuntimeError('''
RuntimeError:
CUDA Setup failed despite GPU being available. Please run the following command to get more information:
python -m bitsandbytes
Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
Traceback (most recent call last):
File "/home/tokenwizard/kohya_ss/venv/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
args.func(args)
File "/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 986, in launch_command
simple_launcher(args)
File "/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 628, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/tokenwizard/kohya_ss/venv/bin/python3.10', './train_db.py', '--enable_bucket', '--min_bucket_reso=256', '--max_bucket_reso=2048', '--pretrained_model_name_or_path=/home/tokenwizard/stable-diffusion-webui/models/Stable-diffusion/1.0 - realisticVisionV51_v51VAE.safetensors', '--train_data_dir=/home/tokenwizard/tmp/lmd/lora/img', '--resolution=512,512', '--output_dir=/home/tokenwizard/tmp/lmd/lora', '--logging_dir=/home/tokenwizard/tmp/lmd/lora/logs', '--save_model_as=safetensors', '--output_name=LMD_RealisticVision51_v1', '--lr_scheduler_num_cycles=1', '--max_data_loader_n_workers=1', '--learning_rate=0.0001', '--lr_scheduler=constant', '--train_batch_size=4', '--max_train_steps=1250', '--save_every_n_epochs=1', '--mixed_precision=bf16', '--save_precision=bf16', '--caption_extension=.txt', '--cache_latents', '--optimizer_type=AdamW8bit', '--max_data_loader_n_workers=1', '--clip_skip=2', '--keep_tokens=1', '--bucket_reso_steps=64', '--shuffle_caption', '--xformers', '--bucket_no_upscale', '--noise_offset=0.0']' returned non-zero exit status 1.
I tried this suggestion from the instructions in the output to set the CUDA version to 118 which matches what we are using for kohya-ss, and also exported my LD_LIBRARY_PATH:
kohya_ss git:(master) ✗ echo $LD_LIBRARY_PATH
:/opt/cuda/lib64:/opt/cuda/targets/x86_64-linux/lib
kohya_ss git:(master) ✗ echo $BNB_CUDA_VERSION
118
As you can see, the libraries are there in the path:
kohya_ss git:(master) ✗ ls /opt/cuda/targets/x86_64-linux/lib | grep libcudart.so
libcudart.so
libcudart.so.12
libcudart.so.12.2.53
So I'm unclear why bitsandbytes is complaining about the libraries not being there:
/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/cv2/../../lib64::/opt/cuda/lib64:/opt/cuda/targets/x86_64-linux/lib did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
Here is the full Bug Report output form bitsandbytes:
(KohyaSS) kohya_ss git:(master) ✗ python -m bitsandbytes
False
===================================BUG REPORT===================================
/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: Welcome to bitsandbytes. For bug reports, please run
python -m bitsandbytes
warn(msg)
================================================================================
/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: Found duplicate ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] files: {PosixPath('/opt/cuda/targets/x86_64-linux/lib/libcudart.so'), PosixPath('/opt/cuda/lib64/libcudart.so')}.. We select the PyTorch default libcudart.so, which is {torch.version.cuda},but this might missmatch with the CUDA version that is needed for bitsandbytes.To override this behavior set the BNB_CUDA_VERSION=<version string, e.g. 122> environmental variableFor example, if you want to use the CUDA version 122BNB_CUDA_VERSION=122 python ...OR set the environmental variable in your .bashrc: export BNB_CUDA_VERSION=122In the case of a manual override, make sure you set the LD_LIBRARY_PATH, e.g.export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.2
warn(msg)
/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:166: UserWarning: :/opt/cuda/lib64:/opt/cuda/targets/x86_64-linux/lib did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
warn(msg)
The following directories listed in your path were found to be non-existent: {PosixPath('/org/freedesktop/DisplayManager/Session1')}
The following directories listed in your path were found to be non-existent: {PosixPath('/org/freedesktop/DisplayManager/Seat0')}
The following directories listed in your path were found to be non-existent: {PosixPath('//debuginfod.archlinux.org '), PosixPath('https')}
The following directories listed in your path were found to be non-existent: {PosixPath('/usr/${LIB}/libgtk3-nocsd.so.0')}
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
The following directories listed in your path were found to be non-existent: {PosixPath('/usr/local/cuda/lib64')}
DEBUG: Possible options found for libcudart.so: set()
CUDA SETUP: PyTorch settings found: CUDA_VERSION=118, Highest Compute Capability: 8.9.
CUDA SETUP: To manually override the PyTorch CUDA version please see:https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
CUDA SETUP: Loading binary /home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so...
libcusparse.so.11: cannot open shared object file: No such file or directory
CUDA SETUP: Problem: The main issue seems to be that the main CUDA runtime library was not detected.
CUDA SETUP: Solution 1: To solve the issue the libcudart.so location needs to be added to the LD_LIBRARY_PATH variable
CUDA SETUP: Solution 1a): Find the cuda runtime library via: find / -name libcudart.so 2>/dev/null
CUDA SETUP: Solution 1b): Once the library is found add it to the LD_LIBRARY_PATH: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:FOUND_PATH_FROM_1a
CUDA SETUP: Solution 1c): For a permanent solution add the export from 1b into your .bashrc file, located at ~/.bashrc
CUDA SETUP: Solution 2: If no library was found in step 1a) you need to install CUDA.
CUDA SETUP: Solution 2a): Download CUDA install script: wget https://github.com/TimDettmers/bitsandbytes/blob/main/cuda_install.sh
CUDA SETUP: Solution 2b): Install desired CUDA version to desired location. The syntax is bash cuda_install.sh CUDA_VERSION PATH_TO_INSTALL_INTO.
CUDA SETUP: Solution 2b): For example, "bash cuda_install.sh 113 ~/local/" will download CUDA 11.3 and install into the folder ~/local
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 187, in _run_module_as_main
mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
File "/usr/lib/python3.10/runpy.py", line 146, in _get_module_details
return _get_module_details(pkg_main_name, error)
File "/usr/lib/python3.10/runpy.py", line 110, in _get_module_details
__import__(pkg_name)
File "/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/__init__.py", line 6, in <module>
from . import cuda_setup, utils, research
File "/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/research/__init__.py", line 1, in <module>
from . import nn
File "/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/research/nn/__init__.py", line 1, in <module>
from .modules import LinearFP8Mixed, LinearFP8Global
File "/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/research/nn/modules.py", line 8, in <module>
from bitsandbytes.optim import GlobalOptimManager
File "/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/optim/__init__.py", line 6, in <module>
from bitsandbytes.cextension import COMPILED_WITH_CUDA
File "/home/tokenwizard/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 20, in <module>
raise RuntimeError('''
RuntimeError:
CUDA Setup failed despite GPU being available. Please run the following command to get more information:
python -m bitsandbytes
Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
Bitsandbytes was not supported windows before, but my method can support windows.(yuhuang) 1 open folder J:\StableDiffusion\sdwebui,Click the address bar of the folder and enter CMD or WIN+R, CMD 。enter,cd /d J:\StableDiffusion\sdwebui 2 J:\StableDiffusion\sdwebui\py310\python.exe -m pip uninstall bitsandbytes
3 J:\StableDiffusion\sdwebui\py310\python.exe -m pip uninstall bitsandbytes-windows
4 J:\StableDiffusion\sdwebui\py310\python.exe -m pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.1-py3-none-win_amd64.whl
Replace your SD venv directory file(python.exe Folder) here(J:\StableDiffusion\sdwebui\py310)
OR you are Linux distribution (Ubuntu, MacOS, etc.)system ,AND CUDA Version: 11.X.
Bitsandbytes can support ubuntu.(yuhuang) 1 open folder J:\StableDiffusion\sdwebui,Click the address bar of the folder and enter CMD or WIN+R, CMD 。enter,cd /d J:\StableDiffusion\sdwebui 2 J:\StableDiffusion\sdwebui\py310\python.exe -m pip uninstall bitsandbytes
3 J:\StableDiffusion\sdwebui\py310\python.exe -m pip uninstall bitsandbytes-windows
4 J:\StableDiffusion\sdwebui\py310\python.exe -m pip install https://github.com/TimDettmers/bitsandbytes/releases/download/0.41.0/bitsandbytes-0.41.0-py3-none-any.whl
Replace your SD venv directory file(python.exe Folder) here(J:\StableDiffusion\sdwebui\py310)
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Below is the output from python -m bitsandbytes. Under it I show that libcudart exists in the folder. I'm not sure how to proceed. I've tried using the BNB_CUDA_VERSION env variable set to "", "12", "120", and "121". Torch is using 12.1 so that's what bitsandbytes should ideally use.
EDIT: added diagnostics from Torch as well.