bitsandbytes-foundation / bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.
https://huggingface.co/docs/bitsandbytes/main/en/index
MIT License
6.18k stars 620 forks

Failed to train LoRA and got this error: FileNotFoundError: Could not find module 'C:\Kohya\Kohya_ss\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda116.dll' (or one of its dependencies). Try using the full path with constructor syntax. #161

Closed SilveonSenpai closed 9 months ago

SilveonSenpai commented 1 year ago

FileNotFoundError: Could not find module 'C:\Kohya\Kohya_ss\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda116.dll' (or one of its dependencies). Try using the full path with constructor syntax.

Traceback (most recent call last):
  File "C:\pyton\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\pyton\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Kohya\Kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "C:\Kohya\Kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
    args.func(args)
  File "C:\Kohya\Kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
    simple_launcher(args)
  File "C:\Kohya\Kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\Kohya\Kohya_ss\venv\Scripts\python.exe', 'train_db.py', '--enable_bucket', '--pretrained_model_name_or_path=E:/stable/stable-diffusion-webui/models/Stable-diffusion/grapefruitHentaiModel_grapefruitv4.safetensors', '--train_data_dir=C:/calamitas/Calamitas Lora/image', '--resolution=512,512', '--output_dir=C:/calamitas/Calamitas Lora/model', '--logging_dir=C:/calamitas/Calamitas Lora/log', '--save_model_as=safetensors', '--output_name=Calamitas', '--max_data_loader_n_workers=1', '--learning_rate=0.0001', '--lr_scheduler=constant', '--train_batch_size=1', '--max_train_steps=1496', '--save_every_n_epochs=1', '--mixed_precision=no', '--save_precision=fp16', '--seed=1234', '--caption_extension=.txt', '--cache_latents', '--optimizer_type=AdamW', '--max_data_loader_n_workers=1', '--clip_skip=2', '--bucket_reso_steps=64', '--mem_eff_attn', '--gradient_checkpointing', '--use_8bit_adam', '--bucket_no_upscale']' returned non-zero exit status 1.

I checked, and this file is actually in the folder.
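Note that on Windows this exact FileNotFoundError is raised not only when the DLL file itself is absent, but also when one of the DLL's own dependencies (such as the CUDA 11.6 runtime libraries) cannot be resolved, which matches the "(or one of its dependencies)" wording in the message. A minimal diagnostic sketch to tell the two cases apart (`diagnose_dll` is a hypothetical helper, not part of bitsandbytes; the path is the one from the traceback):

```python
import ctypes
import os

def diagnose_dll(path):
    """Report whether a shared library is absent, loadable, or present but unloadable."""
    if not os.path.exists(path):
        return "missing"  # the file itself is not on disk
    try:
        ctypes.CDLL(path)  # attempts to resolve the library AND its dependencies
        return "ok"
    except OSError:
        # file exists, but loading failed: most likely a missing dependency
        return "present but failed to load"

# Path taken from the traceback above
print(diagnose_dll(r"C:\Kohya\Kohya_ss\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda116.dll"))
```

If this prints "present but failed to load", the DLL is there but a dependency (typically a CUDA runtime DLL) is not on the loader's search path.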

apoot9999 commented 1 year ago

I encountered the same problem, and I fixed it by reinstalling my Windows Python environment.
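For reference, rebuilding the environment could look roughly like the following (a sketch for Windows cmd; the paths are taken from the traceback, and the assumption that the repo ships a `requirements.txt` covering bitsandbytes is mine, not confirmed in this thread):

```shell
:: Recreate the Kohya_ss virtual environment from scratch
cd C:\Kohya\Kohya_ss
rmdir /s /q venv
python -m venv venv
venv\Scripts\activate
python -m pip install --upgrade pip
pip install -r requirements.txt
```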

FinaBro69 commented 1 year ago

I have the same problem as you, but I know how to solve it....

Go to Training parameters, change the optimizer from AdamW8bit to AdamW, and also tick the two options Gradient checkpointing and Memory efficient attention.

Hope it helps you
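This GUI change corresponds to the launcher flags visible in the traceback: the command already passes `--optimizer_type=AdamW` but also `--use_8bit_adam`, and it is the 8-bit optimizer that pulls in libbitsandbytes_cuda116.dll. Dropping that flag should avoid loading the DLL at all (a sketch; the full command is abbreviated with `...`):

```shell
# Before: 8-bit Adam requires the bitsandbytes CUDA DLL
python train_db.py ... --optimizer_type=AdamW --use_8bit_adam

# After: plain AdamW, no bitsandbytes dependency
python train_db.py ... --optimizer_type=AdamW
```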

github-actions[bot] commented 10 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.