Closed madrooky closed 4 months ago
Same issue here on Windows 10
dreambooth version d8e01e1fd13dd4b6afadbb0382875f905ee5f429
version: v1.6.0
python: 3.10.6
torch: 2.1.0.dev20230712+cu118
xformers: N/A
The following directories listed in your path were found to be non-existent: {WindowsPath('tmp/restart')}
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
The following directories listed in your path were found to be non-existent: {WindowsPath('/usr/local/cuda/lib64')}
DEBUG: Possible options found for libcudart.so: set()
CUDA SETUP: PyTorch settings found: CUDA_VERSION=118, Highest Compute Capability: 8.6.
CUDA SETUP: To manually override the PyTorch CUDA version please see:https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
CUDA SETUP: Loading binary C:\Users\wasa4\stable-diffusion-webui\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cuda118.so...
argument of type 'WindowsPath' is not iterable
CUDA SETUP: Problem: The main issue seems to be that the main CUDA runtime library was not detected.
CUDA SETUP: Solution 1: To solve the issue the libcudart.so location needs to be added to the LD_LIBRARY_PATH variable
CUDA SETUP: Solution 1a): Find the cuda runtime library via: find / -name libcudart.so 2>/dev/null
CUDA SETUP: Solution 1b): Once the library is found add it to the LD_LIBRARY_PATH: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:FOUND_PATH_FROM_1a
CUDA SETUP: Solution 1c): For a permanent solution add the export from 1b into your .bashrc file, located at ~/.bashrc
CUDA SETUP: Solution 2: If no library was found in step 1a) you need to install CUDA.
CUDA SETUP: Solution 2a): Download CUDA install script: wget https://github.com/TimDettmers/bitsandbytes/blob/main/cuda_install.sh
CUDA SETUP: Solution 2b): Install desired CUDA version to desired location. The syntax is bash cuda_install.sh CUDA_VERSION PATH_TO_INSTALL_INTO.
CUDA SETUP: Solution 2b): For example, "bash cuda_install.sh 113 ~/local/" will download CUDA 11.3 and install into the folder ~/local
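The log above shows bitsandbytes trying to load a Linux-style shared object (libbitsandbytes_cuda118.so) on Windows. As a rough diagnostic sketch, the snippet below lists which native binaries an installed copy of the package ships. The helper name list_native_binaries is made up for illustration, and the idea that a Windows-capable build would ship .dll files is an assumption, not something the log confirms.

```python
import importlib.util
from pathlib import Path


def list_native_binaries(package_dir):
    """Group the shared-library files found directly in package_dir by suffix."""
    found = {".dll": [], ".so": []}
    for entry in sorted(Path(package_dir).iterdir()):
        if entry.is_file() and entry.suffix in found:
            found[entry.suffix].append(entry.name)
    return found


if __name__ == "__main__":
    # find_spec locates the package on disk without importing it,
    # so the failing CUDA setup code is not triggered.
    spec = importlib.util.find_spec("bitsandbytes")
    if spec is None or not spec.submodule_search_locations:
        print("bitsandbytes is not installed in this environment")
    else:
        package_dir = list(spec.submodule_search_locations)[0]
        print(package_dir, list_native_binaries(package_dir))
```

Run it with the same python.exe that the web UI uses, otherwise it inspects a different environment.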
I've had the same issues, among others. The extension is completely nonfunctional at this time, at least for training SDXL models. I was somehow able to bypass the above issue, but training still wouldn't work. I suspect the case is the same for 2.0 and 1.5.
Ah yes, I was training with 1.5.
bitsandbytes did not support Windows before, but my method makes it work on Windows. (yuhuang)
1 Open the folder J:\StableDiffusion\sdwebui, click the address bar of the folder and type CMD, then press Enter. Alternatively press WIN+R, type CMD, press Enter, and run: cd /d J:\StableDiffusion\sdwebui
2 J:\StableDiffusion\sdwebui\py310\python.exe -m pip uninstall bitsandbytes
3 J:\StableDiffusion\sdwebui\py310\python.exe -m pip uninstall bitsandbytes-windows
4 J:\StableDiffusion\sdwebui\py310\python.exe -m pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.1-py3-none-win_amd64.whl
In the commands above, replace J:\StableDiffusion\sdwebui\py310 with your own SD venv directory (the folder that contains python.exe).
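The uninstall and reinstall steps can also be scripted. This is only a sketch of the same three pip commands: the helper names build_pip_commands and run_all are hypothetical, and the -y flag (which skips pip's confirmation prompt) is an addition that is not in the original steps.

```python
import subprocess
import sys

# Wheel URL taken from step 4 above.
WHEEL_URL = (
    "https://github.com/jllllll/bitsandbytes-windows-webui/releases/"
    "download/wheels/bitsandbytes-0.41.1-py3-none-win_amd64.whl"
)


def build_pip_commands(python_exe, wheel_url=WHEEL_URL):
    """Return the three pip invocations from the steps as argument lists."""
    pip = [python_exe, "-m", "pip"]
    return [
        pip + ["uninstall", "-y", "bitsandbytes"],
        pip + ["uninstall", "-y", "bitsandbytes-windows"],
        pip + ["install", wheel_url],
    ]


def run_all(commands):
    """Run each command in order and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run: print the commands for the interpreter running this script.
    for command in build_pip_commands(sys.executable):
        print(" ".join(command))
    # To apply them, call run_all(build_pip_commands(sys.executable)).
```

Running the script with the web UI's own python.exe makes sys.executable point at the right environment, so no path needs to be edited by hand.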
Or, if you are on a Linux distribution (Ubuntu, etc.) with CUDA version 11.x:
bitsandbytes supports Ubuntu. (yuhuang)
1 Open the folder J:\StableDiffusion\sdwebui, click the address bar of the folder and type CMD, then press Enter. Alternatively press WIN+R, type CMD, press Enter, and run: cd /d J:\StableDiffusion\sdwebui
2 J:\StableDiffusion\sdwebui\py310\python.exe -m pip uninstall bitsandbytes
3 J:\StableDiffusion\sdwebui\py310\python.exe -m pip uninstall bitsandbytes-windows
4 J:\StableDiffusion\sdwebui\py310\python.exe -m pip install https://github.com/TimDettmers/bitsandbytes/releases/download/0.41.0/bitsandbytes-0.41.0-py3-none-any.whl
In the commands above, replace J:\StableDiffusion\sdwebui\py310 with your own SD venv directory (the folder that contains the python executable).
Hello all, I have the same issues here (Automatic1111 + sd_dreambooth_ext). I think the main problem is this error ("Zugriff verweigert" is German for "Access is denied"):
PermissionError: [WinError 5] Zugriff verweigert: 'S:\\SD_core\\stable-diffusion-webui\\models\\Lora\\test_210.safetensors'
I tried swumagic's comment and reinstalled bitsandbytes-windows, but the same PermissionError happens. After checking the Python code, I found the issue is in stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\diff_lora_to_sd_lora.py
line 120
os.remove(path)
Based on the code, my problem is that Python cannot remove the file, so I replaced the code with this (it does not delete the file, it just renames it):
safetensors.torch.save_file(new_model_dict, conv_path, metadata=metadata)
# Do not delete the file at path; rename it with a timestamp, then move the new file to path
import datetime
current_datetime = datetime.datetime.now()
str_date = current_datetime.strftime("%Y-%m-%d_%H%M%S")
os.rename(path, path + str_date + ".temp")
os.rename(conv_path, path)
and now no error pops up. This is just a workaround, and I hope it can be fixed gracefully in the future.
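For anyone who wants to try the rename workaround outside the extension first, here is a standalone sketch of the same idea. The function name swap_in_converted_file is hypothetical and is not part of the extension; path and conv_path play the same roles as in the snippet above.

```python
import datetime
import os


def swap_in_converted_file(path, conv_path):
    """Move the old file at path aside with a timestamp, then move conv_path to path.

    Returns the backup path, or None if there was no old file to move aside.
    """
    stamp = datetime.datetime.now().strftime("%Y-%m-%d_%H%M%S")
    backup_path = f"{path}{stamp}.temp"
    if os.path.exists(path):
        # Rename instead of delete, as in the workaround above.
        os.rename(path, backup_path)
    else:
        backup_path = None
    os.rename(conv_path, path)
    return backup_path
```

Note that the .temp backups are never cleaned up, so they accumulate in the Lora folder until removed by hand.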
I can't get to my PC for logs but I'd like to add that the latest Dreambooth on Auto1111 appears completely broken.
Since a fresh install of Auto1111 and Dreambooth yesterday:
1) I am unable to create a Lora in Automatic1111 with the Dreambooth extension, with the same error as above. (PermissionError: [WinError 5] 'C:\... etc)
update: Can confirm the above code fix from 'donlinglok' has worked to resolve this temporarily. The resulting Lora still looks burnt though, see problem 2 below.
2) Any model I make with Dreambooth is broken. Any classification images I generate look horrible and 'overcooked' without having done any training, or done anything except for clicking 'create model' with any 1.5 model I've tried.
3) I cannot make a Dreambooth model at all without having an internet connection. I think it might be trying to check huggingface or something and failing if it can't?
I'm quite new to Dreambooth so could be making an error somewhere but others seem to be having the same issues. Would hugely appreciate some fixes!
I can try to provide more logs or info if useful.
Thanks
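On the offline problem in point 3: the Hugging Face libraries read two environment variables, HF_HUB_OFFLINE and TRANSFORMERS_OFFLINE, that tell them not to contact the Hub. Whether the Dreambooth extension then works offline is untested here, so treat this as something to try rather than a fix. The helper name enable_hf_offline_mode is made up for this sketch.

```python
import os


def enable_hf_offline_mode(env=None):
    """Set the Hugging Face offline flags in env (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Read by huggingface_hub: do not make network calls to the Hub.
    env["HF_HUB_OFFLINE"] = "1"
    # Read by transformers: use only locally cached files.
    env["TRANSFORMERS_OFFLINE"] = "1"
    return env
```

The variables need to be set before the libraries are imported, so in practice it may be simpler to set them in the shell or launcher script that starts the web UI.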
Hi, thanks for the information. Actually, I am a newbie with Dreambooth and have difficulty tuning the training parameters. Just for your reference, I am now using this for training: https://github.com/Akegarasu/lora-scripts
I have this problem too. When I try to resume training of a locally saved Lora, I get a PermissionError. Apparently it thinks my Lora file is on Hugging Face. I also get an occasional WinError 5.
This issue is stale because it has been open 5 days with no activity. Remove stale label or comment or this will be closed in 5 days
This is still an issue for me. I've tried reverting to 1.0.14 (à la #1405), 3 fresh installs, the bitsandbytes fix in #1371, and this fix. I've also made sure the folder wasn't read-only, run SD as an admin, and tried a venv and other Python envs.
Still having this issue on a fresh install of A1111 and the dreambooth extension. Both are up to date, but the issue still occurs.
Saving weights/samples...: : 0it [00:00, ?it/s] Traceback (most recent call last): | 0/1 [00:00<?, ?it/s]
File "F:\StableDiffusionProjects\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\ui_functions.py", line 735, in start_training
result = main(class_gen_method=class_gen_method)
File "F:\StableDiffusionProjects\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\train_dreambooth.py", line 1917, in main
return inner_loop()
File "F:\StableDiffusionProjects\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\memory.py", line 126, in decorator
return function(batch_size, grad_size, prof, *args, **kwargs)
File "F:\StableDiffusionProjects\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\train_dreambooth.py", line 1874, in inner_loop
check_save(True)
File "F:\StableDiffusionProjects\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\train_dreambooth.py", line 1027, in check_save
save_weights(
File "F:\StableDiffusionProjects\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\train_dreambooth.py", line 1433, in save_weights
convert_diffusers_to_kohya_lora(lora_save_file, meta, args.lora_weight)
File "F:\StableDiffusionProjects\stable-diffusion-webui\extensions\sd_dreambooth_extension\dreambooth\diff_lora_to_sd_lora.py", line 120, in convert_diffusers_to_kohya_lora
os.remove(path)
PermissionError: [WinError 5] Access is denied: 'F:\\StableDiffusionProjects\\stable-diffusion-webui\\models\\Lora\\lora_Yuley-v1.3.4_I105H_900.safetensors'
The issue happens in convert_diffusers_to_kohya_lora() when it tries to remove the old Lora file and replace it with a new one: os.remove() throws an exception. I suspect that something is still holding the Lora file open, so os.remove() fails with WinError 5. It is unclear whether a file handle was not closed or whether there is a race condition with another process. I bypassed it like this:
# Delete the file at path, move the new file to path
try:
    os.remove(path)
    os.rename(conv_path, path)
except OSError:
    # Workaround: ignore the failure (e.g. WinError 5) and leave the files as they are
    pass
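If the cause really is a short-lived lock or a race, retrying for a moment may be more useful than silently skipping the move. The following is a sketch under that assumption; the function name replace_with_retry and its attempts and delay parameters are hypothetical, and it will not help if the file stays locked for the whole training run.

```python
import os
import time


def replace_with_retry(conv_path, path, attempts=5, delay=0.5):
    """Try to move conv_path over path, retrying on PermissionError.

    Returns True on success and False if every attempt was denied.
    """
    for attempt in range(attempts):
        try:
            # os.replace overwrites an existing destination in one call.
            os.replace(conv_path, path)
            return True
        except PermissionError:
            # Wait before the next attempt, but not after the last one.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False
```

Unlike the bare try/except, the return value lets the caller log that the converted file was left at conv_path instead of failing without a trace.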
Is there an existing issue for this?
What happened?
Yesterday I updated Dreambooth and since then various things broke. I got it running again but there are still errors. Most importantly, I can't continue training Lora .pt files. It tries to load them from a Hugging Face repo.... silly...
Might be able to change that manually? I am looking through the settings but I am not even sure where and what to look for.
Note that I made a separate fresh install for troubleshooting. There is no problem in a fresh install until I install Dreambooth, and then things fall apart. I spent hours yesterday and today trying to fix it.
I just dropped the entire log as far as it reaches back from my current console. It runs, but it is insane how utterly broken it is. And once I think I have solved one thing, another one replaces it.
Steps to reproduce the problem
install SD, install dreambooth, try to use it in any way....
Commit and libraries
Dreambooth revision: 1a1d1621086a4725fda1200256f319c845dc7a8a
[!] xformers version 0.0.20 installed.
[+] torch version 2.0.1+cu118 installed.
[+] torchvision version 0.15.2+cu118 installed.
[+] accelerate version 0.23.0 installed.
[+] diffusers version 0.21.4 installed.
[+] transformers version 4.32.1 installed.
[+] bitsandbytes version 0.41.1 installed.
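For comparing environments between reports, the same version list can be collected with the standard library. This is a small sketch, not what the extension itself runs at startup; the helper name installed_versions is made up, and the package list simply mirrors the one above.

```python
from importlib import metadata

# Package names mirrored from the version list above.
PACKAGES = [
    "xformers", "torch", "torchvision", "accelerate",
    "diffusers", "transformers", "bitsandbytes",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

As with any environment check, it has to be run with the web UI's own Python interpreter to report the versions that the extension actually sees.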
Command Line Arguments
Console logs
Additional information
No response