Closed patryk-bartkowiak-nitid closed 5 months ago
Thanks for the detailed thread. Can you pin me a version that was working as expected for you?
I am asking because none of those scripts went through significant logical changes in the past 7 days.
Yeah that's the thing, I am unable to restore the environment perfectly and I'm blocked right now, not sure where the issue is :/
Ah then it's a bit of a pity. In any case, please do ping me here if you're able to give me a pinpointed version. I am happy to look further from there :-)
Anyway going through README guide it's not working properly, I am happy to meet or whatever to solve this issue :)
README guide? Do you mean the commands from https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/README_sdxl.md don't work? Can you provide a fully reproducible snippet for me?
I am happy to meet or whatever to solve this issue :)
Sorry, we cannot do that. As maintainers, we need to be cognizant of our time and keep the discussions as open as possible.
I mean command from https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/README_sdxl.md combined with https://github.com/huggingface/diffusers/blob/main/scripts/convert_diffusers_sdxl_lora_to_webui.py
Not sure on what part of the pipeline there is an issue, like I said I am able to use LoRA using code for inference that you provided in README, but can't correctly convert it. Might be both the conversion itself or LoRA has some different properties that conversion script can't handle.
Let me send you full pipeline for you to reproduce the issue, I will try to include as many details as possible:
1. Docker image:
pytorch/pytorch:2.0.0-cuda11.7-cudnn8-devel
2. Install dependencies:
apt update
apt install vim git tmux ffmpeg libsm6 libxext6 wget python3 python3-venv libgl1 libglib2.0-0 google-perftools -y
git clone https://github.com/huggingface/diffusers.git
cd diffusers
pip install -e .
cd examples/dreambooth
pip install -r requirements.txt
accelerate config default
pip install bitsandbytes xformers==0.0.19
3. Download baseline SDXL model:
wget https://civitai.com/api/download/models/333449 -O DreamShaperXL.safetensors
4. Convert `.safetensors` to suitable format using python:
import diffusers
pipe = diffusers.StableDiffusionXLPipeline.from_single_file("DreamShaperXL.safetensors")
pipe.save_pretrained("DreamShaperXL")
5. Train LoRA (6 images with the same woman on white background):
export MODEL_NAME="DreamShaperXL"
export INSTANCE_DIR="data/claire"
export MAX_TRAIN_STEPS=5000
export CHECKPOINTING_STEPS=500
export OUTPUT_DIR="outputs/$(basename ${MODEL_NAME})_$(basename ${INSTANCE_DIR})"
export CUDA_LAUNCH_BLOCKING=1
export TORCH_USE_CUDA_DSA=1
printf "\n\nTraining Claire model with $MODEL_NAME on $INSTANCE_DIR, saving to $OUTPUT_DIR\n\n"
accelerate launch diffusers/examples/dreambooth/train_dreambooth_lora_sdxl.py \
  --instance_prompt="photo of wff woman, isolated on white background" \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --resolution=1024 \
  --train_batch_size=2 \
  --gradient_accumulation_steps=4 \
  --learning_rate=1e-4 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=$MAX_TRAIN_STEPS \
  --seed="0" \
  --train_text_encoder \
  --enable_xformers_memory_efficient_attention \
  --gradient_checkpointing \
  --use_8bit_adam \
  --checkpointing_steps=$CHECKPOINTING_STEPS
6. Convert to Kohya format:
python /diffusers/scripts/convert_diffusers_sdxl_lora_to_webui.py outputs/DreamShaperXL_claire/pytorch_lora_weights.safetensors test.safetensors
7. Move to A1111:
mv test.safetensors stable-diffusion-webui/models/Lora/
As mentioned I need to know a version that was working as expected for you.
CC: @linoytsaban @apolinario here.
Well because I can't really provide it - can we just focus on the current version that is probably not working properly?
I also considered that A1111 itself might be broken, but my previous LoRAs still work there, so I think it has to be something in this pipeline.
That makes it a thousand times more difficult for us to make progress here actually, hence I am a bit adamant on it. To be able to pinpoint the issue: can we say the trained LoRA provides expected results when the inference is done from diffusers?
Your initial issue description suggests so. So, I quite suspect that it's the conversion script that's the culprit here.
Yes, LoRA provides expected results when the inference is done from diffusers.
When it's done in A1111 it actually changes the output image (same seed), but not in the way it should; it looks like it's just adding some noise at the beginning of the generation process. I will send an example in 3 minutes.
Then it's quite likely that the conversion script is the problem as mentioned. So, I will let @apolinario and @linoytsaban comment further (as they are the developers of that script).
A1111 Config:
photo of wff woman, rides gondola in Venice,
Negative prompt: text, watermark, low quality, medium quality, blurry, censored, wrinkles, deformed, mutated text, watermark, low quality, medium quality, blurry, censored, wrinkles, deformed, mutated, BadDream, UnrealisticDream
Steps: 7, Sampler: DPM++ SDE Karras, CFG scale: 2, Seed: 420, Size: 1024x1024, Model hash: 676f0d60c8, Model: DreamShaperXL, Version: v1.7.0
Image without any LoRA:
Image with previously trained LoRA that works - trained for 8000 iterations with batch_size=1:
Image with new LoRA - trained for 4000 iterations with batch_size=2:
Also adding an image generated locally with new LoRA that doesn't work in A1111 - trained for 4000 iterations with batch_size=2
import torch
from diffusers import DiffusionPipeline
pretrained_model = "DreamShaperXL"
lora_weights = "./outputs/DreamShaperXL_claire/checkpoint-4000/"
prompt = "photo of wff woman, rides gondola in Venice,"
negative_prompt = "text, watermark, low quality, medium quality, blurry, censored, wrinkles, deformed, mutated text, watermark, low quality, medium quality, blurry, censored, wrinkles, deformed, mutated"
pipe = DiffusionPipeline.from_pretrained(pretrained_model, torch_dtype=torch.float32)
pipe = pipe.to("cuda")
pipe.load_lora_weights(lora_weights)
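image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=50,
    # The pipeline takes a torch.Generator rather than a `seed` argument.
    generator=torch.Generator(device="cuda").manual_seed(420),
).images[0]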
image.save("lora_inference.png")
As you can see it's much closer, of course quality is not good enough because in AUTOMATIC1111 there are some additional things that make it look better like negative embeddings etc.
I tried to load exact same model after the conversion in ComfyUI and it works properly, but I found this issue from a week ago: https://github.com/huggingface/diffusers/issues/6777
Do you think it's related? Did any of the LoRA keys change? It looks like A1111 does not support it yet.
Could be related but the LoRA keys didn’t change. We have got multiple tests ensuring that.
Hey @patryk-bartkowiak-nitid, thanks for creating this issue! Just to make sure I understand, right now comfyUI conversion works fine but A1111 doesn't?
Exactly
Hmm, I'm not sure what has caused this since we haven't made any changes to the conversion script, and the changes made to the training script should not affect that. @sayakpaul was there any change in the peft keys maybe that would make the conversion script incompatible?
No, I don’t think so. There were no changes to the training script or the underlying utils that would lead to key incompatibilities.
Could this have had an impact? https://github.com/huggingface/diffusers/pull/6895
Pretty sure not as it only touches the model card which has nothing to do with the state dict.
Any ideas @sayakpaul @linoytsaban ? Still trying to figure this out
Sorry but I don't work with A1111 or ComfyUI either. And I cannot offer any help related to conversion to non-diffusers formats right now.
@patryk-bartkowiak-nitid can you check the state_dict of the previous LoRAs that worked fine on A1111 and the new ones and see if there are differences (assuming there are if it's incompatible) and what they are?
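One way to run that comparison, as a minimal sketch: load both files and take set differences of their key names. `load_file` is the loader from `safetensors.torch`; the helper names `diff_keys` and `load_state_dict`, and the sample key names, are made up for illustration and are not taken from either file.

```python
from typing import Dict, Set, Tuple


def diff_keys(
    before: Dict[str, object], after: Dict[str, object]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (only_in_before, only_in_after, shared) key sets."""
    b, a = set(before), set(after)
    return b - a, a - b, b & a


def load_state_dict(path: str) -> Dict[str, object]:
    # Imported lazily so diff_keys stays usable without safetensors installed.
    from safetensors.torch import load_file

    return load_file(path)


# Hypothetical key names, only to show the shape of the result.
old = {"unet_attn1_to_k.lora_down.weight": 0, "unet_attn1_to_k.alpha": 0}
new = {"unet_attn1_to_k_lora.down.weight": 0}

only_old, only_new, shared = diff_keys(old, new)
print(sorted(only_old))
print(sorted(only_new))
print(len(shared))
```

With real files, the same call would be `diff_keys(load_state_dict("claire.safetensors"), load_state_dict("test.safetensors"))`; printing a few entries from each set is usually enough to spot a naming pattern.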
I compared the converted .safetensors files and already worked on restoring the exact same structure. This is how I restored it, so you can see the difference between them:
import torch
from safetensors.torch import load_file

before = load_file("claire.safetensors")
after = load_file("test.safetensors")

# Iterate over a snapshot of the keys, since the dict is modified inside the loop.
for k in list(after.keys()):
    v = after[k]
    del after[k]
    k = k.replace("lora.down", "lora_down")
    k = k.replace("lora.up", "lora_up")
    k = k.replace("to_k_lora", "to_k.lora")
    k = k.replace("_lora_down", ".lora_down")
    k = k.replace("_lora_up", ".lora_up")
    after[k] = v

# Add an alpha entry next to every lora_up weight.
for layer_name in [x for x in after.keys() if x.endswith("lora_up.weight")]:
    layer_name = layer_name.replace("lora_up.weight", "alpha")
    layer_name = layer_name.replace("_alpha", ".alpha")
    after[layer_name] = torch.tensor(4)
Now I have two .safetensors files with the exact same keys and shapes, but different weight values of course.
Before the renaming:
intersection = set(before.keys()) & set(after.keys())
len(before), len(after), len(intersection)
(2208, 1648, 528)
After the renaming:
intersection = set(before.keys()) & set(after.keys())
len(before), len(after), len(intersection)
(2208, 2208, 2208)
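Matching key counts do not by themselves show that the shapes match. A small check along these lines could confirm it; this is a sketch, not part of the original comparison. `shape_mismatches` is a hypothetical helper, and `FakeTensor` is a stand-in so the example runs without torch (real tensors expose `.shape` the same way).

```python
from collections import namedtuple
from typing import Dict, List, Tuple

# Stand-in for a tensor: anything with a `.shape` attribute works here.
FakeTensor = namedtuple("FakeTensor", ["shape"])


def shape_mismatches(before: Dict, after: Dict) -> List[Tuple[str, tuple, tuple]]:
    """List (key, shape_before, shape_after) for shared keys whose shapes differ."""
    mismatches = []
    for key in sorted(set(before) & set(after)):
        shape_b = tuple(before[key].shape)
        shape_a = tuple(after[key].shape)
        if shape_b != shape_a:
            mismatches.append((key, shape_b, shape_a))
    return mismatches


# Hypothetical layer names and shapes, for illustration only.
ref = {
    "blk.lora_down.weight": FakeTensor((4, 320)),
    "blk.lora_up.weight": FakeTensor((320, 4)),
}
new = {
    "blk.lora_down.weight": FakeTensor((4, 320)),
    "blk.lora_up.weight": FakeTensor((320, 8)),
}
result = shape_mismatches(ref, new)
print(result)
```

An empty list for the two real files would support the "same keys and shapes" observation, leaving the weight values themselves as the remaining difference.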
I also encountered the same problem
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@sayakpaul is this the fix? https://github.com/huggingface/diffusers/pull/7435
Yeah could be.
Assuming this is fixed in https://github.com/huggingface/diffusers/pull/7435. Let us know if it is still an issue, and we will reopen this!
Describe the bug
I have been using train_dreambooth_lora_sdxl.py and convert_diffusers_sdxl_lora_to_webui.py to train a LoRA for a specific character. It was working until about a week ago. I am using the same baseline model and the same data.
I realized that all the previous LoRA files were 29967176 bytes; now the file is 29889672 bytes and has fewer keys in the dict when I load it as a plain .safetensors file.
I realized that it works fine with the inference guide in the README:
But after I convert it and load it into A1111 (it loads correctly) it doesn't work anymore; it looks like it's only adding some noise to the output.
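The file size and key count mentioned above can be checked without loading any tensors, since a .safetensors file starts with an 8-byte little-endian header length followed by a JSON header describing every tensor. The sketch below reads only that header with the standard library; the function names `read_safetensors_header` and `summarize` are made up for illustration.

```python
import json
import struct
from typing import Dict


def read_safetensors_header(path: str) -> Dict[str, dict]:
    """Read only the JSON header of a .safetensors file (no tensor data)."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is optional free-form metadata, not a tensor entry.
    header.pop("__metadata__", None)
    return header


def summarize(path: str) -> Dict[str, int]:
    """Count tensors and the bytes of tensor data they occupy."""
    header = read_safetensors_header(path)
    tensor_bytes = sum(
        entry["data_offsets"][1] - entry["data_offsets"][0]
        for entry in header.values()
    )
    return {"num_tensors": len(header), "tensor_bytes": tensor_bytes}
```

Calling `summarize(...)` on an old and a new LoRA file gives two small dicts to compare; a drop in `num_tensors` with near-identical `tensor_bytes` would suggest that only small entries are missing.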
I already tried checking out previous commits of diffusers, torch and torchvision, but nothing really helps. I am still not able to use the LoRA in A1111.
Reproduction
Code to train LoRA:
Code to convert to A1111 format
Logs
System Info
Who can help?
@yiyixuxu @sayakpaul @DN6 @patrickvonplaten