kohya-ss / sd-webui-additional-networks

GNU Affero General Public License v3.0

How to use lora trained with sd-scripts with Diffusers? #148

Open kopyl opened 1 year ago

kopyl commented 1 year ago

Or any other tool to generate images programmatically in Python without installing the WebUI.

kohya-ss commented 1 year ago

The sd-scripts repo has a script for generating images, gen_img_diffusers.py: https://github.com/kohya-ss/sd-scripts

Unfortunately, the documentation for the script is in Japanese and outdated, but I think it will still help (please ignore the installation instructions in it). https://note.com/kohya_ss/n/n2693183a798e

kopyl commented 1 year ago

@kohya-ss I'm unable to get the same result as with your LoRA in WebUI. Same settings, different results.

Plus, I need some kind of pipeline so I don't have to reload the model each time I generate an image, as happens with the script.

kopyl commented 1 year ago

There is no info on your LoRA on https://note.com/kohya_ss/n/n2693183a798e :(

kohya-ss commented 1 year ago

Unfortunately there is no pipeline for LoRA currently...

Sorry for the lack of documentation. The command lines with LoRA look like the following.

single LoRA:

python gen_img_diffusers.py --ckpt name_to_ckpt.safetensors --n_iter 1 --scale 8 --steps 40 --outdir txt2img/samples --xformers --W 512 --H 512 --fp16 --sampler k_euler_a --network_module networks.lora --network_weights lora1.safetensors --network_mul 1.0 --max_embeddings_multiples 3 --clip_skip 2 --batch_size 1 --images_per_prompt 1 --prompt "beautiful scene --n negative prompt"

two LoRAs:

python gen_img_diffusers.py --ckpt name_to_ckpt.safetensors --n_iter 1 --scale 8 --steps 40 --outdir txt2img/samples --xformers --W 512 --H 512 --fp16 --sampler k_euler_a --network_module networks.lora networks.lora --network_weights lora1.safetensors lora2.safetensors --network_mul 1.0 0.8 --max_embeddings_multiples 3 --clip_skip 2 --batch_size 1 --images_per_prompt 1 --prompt "beautiful scene --n negative prompt"
kopyl commented 1 year ago

@kohya-ss no problem, I was able to run it with LoRA. Thanks for your answers :)

  1. Any way to get the same result as with your LoRA in WebUI from CLI?
  2. Any way to get the same result as with your LoRA in WebUI from Python like with the Diffusers library?

I tried it but the result is not the same as in WebUI :(

Currently, I'm using the WebUI API to generate images programmatically, but I want to get rid of it in favor of a more stable solution like Diffusers...

kopyl commented 1 year ago

@kohya-ss hey :)

kohya-ss commented 1 year ago

Sorry for the delay. I'm trying to figure out how to easily infer with the standard pipeline of Diffusers. Please wait a while.

kopyl commented 1 year ago

@kohya-ss no problem, thank you very much for the update, you're the best ❤

kopyl commented 1 year ago

@kohya-ss or maybe not so standard... Something which does not require a heavy AUTOMATIC1111 to do the inference.

I was trying to reverse engineer this, but had no luck https://github.com/replicate/lora-inference :(

kohya-ss commented 1 year ago

Hi, I wrote a document on how to use LoRA with Diffusers standard pipe. I will publish it soon, but this is a current copy. I think you can use any arguments for the pipe, and Long Prompt Weighting Stable Diffusion for the weighting.

I hope this helps you.

import torch
from diffusers import StableDiffusionPipeline
from networks.lora import LoRAModule, create_network_from_weights
from safetensors.torch import load_file

# if the ckpt is CompVis based, convert it to Diffusers beforehand with tools/convert_diffusers20_original_sd.py. See --help for more details.

model_id_or_dir = r"model_id_on_hugging_face_or_dir"
device = "cuda"

# create pipe
print(f"creating pipe from {model_id_or_dir}...")
pipe = StableDiffusionPipeline.from_pretrained(model_id_or_dir, revision="fp16", torch_dtype=torch.float16)
pipe = pipe.to(device)
vae = pipe.vae
text_encoder = pipe.text_encoder
unet = pipe.unet

# load lora networks
print("loading lora networks...")

lora_path1 = r"lora1.safetensors"
sd = load_file(lora_path1)   # If the file is .ckpt, use torch.load instead.
network1, sd = create_network_from_weights(0.5, None, vae, text_encoder, unet, sd)
network1.apply_to(text_encoder, unet)
network1.load_state_dict(sd)
network1.to(device, dtype=torch.float16)

# # You can merge the weights instead of apply_to + load_state_dict.
# # Note that set_multiplier does not work after merging.
# network1.merge_to(text_encoder, unet, sd)

lora_path2 = r"lora2.safetensors"
sd = load_file(lora_path2)
network2, sd = create_network_from_weights(0.7, None, vae, text_encoder, unet, sd)
network2.apply_to(text_encoder, unet)
network2.load_state_dict(sd)
network2.to(device, dtype=torch.float16)

lora_path3 = r"lora3.safetensors"
sd = load_file(lora_path3)
network3, sd = create_network_from_weights(0.5, None, vae, text_encoder, unet, sd)
network3.apply_to(text_encoder, unet)
network3.load_state_dict(sd)
network3.to(device, dtype=torch.float16)

# prompts
prompt = "masterpiece, best quality, 1girl, in white shirt, looking at viewer"
negative_prompt = "bad quality, worst quality, bad anatomy, bad hands"

# exec pipe
print("generating image...")
with torch.autocast("cuda"):
    image = pipe(prompt, guidance_scale=7.5, negative_prompt=negative_prompt).images[0]

# if not merged, you can use set_multiplier
# network1.set_multiplier(0.8)
# and generate image again...

# save image
image.save(r"by_diffusers.png")
kopyl commented 1 year ago

@kohya-ss thanks a lot. Could you please tell me how to install the dependencies which do not conflict with each other?

Especially for these lines:

from networks.lora import LoRAModule, create_network_from_weights
from safetensors.torch import load_file
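(For anyone else hitting this: networks.lora is not a pip package; it lives in the sd-scripts repo itself. A hedged sketch, with a placeholder checkout path:)

```python
# networks.lora comes from the sd-scripts repo, so either run your script from
# inside an sd-scripts checkout or put the checkout on sys.path.
# "/path/to/sd-scripts" is a placeholder; safetensors installs via pip.
import sys

sys.path.append("/path/to/sd-scripts")
try:
    from networks.lora import create_network_from_weights
    from safetensors.torch import load_file
except ImportError as e:
    # networks.lora only resolves if the sd-scripts checkout
    # is actually at the path appended above.
    print(f"dependency not found: {e}")
```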

kopyl commented 1 year ago

@kohya-ss I got it working through trial and error. It's much better now than all my previous attempts, thanks.

But it's still not 100% the same as using LoRA in WebUI... Is there any way to make it match?

It would be much appreciated if you could write a doc showing how to generate the same image with Diffusers as with WebUI :)

kopyl commented 1 year ago

@kohya-ss by the way, is there any way to load a StableDiffusionPipeline from a .ckpt file instead of the Diffusers format?

Without a prior conversion.

I know WebUI does this somehow, but I can't figure out how :(

kopyl commented 1 year ago

@kohya-ss it feels like the prompt does not influence the generation at all :(

kopyl commented 1 year ago

@kohya-ss I just figured out that it's not that the LoRA is applied to Diffusers in a wrong way.

For some reason, Diffusers generates a completely different result than WebUI.

hkunzhe commented 1 year ago

Hi, I wrote a document on how to use LoRA with Diffusers standard pipe. I will publish it soon, but this is a current copy. [...]

Hi, where is the document?

kopyl commented 1 year ago

@hkunzhe does it produce the same results as A1111's WebUI?

cryppadotta commented 2 months ago

Diffusers supports loading Kohya LoRAs now.