kopyl opened 1 year ago
The sd-scripts repo has a script to generate images called gen_img_diffusers.py: https://github.com/kohya-ss/sd-scripts
Unfortunately the documentation for the script is in Japanese and outdated, but I think it will help (please ignore the installation process in the document): https://note.com/kohya_ss/n/n2693183a798e
@kohya-ss I'm unable to get the same result with your LoRA script as in WebUI. Same settings, different results.
Plus I need some kind of pipeline, so that I don't reload everything each time I generate an image, as the script does.
There is no info on your LoRA at https://note.com/kohya_ss/n/n2693183a798e :(
Unfortunately there is no pipeline for LoRA currently...
Sorry for the lack of documentation. The command line with LoRA looks like the following.
single LoRA:
python gen_img_diffusers.py --ckpt name_to_ckpt.safetensors --n_iter 1 --scale 8 --steps 40 --outdir txt2img/samples --xformers --W 512 --H 512 --fp16 --sampler k_euler_a --network_module networks.lora --network_weights lora1.safetensors --network_mul 1.0 --max_embeddings_multiples 3 --clip_skip 2 --batch_size 1 --images_per_prompt 1 --prompt "beautiful scene --n negative prompt"
two LoRAs:
python gen_img_diffusers.py --ckpt name_to_ckpt.safetensors --n_iter 1 --scale 8 --steps 40 --outdir txt2img/samples --xformers --W 512 --H 512 --fp16 --sampler k_euler_a --network_module networks.lora networks.lora --network_weights lora1.safetensors lora2.safetensors --network_mul 1.0 0.8 --max_embeddings_multiples 3 --clip_skip 2 --batch_size 1 --images_per_prompt 1 --prompt "beautiful scene --n negative prompt"
@kohya-ss no problem, I was able to run it with LoRA. Thanks for your answers :)
I tried it, but the result is not the same as in WebUI :(
Currently I'm using the WebUI API to generate images programmatically, but I want to get rid of this crap in favor of a more stable solution like Diffusers...
@kohya-ss hey :)
Sorry for the delay. I'm trying to figure out how to easily infer with the standard pipeline of Diffusers. Please wait a while.
@kohya-ss no problem, thank you very much for the update, you're the best ❤
@kohya-ss or maybe not so standard... Something which does not require a heavy AUTOMATIC1111 to do the inference.
I was trying to reverse engineer this, but had no luck https://github.com/replicate/lora-inference :(
Hi, I wrote a document on how to use LoRA with Diffusers standard pipe. I will publish it soon, but this is a current copy. I think you can use any arguments for the pipe, and Long Prompt Weighting Stable Diffusion for the weighting.
I hope this helps you.
import torch
from diffusers import StableDiffusionPipeline
from networks.lora import LoRAModule, create_network_from_weights
from safetensors.torch import load_file
# if the ckpt is CompVis based, convert it to Diffusers beforehand with tools/convert_diffusers20_original_sd.py. See --help for more details.
model_id_or_dir = r"model_id_on_hugging_face_or_dir"
device = "cuda"
# create pipe
print(f"creating pipe from {model_id_or_dir}...")
pipe = StableDiffusionPipeline.from_pretrained(model_id_or_dir, revision="fp16", torch_dtype=torch.float16)
pipe = pipe.to(device)
vae = pipe.vae
text_encoder = pipe.text_encoder
unet = pipe.unet
# load lora networks
print("loading lora networks...")
lora_path1 = r"lora1.safetensors"
sd = load_file(lora_path1) # If the file is .ckpt, use torch.load instead.
network1, sd = create_network_from_weights(0.5, None, vae, text_encoder, unet, sd)
network1.apply_to(text_encoder, unet)
network1.load_state_dict(sd)
network1.to(device, dtype=torch.float16)
# You can merge weights instead of apply_to + load_state_dict;
# note that set_multiplier does not work after merging:
# network1.merge_to(text_encoder, unet, sd)
lora_path2 = r"lora2.safetensors"
sd = load_file(lora_path2)
network2, sd = create_network_from_weights(0.7, None, vae, text_encoder, unet, sd)
network2.apply_to(text_encoder, unet)
network2.load_state_dict(sd)
network2.to(device, dtype=torch.float16)
lora_path3 = r"lora3.safetensors"
sd = load_file(lora_path3)
network3, sd = create_network_from_weights(0.5, None, vae, text_encoder, unet, sd)
network3.apply_to(text_encoder, unet)
network3.load_state_dict(sd)
network3.to(device, dtype=torch.float16)
# prompts
prompt = "masterpiece, best quality, 1girl, in white shirt, looking at viewer"
negative_prompt = "bad quality, worst quality, bad anatomy, bad hands"
# exec pipe
print("generating image...")
with torch.autocast("cuda"):
    image = pipe(prompt, guidance_scale=7.5, negative_prompt=negative_prompt).images[0]
# if not merged, you can use set_multiplier
# network1.set_multiplier(0.8)
# and generate image again...
# save image
image.save(r"by_diffusers.png")
@kohya-ss thanks a lot. Could you please tell me how to install the dependencies so that they don't conflict with each other?
Especially for these lines:
from networks.lora import LoRAModule, create_network_from_weights
from safetensors.torch import load_file
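Not answered directly in the thread, but one common setup (an assumption on my part, not kohya-ss's confirmed instructions): `networks.lora` lives inside the sd-scripts repo itself rather than on PyPI, so cloning the repo and putting its root on `sys.path` makes the import resolve, while `safetensors` and `diffusers` install normally via the repo's requirements.txt. A minimal sketch, with a hypothetical clone location:

```python
import sys
from pathlib import Path

# Hypothetical location of a local clone of
# https://github.com/kohya-ss/sd-scripts; `networks.lora` is a module
# inside that repo, not a pip-installable package.
SD_SCRIPTS_ROOT = Path.home() / "sd-scripts"

# Putting the repo root first on sys.path lets
# `from networks.lora import ...` resolve.
sys.path.insert(0, str(SD_SCRIPTS_ROOT))
```

After this, the imports in the snippet above should work as long as `diffusers` and `safetensors` themselves are pip-installed.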
@kohya-ss I tried it with trial and error. It's muuuuch better now than all my previous attempts, thanks.
But it's still not 100% the same as using LoRA in WebUI... Any way to make it work the same?
It would be much appreciated if you could make a doc where you generate the same image with Diffusers as with WebUI :)
@kohya-ss by the way, is there any way to load StableDiffusionPipeline with a .ckpt instead of the Diffusers format?
Without a prior conversion.
I know WebUI somehow does it, but I can't figure out the way :(
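This isn't resolved in the thread, but later diffusers releases added a single-file loader that reads CompVis-format checkpoints directly. A hedged sketch (the checkpoint path is hypothetical, and `from_single_file` requires a sufficiently recent diffusers version):

```python
import os

CKPT_PATH = "name_to_ckpt.safetensors"  # hypothetical checkpoint path

def load_pipe_from_ckpt(ckpt_path):
    # from_single_file reads a CompVis-format .ckpt/.safetensors directly,
    # skipping the convert_diffusers20_original_sd.py conversion step.
    # Import is kept inside the function so the sketch loads even
    # without diffusers installed.
    from diffusers import StableDiffusionPipeline
    return StableDiffusionPipeline.from_single_file(ckpt_path)

if os.path.exists(CKPT_PATH):
    pipe = load_pipe_from_ckpt(CKPT_PATH)
else:
    print("checkpoint not found, skipping load")
```

On older diffusers versions, the prior conversion with the sd-scripts tools mentioned above remains the way to go.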
@kohya-ss it feels like the prompt does not influence the generation at all :(
@kohya-ss I just figured out that it's not that the LoRA is applied to Diffusers in a wrong way.
For some reason Diffusers generates a completely different result than WebUI.
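The thread never pins this down, but the usual suspects when Diffusers and WebUI diverge are the scheduler (the `k_euler_a` sampler used in the command lines above vs. the pipeline's default), clip skip, and seed/noise handling. A sketch under the assumption that diffusers' `EulerAncestralDiscreteScheduler` is the closest match to WebUI's "Euler a"; the function name is mine:

```python
def make_webui_like(pipe, seed=1234):
    # Swap the pipeline's default scheduler for the ancestral Euler
    # sampler ("Euler a" in WebUI) to remove one source of divergence.
    # Seed handling still differs between the two tools, so results get
    # closer but may not be bit-identical. Imports are kept inside the
    # function so the sketch loads without torch/diffusers installed.
    import torch
    from diffusers import EulerAncestralDiscreteScheduler

    pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(
        pipe.scheduler.config
    )
    generator = torch.Generator(device="cpu").manual_seed(seed)
    return pipe, generator
```

The returned generator would then be passed to the pipe call (`pipe(prompt, generator=generator, ...)`) along with the same step count and guidance scale used in WebUI.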
> Hi, I wrote a document on how to use LoRA with Diffusers standard pipe. I will publish it soon, but this is a current copy. [...]
Hi, where is the document?
@hkunzhe does it make the same results as a1111's WebUI?
Or is there any other tool to generate images programmatically in Python, without installing WebUI?