tencent-ailab / IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt.
Apache License 2.0

Got very bad result with LCM-lora #144

Closed blx0102 closed 11 months ago

blx0102 commented 11 months ago

Hi! I tried to use IP-Adapter with LCM-LoRA as below; the code is borrowed from https://huggingface.co/latent-consistency/lcm-lora-sdv1-5:

import torch
from PIL import Image
from diffusers import AutoPipelineForText2Image, LCMScheduler
from ip_adapter import IPAdapterPlus

model_id = "models/flat2DAnimerge"
adapter_id = "models/loras/pytorch_lora_weights.safetensors"
image_encoder_path = "./models/image_encoder"
ip_ckpt_plus_face = "./models/ip-adapter-plus-face_sd15.bin"

pipe = AutoPipelineForText2Image.from_pretrained(model_id, torch_dtype=torch.float16, variant="fp16", safety_checker=None)
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.to("cuda")

# load and fuse lcm lora
pipe.load_lora_weights(adapter_id)
pipe.fuse_lora()
ip_model_plus = IPAdapterPlus(pipe, image_encoder_path, ip_ckpt_plus_face, "cuda", num_tokens=16)

prompt = "Self-portrait oil painting, a beautiful cyborg with golden hair, 8k"

# disable guidance_scale by passing 0
img = Image.open("face.png")
image = img.resize((256, 256))
image = ip_model_plus.generate(pil_image=image, num_inference_steps=4, prompt=prompt, scale=0.5)[0]
image.save("1.png") 

But the result is very bad: (attached image omitted)

Below is the result of text-to-image without IP-Adapter when using LCM-LoRA: (attached image omitted)

xiaohu2015 commented 11 months ago

@blx0102 did you also test ip-adapter_sd15? You could also try increasing the inference steps.

blx0102 commented 11 months ago

@xiaohu2015 Yes, I've tried ip-adapter_sd15, ip-adapter-plus_sd15, and ip-adapter-plus-face_sd15; they all output images like the one above. Also, increasing the inference steps didn't help.

xiaohu2015 commented 11 months ago

IP-Adapter seems to work with LCM-LoRA in https://github.com/huggingface/diffusers/pull/5713; maybe I should run some tests.
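For reference, the PR linked above adds native IP-Adapter loading to diffusers, so the LCM-LoRA combination can be set up without the `IPAdapterPlus` wrapper. The sketch below shows that route; the hub repo id, subfolder, and weight filename are assumptions based on the public `h94/IP-Adapter` weights, not something stated in this thread:

```python
def build_lcm_ip_adapter_pipe(model_id="runwayml/stable-diffusion-v1-5"):
    """Sketch: diffusers-native IP-Adapter + LCM-LoRA setup (per the PR above).

    Imports are local so this file can be defined without diffusers installed;
    the hub ids used here are illustrative assumptions.
    """
    import torch
    from diffusers import AutoPipelineForText2Image, LCMScheduler

    pipe = AutoPipelineForText2Image.from_pretrained(
        model_id, torch_dtype=torch.float16
    )
    # Swap in the LCM scheduler, as in the original snippet
    pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
    # Native IP-Adapter loading added in the linked diffusers PR
    pipe.load_ip_adapter(
        "h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin"
    )
    # Load and fuse the LCM-LoRA weights
    pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")
    pipe.fuse_lora()
    return pipe.to("cuda")
```

With this setup the reference image is passed as `ip_adapter_image=` to the pipeline call instead of going through a wrapper class.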

blx0102 commented 11 months ago

@xiaohu2015 Thanks! I will upgrade my diffusers' version and try that code.

maximepeabody commented 11 months ago

Make sure your guidance scale is set to < 2.0 (it works well at 1.0). If it is left at the default of 7.5 or 8, it will lead to bad results.

blx0102 commented 11 months ago

> Make sure your guidance scale is set to < 2.0 (it works well at 1.0). If it is left at the default of 7.5 or 8, it will lead to bad results.

@maximepeabody Yeah, that's the point! This perfectly solved this problem.
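To make the resolution concrete: the original snippet never passed `guidance_scale` to `generate`, so the IP-Adapter default CFG scale was used, while LCM-LoRA is distilled for low guidance. A minimal sketch of the corrected call, assuming `ip_model_plus`, `image`, and `prompt` as defined in the first snippet (the helper name here is hypothetical):

```python
def generate_with_lcm(ip_model, pil_image, prompt, guidance_scale=1.0):
    """Sketch: IP-Adapter + LCM-LoRA generation with an explicit low CFG scale.

    LCM-LoRA expects guidance_scale < 2.0 (1.0 works well per this thread);
    relying on the default produces the broken results shown above.
    `ip_model` is an IPAdapterPlus instance as built in the first snippet.
    """
    assert guidance_scale < 2.0, "LCM-LoRA expects guidance_scale < 2.0"
    return ip_model.generate(
        pil_image=pil_image,
        prompt=prompt,
        scale=0.5,               # image-prompt strength
        num_inference_steps=4,   # LCM needs only a few steps
        guidance_scale=guidance_scale,
    )[0]
```

Usage would be `generate_with_lcm(ip_model_plus, image, prompt).save("1.png")`.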