implementation for multiple images, image weighting and negative embeds

I finally took the time to implement the features that were missing from the diffusers implementation. I also simplified the code and streamlined the workflow. There is now a single main IPAdapter class (rather than one per model) that takes care of everything.

The core of the execution looks like this:

# ...

reference = Image.open("reference_image.jpg")

# doesn't matter what model you are using, they all use the IPAdapter class
ip_adapter = IPAdapter(pipe, "ipadapter/model/path", "image/encoder/path", device=device)

# exports the text+image embeds
prompt_embeds, negative_prompt_embeds = ip_adapter.get_prompt_embeds(
    reference,
    prompt="positive prompt",
    negative_prompt="blurry,",
)

# use the pipe as usual, passing in the exported embeds
image = pipe(
    prompt_embeds=prompt_embeds,
    negative_prompt_embeds=negative_prompt_embeds,
    num_inference_steps=30,
    guidance_scale=6.0,
    generator=generator,
).images[0]
image.save("image.webp", lossless=True, quality=100)

Sending multiple images is as simple as:

reference1 = Image.open("reference_image_1.jpg")
reference2 = Image.open("reference_image_2.jpg")

prompt_embeds, negative_prompt_embeds = ip_adapter.get_prompt_embeds(
    [reference1, reference2],
    prompt="positive prompt",
    negative_prompt="blurry,",
)

It is also possible to send negative images. This matters because, in my experiments, it can sometimes give better results. Any image can be used as a negative, but plain random noise seems to work best. Here is an example:

[Image comparison: reference (no noise) | gaussian noise | mandelbrot noise]
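A Gaussian-noise negative like the one mentioned above could be generated along these lines. This is only a minimal sketch using the standard library: the helper name `gaussian_noise_rgb`, the 224x224 size, and the mean/std values are assumptions for illustration, not taken from the repository. How the resulting image is passed to `get_prompt_embeds` is not shown in this post, so check the repository examples for that.

```python
import random


def gaussian_noise_rgb(width, height, mean=127.5, std=50.0, seed=None):
    """Return width * height * 3 bytes of clamped Gaussian noise (raw RGB).

    Hypothetical helper: name and default values are illustrative only.
    """
    rng = random.Random(seed)
    values = bytearray()
    for _ in range(width * height * 3):
        # Draw one channel value and clamp it to the valid 0..255 byte range.
        value = int(round(rng.gauss(mean, std)))
        values.append(max(0, min(255, value)))
    return bytes(values)


# 224x224 is an assumed size; use whatever your image encoder expects.
noise_bytes = gaussian_noise_rgb(224, 224, seed=0)

# With Pillow installed, the raw bytes can be wrapped as an image:
#   from PIL import Image
#   noise = Image.frombytes("RGB", (224, 224), noise_bytes)
```

Fixing the seed makes the noise reproducible, which helps when comparing results with and without a negative image.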

And of course you can assign a weight to each image when sending several. I'm sure a better normalization is possible, but the current one kind of works.
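To make the normalization question concrete, here is one possible scheme, sketched on plain Python lists. It is not the code from the repository: the function names `normalize_weights` and `weighted_mean` are hypothetical, and the idea (rescale the weights to sum to 1, then take a weighted mean of the per-image embedding vectors) is just an assumption about what a normalization could look like.

```python
def normalize_weights(weights):
    """Rescale weights so they sum to 1 (hypothetical helper)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [w / total for w in weights]


def weighted_mean(embeds, weights):
    """Combine equal-length embedding vectors using normalized weights.

    Illustration only: the actual IPAdapter class may combine embeds differently.
    """
    if len(embeds) != len(weights):
        raise ValueError("need exactly one weight per embedding")
    norm = normalize_weights(weights)
    dim = len(embeds[0])
    return [sum(w * e[i] for w, e in zip(norm, embeds)) for i in range(dim)]


# Two toy 2-dimensional "embeddings"; the first image counts three times as much.
combined = weighted_mean([[1.0, 0.0], [0.0, 1.0]], [3.0, 1.0])
print(combined)  # [0.75, 0.25]
```

With this scheme only the ratio between weights matters, so `[3.0, 1.0]` and `[0.75, 0.25]` give the same result.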

Please note that I don't have much experience with diffusers and I'm not sure what the best practices are, and the code structure might change in the coming days. Any feedback is welcome.

You can find the MIT-licensed code and plenty of examples here: https://github.com/cubiq/Diffusers_IPAdapter

Let me thank again Tencent AILab for making the IPAdapter models public.
