tencent-ailab / IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.
Apache License 2.0
4.46k stars 289 forks

Using Prompt_Embeds in IPAdapter Generation? #381

Open NathiasPhaniel opened 3 weeks ago

NathiasPhaniel commented 3 weeks ago

Hey, this is my first post. I wanted to ask about how one implements prompt weighting within the architecture.

This is the base generation code, which works:

    image = ip_model.generate(
        prompt=prompt,
        negative_prompt=total_negative_prompt,
        faceid_embeds=average_embedding,
        scale=likeness_strength,
        width=864,
        height=1152,
        guidance_scale=face_strength,
        num_inference_steps=30,
        num_samples=4
    )

However, how does one implement prompt weighting? I attempted to use Compel prompt weighting, to no avail.

    compel_proc = Compel(
        tokenizer=[pipe.tokenizer, pipe.tokenizer_2],
        text_encoder=[pipe.text_encoder, pipe.text_encoder_2],
        returned_embeddings_type=ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED,
        requires_pooled=[False, True]
    )
    total_negative_prompt = negative_prompt

    conditioning, pooled = compel_proc(prompt)
    neg_conditioning, neg_pooled = compel_proc(total_negative_prompt)

    print("Generating SDXL")
    image = ip_model.generate(
        prompt_embeds=conditioning,
        negative_prompt_embeds=neg_conditioning,
        pooled_prompt_embeds=pooled,
        negative_pooled_prompt_embeds=neg_pooled,
        faceid_embeds=average_embedding,
        scale=likeness_strength,
        width=864,
        height=1152,
        guidance_scale=face_strength,
        num_inference_steps=20,
        num_samples=1
    )

Yields:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.10/dist-packages/gradio/queueing.py", line 521, in process_events
        response = await route_utils.call_process_api(
      File "/usr/local/lib/python3.10/dist-packages/gradio/route_utils.py", line 276, in call_process_api
        output = await app.get_blocks().process_api(
      File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1935, in process_api
        result = await self.call_function(
      File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1513, in call_function
        prediction = await anyio.to_thread.run_sync(
      File "/usr/local/lib/python3.10/dist-packages/anyio/to_thread.py", line 33, in run_sync
        return await get_asynclib().run_sync_in_worker_thread(
      File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
        return await future
      File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 807, in run
        result = context.run(func, *args)
      File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 832, in wrapper
        response = f(*args, **kwargs)
      File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 832, in wrapper
        response = f(*args, **kwargs)
      File "/content/h94-IP-Adapter-FaceID-SDXL/app.py", line 89, in generate_image
        image = ip_model.generate(
      File "/content/h94-IP-Adapter-FaceID-SDXL/ipown.py", line 458, in generate
        images = self.pipe(
    TypeError: StableDiffusionXLPipeline {
      "_class_name": "StableDiffusionXLPipeline",
      "_diffusers_version": "0.28.2",
      "feature_extractor": [null, null],
      "force_zeros_for_empty_prompt": true,
      "image_encoder": [null, null],
      "scheduler": ["diffusers", "DDIMScheduler"],
      "text_encoder": ["transformers", "CLIPTextModel"],
      "text_encoder_2": ["transformers", "CLIPTextModelWithProjection"],
      "tokenizer": ["transformers", "CLIPTokenizer"],
      "tokenizer_2": ["transformers", "CLIPTokenizer"],
      "unet": ["diffusers", "UNet2DConditionModel"],
      "vae": ["diffusers", "AutoencoderKL"]
    } got multiple values for keyword argument 'prompt_embeds'

Thus, I cannot use this method to generate IP Adapter images with prompt weighting. What will get this working? (I also took a look in ipown.py, and the generate function does not seem to take prompt_embeds and the like as arguments.)

Thank you for your time and understanding.
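The TypeError in the traceback is what Python raises when one call receives the same keyword argument twice. Below is a minimal sketch of how that could happen here. It assumes, judging from the traceback only and not from a full reading of ipown.py, that generate() builds its own prompt_embeds and also forwards any extra keyword arguments to the pipeline. The names pipe, generate, and generate_with_override are stand-ins for illustration, not the actual IP-Adapter code.

```python
def pipe(prompt=None, prompt_embeds=None, **kwargs):
    """Stand-in for the diffusers pipeline call; it only echoes its inputs."""
    return {"prompt": prompt, "prompt_embeds": prompt_embeds, **kwargs}


def generate(prompt=None, **kwargs):
    """Hypothetical wrapper: encodes the prompt itself, then forwards kwargs."""
    own_embeds = f"encoded({prompt})"
    # If the caller also put prompt_embeds in kwargs, it is passed twice here.
    return pipe(prompt_embeds=own_embeds, **kwargs)


def generate_with_override(prompt=None, **kwargs):
    """Hypothetical variant: caller-supplied embeddings take precedence."""
    # Pop the key so it reaches pipe() only once.
    caller_embeds = kwargs.pop("prompt_embeds", None)
    embeds = caller_embeds if caller_embeds is not None else f"encoded({prompt})"
    return pipe(prompt_embeds=embeds, **kwargs)


# Reproduce the collision with a placeholder value instead of a real tensor.
try:
    generate(prompt_embeds="compel_embeds")
    collision = None
except TypeError as exc:
    collision = str(exc)

# The override variant accepts the same argument without raising.
result = generate_with_override(prompt_embeds="compel_embeds", width=864)
```

This only illustrates the argument collision. A real wrapper would presumably also have to merge the image-prompt tokens into any embeddings supplied by the caller, which this sketch does not attempt.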

xiaohu2015 commented 1 week ago

IP-Adapter (IPA) has been implemented in diffusers. Maybe you can try it with diffusers?
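A rough sketch of what that suggestion might look like: Compel-weighted embeddings passed straight to a diffusers SDXL pipeline with its built-in IP-Adapter loaded. Several things here are assumptions and not from this thread: the base model ID, the plain (non-FaceID) `ip-adapter_sdxl.bin` weights, the adapter scale of 0.6, and the helper name `generate_with_weighted_prompt`. A FaceID checkpoint takes face embeddings instead of an image and would need different loading and inputs.

```python
def generate_with_weighted_prompt(face_image, prompt, negative_prompt=""):
    """Sketch: Compel prompt weighting with diffusers' built-in IP-Adapter (SDXL)."""
    # Imports are local so that defining this helper does not need the GPU stack.
    import torch
    from compel import Compel, ReturnedEmbeddingsType
    from diffusers import StableDiffusionXLPipeline

    # Assumed base model; substitute whatever checkpoint you already use.
    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to("cuda")

    # Assumed adapter weights: the plain SDXL IP-Adapter, not the FaceID variant.
    pipe.load_ip_adapter(
        "h94/IP-Adapter", subfolder="sdxl_models", weight_name="ip-adapter_sdxl.bin"
    )
    pipe.set_ip_adapter_scale(0.6)  # illustrative value

    # Same Compel setup as in the question.
    compel_proc = Compel(
        tokenizer=[pipe.tokenizer, pipe.tokenizer_2],
        text_encoder=[pipe.text_encoder, pipe.text_encoder_2],
        returned_embeddings_type=ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED,
        requires_pooled=[False, True],
    )
    conditioning, pooled = compel_proc(prompt)
    neg_conditioning, neg_pooled = compel_proc(negative_prompt)

    # Positive and negative embeddings must have the same sequence length.
    conditioning, neg_conditioning = compel_proc.pad_conditioning_tensors_to_same_length(
        [conditioning, neg_conditioning]
    )

    # The pipeline takes the text embeddings directly; the image prompt is a
    # separate argument, so nothing is passed twice.
    return pipe(
        prompt_embeds=conditioning,
        negative_prompt_embeds=neg_conditioning,
        pooled_prompt_embeds=pooled,
        negative_pooled_prompt_embeds=neg_pooled,
        ip_adapter_image=face_image,
        width=864,
        height=1152,
        num_inference_steps=30,
    ).images[0]
```

Calling it would look like `generate_with_weighted_prompt(some_pil_image, "a portrait, (sharp focus)1.3")`, where the `(...)1.3` weighting syntax is Compel's. This has not been checked against the FaceID space from the question.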