huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

Instruct-pix2pix pipeline: add ability to pass `cross_attention_kwargs` in call method #7962

Closed. AlexeyZhuravlev closed this issue 6 months ago

AlexeyZhuravlev commented 6 months ago

Is your feature request related to a problem? Please describe. I have a use case where I need to use a custom adapter together with the instruct-pix2pix model, which requires passing custom cross_attention_kwargs into the UNet. Being able to pass this parameter would enable other use cases as well, such as LoRAs. Basically, any use case where this parameter is used with StableDiffusionPipeline should also be relevant for StableDiffusionInstructPix2PixPipeline.

Describe the solution you'd like. Add a new cross_attention_kwargs parameter to the __call__ method of StableDiffusionInstructPix2PixPipeline, and pass this parameter through to the unet call inside the method.
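
To illustrate the shape of the proposed change, here is a minimal sketch of the forwarding pattern. It does not use diffusers: ToyUNet and ToyPix2PixPipeline are hypothetical stand-ins invented for this example, the latents are a plain float, and the "scale" key is only an illustrative kwarg. The point is just that __call__ accepts an optional cross_attention_kwargs dict and hands it unchanged to every unet call in the denoising loop.

```python
from typing import Any, Dict, List, Optional


class ToyUNet:
    """Hypothetical stand-in for the UNet; it records the kwargs it receives."""

    def __init__(self) -> None:
        self.seen_kwargs: List[Optional[Dict[str, Any]]] = []

    def __call__(
        self,
        latents: float,
        timestep: int,
        cross_attention_kwargs: Optional[Dict[str, Any]] = None,
    ) -> float:
        self.seen_kwargs.append(cross_attention_kwargs)
        # Illustrative only: treat "scale" as a multiplier on the prediction.
        scale = (cross_attention_kwargs or {}).get("scale", 1.0)
        return latents * scale


class ToyPix2PixPipeline:
    """Hypothetical stand-in for the pipeline, reduced to the forwarding logic."""

    def __init__(self, unet: ToyUNet) -> None:
        self.unet = unet

    def __call__(
        self,
        latents: float,
        num_inference_steps: int = 3,
        cross_attention_kwargs: Optional[Dict[str, Any]] = None,  # new parameter
    ) -> float:
        for t in reversed(range(num_inference_steps)):
            # Forward the caller's kwargs to the UNet on every step.
            noise_pred = self.unet(
                latents, t, cross_attention_kwargs=cross_attention_kwargs
            )
            latents = latents - 0.1 * noise_pred
        return latents


pipe = ToyPix2PixPipeline(ToyUNet())
result = pipe(1.0, num_inference_steps=2, cross_attention_kwargs={"scale": 0.5})
```

Defaulting the parameter to None keeps existing callers working, since a call without the argument behaves as before.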

Describe alternatives you've considered. Other hacky solutions, such as a context manager that manipulates UNet tensors, might work, but they would overcomplicate the code. It is also hard to make such solutions work properly with torch.compile.

AlexeyZhuravlev commented 6 months ago

I opened a PR that solves this issue: https://github.com/huggingface/diffusers/pull/7961

yiyixuxu commented 6 months ago

Fixed with #7961!