huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

Support out_dim argument for Attention block #7877

Open tigerlittle1 opened 4 months ago

tigerlittle1 commented 4 months ago

Is your feature request related to a problem? Please describe. When I pass the out_dim argument to __init__ of the Attention block, a shape error is raised because query_dim != out_dim. In this case, the following code tries to keep the input channel count of hidden_states:

https://github.com/huggingface/diffusers/blob/b69fd990ad8026f21893499ab396d969b62bb8cc/src/diffusers/models/attention_processor.py#L1393 But it should instead follow the channel count produced by hidden_states = attn.to_out[0](hidden_states). A minimal reproduction is sketched below.
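A minimal sketch to reproduce (the exact dims are illustrative; any out_dim != query_dim should trigger it):

```python
import torch
from diffusers.models.attention_processor import Attention

# out_dim (640) deliberately differs from query_dim (320); dims are illustrative
attn = Attention(query_dim=320, heads=8, dim_head=80, out_dim=640)

# 4D input, as it arrives from a UNet block: (batch, channel, height, width)
hidden_states = torch.randn(1, 320, 16, 16)

# RuntimeError: the final reshape hard-codes the *input* channel (320),
# but to_out[0] has already projected the features to out_dim (640)
out = attn(hidden_states)
```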

Describe the solution you'd like. I suggest changing https://github.com/huggingface/diffusers/blob/b69fd990ad8026f21893499ab396d969b62bb8cc/src/diffusers/models/attention_processor.py#L1393 to hidden_states = hidden_states.transpose(-1, -2).reshape(batch_size, -1, height, width), so the reshape respects the actual channel count of hidden_states (sketched below). Maybe I will make a PR later.
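Sketch of the proposed change inside AttnProcessor2_0.__call__ (surrounding context paraphrased from the linked file, not a full implementation):

```python
hidden_states = attn.to_out[0](hidden_states)  # linear proj: inner_dim -> out_dim
hidden_states = attn.to_out[1](hidden_states)  # dropout

if input_ndim == 4:
    # before: .reshape(batch_size, channel, height, width) -- breaks when
    #         out_dim != channel, since `channel` was captured from the input
    # after:  let reshape infer the channel dim from the projected tensor
    hidden_states = hidden_states.transpose(-1, -2).reshape(batch_size, -1, height, width)
```

Since reshape only needs to restore the spatial layout, inferring the channel dim with -1 keeps the existing behavior when out_dim is unset and fixes the mismatch when it is.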

Describe alternatives you've considered. None.

Additional context. None.

github-actions[bot] commented 5 days ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.