huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

attention mask for transformer Flux #10025


christopher5106 commented 3 days ago

Describe the bug

Is it possible to get back the `attention_mask` argument in the Flux attention processor,

hidden_states = F.scaled_dot_product_attention(query, key, value, dropout_p=0.0, is_causal=False, attn_mask=attention_mask)

https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention_processor.py#L1910

in order to tweak things a bit? Otherwise the `attention_mask` argument is unused.

Thanks a lot
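For context, here is a minimal standalone sketch (not diffusers' actual processor code) of what passing `attn_mask` to `F.scaled_dot_product_attention` does: a boolean entry of `False` means "do not attend to that key position", so the masked output differs from the unmasked one.

```python
import torch
import torch.nn.functional as F

batch, heads, seq, dim = 1, 2, 4, 8
query = torch.randn(batch, heads, seq, dim)
key = torch.randn(batch, heads, seq, dim)
value = torch.randn(batch, heads, seq, dim)

# Boolean mask, broadcastable to (batch, heads, q_len, kv_len):
# mask out the last key position for every query.
attention_mask = torch.ones(batch, 1, seq, seq, dtype=torch.bool)
attention_mask[..., -1] = False

out_masked = F.scaled_dot_product_attention(
    query, key, value, attn_mask=attention_mask, dropout_p=0.0, is_causal=False
)
out_unmasked = F.scaled_dot_product_attention(
    query, key, value, dropout_p=0.0, is_causal=False
)
# Output keeps the query's shape; masked and unmasked results differ.
print(out_masked.shape)
```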

Reproduction

pip install diffusers

Logs

No response

System Info

Ubuntu

Who can help?

@yiyixuxu @sayakpaul @DN6 @asomoza

sayakpaul commented 3 days ago

Cc: @yiyixuxu

yiyixuxu commented 3 days ago

hey @christopher5106 yes! do you want to open a PR?

rootonchair commented 1 day ago

Hi @yiyixuxu, I am working on this issue and it seems like `attention_mask` is not being used by any of the pipelines. Could you help me find a case where an attention mask is actually built and passed to the attention processor?

sayakpaul commented 1 day ago

Thanks, @rootonchair! That is because the original Flux implementation doesn't use any mask, so the Flux-related pipelines don't use one either. So, if we were to actually use the attention mask in the Flux attention processor, users would have to make sure to construct and pass it accordingly in their own implementations.
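To illustrate what "pass it accordingly" would involve, here is a hypothetical helper (not part of diffusers) that converts a per-token padding mask (1 = real token, 0 = padding) into the additive float mask shape that `scaled_dot_product_attention` accepts:

```python
import torch

def expand_padding_mask(padding_mask: torch.Tensor, dtype=torch.float32) -> torch.Tensor:
    """Hypothetical helper: (batch, kv_len) 0/1 mask -> additive SDPA mask."""
    # Reshape to (batch, 1, 1, kv_len) so it broadcasts over heads and queries.
    mask = padding_mask[:, None, None, :].to(dtype)
    # 0.0 where attending is allowed, a large negative value where it is masked.
    return (1.0 - mask) * torch.finfo(dtype).min

padding = torch.tensor([[1, 1, 1, 0]])  # last token is padding
additive = expand_padding_mask(padding)
print(additive.shape)  # (1, 1, 1, 4)
```

A user would build such a mask from their tokenizer's attention mask and hand it through to the processor; the exact plumbing depends on how the pipeline forwards kwargs to the transformer.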