wooyeolBaek / attention-map

🚀 Cross attention map tools for huggingface/diffusers
https://huggingface.co/spaces/We-Want-GPU/diffusers-cross-attention-map-SDXL-t2i
MIT License

Attention map for DiT-based model #11

Open xduzhangjiayu opened 1 week ago

xduzhangjiayu commented 1 week ago

Hi, thanks for the excellent work! Will this project support attention map visualization for DiT-based models (SD3, FLUX...)?

wooyeolBaek commented 5 days ago

@xduzhangjiayu It doesn't support DiT-based models right now. I am planning to add this in the next update, but due to time constraints I don't think I'll be able to do it for a while.

If you need it urgently, I recommend changing the AttnProcessor to AttnProcessor2_0, and then redefining the call method of the DiT module that uses the AttnProcessor so that your version overrides the original.

xduzhangjiayu commented 5 days ago

@wooyeolBaek Thanks for the reply, I will try it. I also look forward to your update!

xduzhangjiayu commented 5 days ago

Hi, do you have any advice on this? For a U-Net based model, we can use Q (image) * K (text) to get the attention score, but for a DiT-based model, the image and the text both have Q, K, and V, so I'm confused about which product to use. Any suggestions would be appreciated, thanks!

wooyeolBaek commented 5 days ago

@xduzhangjiayu As far as I know, since the image and text hidden states are concatenated for a single attention operation, the resulting attention matrix can be viewed as performing self-attention and cross-attention simultaneously. With the image tokens placed first, the upper-left (image-to-image) and bottom-right (text-to-text) parts of the matrix represent self-attention, while the upper-right and bottom-left parts represent cross-attention. To obtain the attention map as used in Stable Diffusion 1, you can extract the upper-right block, where the image is used as the query and the text as the key.
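
The block extraction described above can be illustrated with a minimal, dependency-free sketch. It is not the code used in this repository: the function names (`joint_attention`, `image_to_text_block`, `token_heatmap`), the toy sizes, and the random inputs are all hypothetical, and it assumes a single head with image tokens concatenated before text tokens. If a model concatenates the text tokens first, the same image-query/text-key block sits in the lower-left instead.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def joint_attention(q_img, k_img, q_txt, k_txt):
    """Attention weights over the concatenated [image, text] sequence.

    Assumption: image tokens come first, then text tokens (single head).
    Returns an (n_img + n_txt) x (n_img + n_txt) matrix as nested lists.
    """
    queries = q_img + q_txt  # list concatenation along the token axis
    keys = k_img + k_txt
    scale = 1.0 / math.sqrt(len(queries[0]))
    return [
        softmax([scale * sum(a * b for a, b in zip(q, k)) for k in keys])
        for q in queries
    ]


def image_to_text_block(attn, n_img):
    """Upper-right block: rows are image queries, columns are text keys."""
    return [row[n_img:] for row in attn[:n_img]]


def token_heatmap(cross, token_index, height, width):
    """Reshape one text token's column into a height x width grid."""
    column = [row[token_index] for row in cross]
    return [column[r * width:(r + 1) * width] for r in range(height)]


def rand_tokens(n, dim):
    """Hypothetical stand-in for projected query/key vectors."""
    return [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


random.seed(0)
N_IMG, N_TXT, DIM = 4, 3, 8  # toy sizes: a 2x2 latent grid and 3 text tokens

attn = joint_attention(
    rand_tokens(N_IMG, DIM), rand_tokens(N_IMG, DIM),
    rand_tokens(N_TXT, DIM), rand_tokens(N_TXT, DIM),
)
cross = image_to_text_block(attn, N_IMG)          # shape: N_IMG x N_TXT
heat = token_heatmap(cross, token_index=0, height=2, width=2)
```

Because the softmax is taken over the full concatenated key axis, each row of the extracted block sums to less than 1; whether to renormalize over the text tokens only is a visualization choice.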

xduzhangjiayu commented 2 days ago

@wooyeolBaek Many thanks for the advice. I have tried your method and it seems to work!

wooyeolBaek commented 1 day ago

@xduzhangjiayu I'm glad it worked well! I've added SD3 compatibility and support for batch operations, and I've refactored the code so it is more intuitive to apply, so feel free to refer to it if needed.

xduzhangjiayu commented 23 hours ago

@wooyeolBaek Thanks for letting me know, and for this awesome project!