RoyiRa / prompt-to-prompt-with-sdxl

An implementation of the Prompt-to-Prompt paper for the SDXL architecture

attention visualization #3

Open betterze opened 9 months ago

betterze commented 9 months ago

Dear RoyiRa,

Thank you for your great implementation. I really like it.

In the original p2p notebook, there is a function to visualize the attention per token. Is there a way to visualize the attention per token in SDXL in this notebook?

Thank you for your help.

Best Wishes, Zongze

betterze commented 9 months ago

I see the _aggregate_and_get_attention_maps_per_token function, could you tell me how to use it? thx

zc1023 commented 5 months ago

I see the _aggregate_and_get_attention_maps_per_token function, could you tell me how to use it? thx

I would also like to know how to visualize this. If you figure it out, please tell me. Thanks!

RoyiRa commented 5 months ago

What does _aggregate_and_get_attention_maps_per_token do? The function goes over the cross-attention maps of the UNet (you can change which blocks it takes the attention from, but by default it uses all three: down, mid, and up).

Then it averages the cross-attention maps, which results in a single cross-attention tensor of shape (D, D, 77), where 77 is the number of tokens. Sometimes I apply a softmax to normalize the maps.

As for visualization: I can send you my personal visualization code if that helps. I'm at work at the moment, but will do so today or tomorrow.
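In the meantime, here is a minimal sketch of what one could do with that aggregated (D, D, 77) tensor. Only the output shape described above is taken from the pipeline; the call to _aggregate_and_get_attention_maps_per_token in the usage comment is an assumption about how the helper is exposed, not its documented signature.

```python
import torch
import matplotlib.pyplot as plt


def plot_token_attention(attn_maps: torch.Tensor, tokens: list[str]) -> None:
    """Plot one cross-attention heatmap per prompt token.

    attn_maps: aggregated cross-attention of shape (D, D, 77); only the
               first len(tokens) maps are shown.
    tokens:    decoded prompt tokens used as titles for the maps.
    """
    n = len(tokens)
    fig, axes = plt.subplots(1, n, figsize=(3 * n, 3), squeeze=False)
    for i, tok in enumerate(tokens):
        m = attn_maps[:, :, i].float().cpu()
        m = (m - m.min()) / (m.max() - m.min() + 1e-8)  # rescale to [0, 1] for display
        axes[0][i].imshow(m.numpy(), cmap="viridis")
        axes[0][i].set_title(tok)
        axes[0][i].axis("off")
    plt.tight_layout()
    plt.show()


# Hypothetical usage (the exact pipeline call may differ):
# attn_maps = pipe._aggregate_and_get_attention_maps_per_token()
# plot_token_attention(attn_maps, ["a", "cat", "riding", "a", "bike"])
```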


zc1023 commented 5 months ago

Thank you!

zc1023 commented 5 months ago

I achieved it through the show_cross_attention function in p2p_stable.ipynb.
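For anyone else landing here, a rough standalone sketch of the same overlay idea (resizing one token's map to the image resolution and blending it over the generated image). This is not the show_cross_attention implementation from that notebook; image and attn_maps are placeholder inputs.

```python
import numpy as np
import torch
import torch.nn.functional as F
import matplotlib.pyplot as plt


def overlay_token_attention(image: np.ndarray, attn_maps: torch.Tensor,
                            token_idx: int, alpha: float = 0.6) -> None:
    """Blend one token's cross-attention map over the generated image.

    image:     generated image as an (H, W, 3) uint8 array.
    attn_maps: aggregated cross-attention of shape (D, D, 77).
    token_idx: index of the prompt token to visualize.
    """
    h, w = image.shape[:2]
    m = attn_maps[:, :, token_idx].float()[None, None]                    # (1, 1, D, D)
    m = F.interpolate(m, size=(h, w), mode="bilinear", align_corners=False)[0, 0]
    m = (m - m.min()) / (m.max() - m.min() + 1e-8)                        # rescale to [0, 1]
    heat = plt.get_cmap("jet")(m.cpu().numpy())[..., :3]                  # RGB heatmap in [0, 1]
    blended = (1 - alpha) * image.astype(np.float32) / 255.0 + alpha * heat
    plt.imshow(blended)
    plt.axis("off")
    plt.show()
```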