google / prompt-to-prompt

Apache License 2.0

Question about the influence of softmax function on the issue of attention map swapping. #58

Open PeiqinZhuang opened 1 year ago

PeiqinZhuang commented 1 year ago

Hi, I found that attention map swapping is performed after the softmax operation. In that case, the swapped attention weights in a row may no longer sum to 1. I wonder if the authors have tried performing attention map swapping before the softmax operation.
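To make the concern concrete, here is a minimal sketch (not the repo's actual code; shapes and the column mask are hypothetical) showing when post-softmax swapping preserves row normalization and when it does not:

```python
import torch

torch.manual_seed(0)

# Two hypothetical cross-attention logit maps: same queries (pixels),
# key tokens from the source and target prompts respectively.
n_pixels, n_tokens = 4, 6
logits_src = torch.randn(n_pixels, n_tokens)
logits_tgt = torch.randn(n_pixels, n_tokens)

attn_src = logits_src.softmax(dim=-1)
attn_tgt = logits_tgt.softmax(dim=-1)

# Replacing a whole map after softmax keeps rows normalized,
# because each map is itself a softmax output:
print(attn_src.sum(dim=-1))  # each row sums to ~1.0

# But mixing columns from the two maps (e.g. keeping some token
# columns from the source and the rest from the target) breaks it:
mask = torch.tensor([1, 1, 0, 0, 1, 0], dtype=torch.bool)
mixed = torch.where(mask, attn_src, attn_tgt)
print(mixed.sum(dim=-1))  # rows no longer sum to 1 in general

# Swapping the logits before the softmax restores normalization:
mixed_logits = torch.where(mask, logits_src, logits_tgt)
renorm = mixed_logits.softmax(dim=-1)
print(renorm.sum(dim=-1))  # each row sums to ~1.0 again
```

So the question only matters for partial (per-token) mixing; a full-map swap after the softmax still yields properly normalized rows.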

g-jing commented 1 year ago

I guess that before the softmax, the absolute logit values do not have a specific meaning out of context, so swapping raw logits between two prompts would mix values that are not directly comparable.
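One way to see this point (a toy illustration, not code from the repo): the softmax is invariant to adding a constant to a row of logits, so the absolute scale of pre-softmax values carries no meaning on its own.

```python
import torch

# Two logit rows that differ only by a constant offset produce the
# exact same attention distribution after softmax:
logits = torch.tensor([1.0, 2.0, 3.0])
shifted = logits + 100.0

attn_a = logits.softmax(dim=-1)
attn_b = shifted.softmax(dim=-1)
print(torch.allclose(attn_a, attn_b))  # True
```

Since each prompt's logits may live at an arbitrary offset/scale, only the post-softmax distributions are directly comparable across prompts, which is one argument for swapping after the softmax.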