Hi, awesome paper!
Is it possible to integrate the cross-attention control mechanism into the memory-efficient attention formula?
From what I understand, cross-attention control makes its edits by modifying the attention map, but memory-efficient attention computes attention differently and never explicitly materializes the full attention map. How could the memory-efficient attention formula be tweaked to support cross-attention control? Is it possible to use both together?
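To make the question concrete, here is a rough NumPy sketch of the kind of thing I have in mind. It is not taken from the paper or from any library. It assumes a simplified memory-efficient variant that chunks only over queries, so each chunk still sees a full, normalized row of attention probabilities over the prompt tokens, which can be edited before multiplying by the values. The names `chunked_attention`, `edit_fn`, `make_reweight`, and `make_replace` are hypothetical, and the two edits are only stand-ins for whatever cross-attention control actually does.

```python
import numpy as np


def softmax(x):
    # Numerically stable softmax over the last axis.
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)


def full_attention(q, k, v, edit_fn=None):
    # Reference version: materializes the whole (n_queries, n_keys) map.
    probs = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    if edit_fn is not None:
        probs = edit_fn(probs, 0)
    return probs @ v


def chunked_attention(q, k, v, edit_fn=None, chunk_size=4):
    # Assumed simplification: chunk over queries only, so at most a
    # (chunk_size, n_keys) slice of the attention map exists at any time.
    out = np.empty((q.shape[0], v.shape[-1]), dtype=q.dtype)
    scale = 1.0 / np.sqrt(q.shape[-1])
    for start in range(0, q.shape[0], chunk_size):
        q_chunk = q[start:start + chunk_size]
        probs = softmax(q_chunk @ k.T * scale)
        if edit_fn is not None:
            # row_offset tells the edit which query rows this slice covers.
            probs = edit_fn(probs, start)
        out[start:start + chunk_size] = probs @ v
    return out


def make_reweight(token_index, factor):
    # Hypothetical edit: scale the attention paid to one prompt token.
    def edit(probs, row_offset):
        edited = probs.copy()
        edited[:, token_index] *= factor
        return edited
    return edit


def make_replace(source_probs):
    # Hypothetical edit: inject rows of a stored source attention map.
    def edit(probs, row_offset):
        return source_probs[row_offset:row_offset + probs.shape[0]]
    return edit
```

Since both edits act on each query row independently, the chunked version should give the same output as the full one. What I am unsure about is the case where attention is also chunked over keys with a running softmax, or computed in a fused kernel: there a normalized row never exists, so I do not see where such a hook would go.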
Thank you!