Status: Open. wusize opened this issue 3 months ago.
Hi!
A big thanks for your impressive work! Since full attention and causal attention are both included, I am curious how you implemented such attention masks if flash attention is used.
Best regards
Hi, we will try to implement it. In the meantime, here's a potential solution to your question: https://github.com/showlab/Show-o/issues/8
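For anyone reading along, here is a rough sketch of what a mask mixing causal and full attention could look like. It is only an illustration, not the code from this repository, and the linked issue may take a different approach. The function name `build_mixed_mask` and the `(kind, length)` segment layout are made up for this example. The assumed rule is that every token sees all earlier tokens, and tokens inside a "full" segment also see each other.

```python
def build_mixed_mask(segments):
    """Return mask[i][j] == True when query position i may attend to key position j.

    `segments` is a hypothetical layout: a list of (kind, length) pairs,
    where kind is "causal" or "full".
    """
    seg_id, seg_kind = [], []
    for idx, (kind, length) in enumerate(segments):
        if kind not in ("causal", "full"):
            raise ValueError(f"unknown segment kind: {kind!r}")
        # Tag every position with its segment index and that segment's kind.
        seg_id.extend([idx] * length)
        seg_kind.extend([kind] * length)

    n = len(seg_id)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Causal rule: a token always sees itself and everything before it.
            causal_ok = j <= i
            # Full rule: tokens inside the same "full" segment see each other.
            same_full_block = seg_kind[i] == "full" and seg_id[i] == seg_id[j]
            mask[i][j] = causal_ok or same_full_block
    return mask


# Example layout (assumed): 2 causal tokens, 3 full-attention tokens, 1 causal token.
mask = build_mixed_mask([("causal", 2), ("full", 3), ("causal", 1)])
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

A dense boolean mask like this can be passed to attention implementations that accept an explicit mask, for example the `attn_mask` argument of PyTorch's `torch.nn.functional.scaled_dot_product_attention`. Whether a particular flash attention kernel can consume it directly depends on the kernel, so please treat that part as an open question.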