Closed BurkeHulk closed 3 months ago
Tested the attention forward function with dummy inputs and verified that the outputs of `flash_attn_func` match the reference PyTorch attention computation.