jy0205 / Pyramid-Flow

Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
https://pyramid-flow.github.io/
MIT License
1.86k stars, 158 forks

Full 3D attention #120

Closed agneet42 closed 7 hours ago

agneet42 commented 1 day ago

Hi, thanks for the great work!

As mentioned in the paper, you employ full 3D attention instead of factorized spatial and temporal attention. Is this the line of code that implements it: https://github.com/jy0205/Pyramid-Flow/blob/main/pyramid_dit/modeling_mmdit_block.py#L396 ? If not, could you point me to the right place?

jy0205 commented 8 hours ago

Yes, this class is full-sequence attention.
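
For readers comparing the two schemes, here is a minimal sketch of the difference between full-sequence (full 3D) attention and factorized spatial-then-temporal attention. It is not the code from `modeling_mmdit_block.py`: all function names are hypothetical, it uses plain Python lists instead of tensors, and it assumes a single head with no learned projections or positional encoding.

```python
import math


def attention(q, k, v):
    """Single-head softmax attention over lists of vectors (no projections)."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for qi in q:
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w * vj[d] for w, vj in zip(weights, v)) / total
                    for d in range(len(v[0]))])
    return out


def full_3d_attention(video):
    """video[t][p] is the feature vector at spatial position p of frame t.

    All T * P tokens are flattened into one sequence, so every token
    attends to every other token across both space and time.
    """
    T, P = len(video), len(video[0])
    tokens = [video[t][p] for t in range(T) for p in range(P)]
    out = attention(tokens, tokens, tokens)
    return [out[t * P:(t + 1) * P] for t in range(T)]


def factorized_attention(video):
    """Spatial attention within each frame, then temporal attention
    across frames at each fixed spatial position."""
    T, P = len(video), len(video[0])
    spatial = [attention(frame, frame, frame) for frame in video]
    out = [[None] * P for _ in range(T)]
    for p in range(P):
        track = [spatial[t][p] for t in range(T)]
        mixed = attention(track, track, track)
        for t in range(T):
            out[t][p] = mixed[t]
    return out


def attended_pairs(T, P):
    """Number of query-key pairs scored by each scheme."""
    full = (T * P) ** 2
    factorized = T * P * P + P * T * T
    return full, factorized
```

With 4 frames of 6 spatial positions each, `attended_pairs(4, 6)` returns `(576, 240)`: the full-sequence variant scores every pair of tokens, which costs more but lets a token attend directly to any position in any frame in a single layer.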

agneet42 commented 7 hours ago

Thanks!