facebookresearch / xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.
https://facebookresearch.github.io/xformers/

Attention FLOP calculation when causal is set to True. #1033

Open kf-zhang opened 3 months ago

kf-zhang commented 3 months ago

❓ Questions and Help

I'm trying to understand the attention FLOP calculation as defined here. I'm confused by this specific section, which handles the FLOP calculation when 'causal' is set to True. It seems that the FLOP count is incorrect when the query's length is different from the key-value's length.

danthe3rd commented 3 months ago

> It seems that the FLOP count is incorrect when the query's length is different from the key-value's length

Yes indeed, you are right. I guess we also need to distinguish between top-left and bottom-right causal masks when num_kv != num_q. This information is not passed in the API at the moment. Out of curiosity, what are you using this function for?
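To make the top-left / bottom-right distinction concrete, here is a minimal sketch that counts the unmasked (query, key) pairs exactly and derives forward FLOPs from that count. The names `attended_pairs`, `attention_fwd_flops` and the `align` argument are hypothetical and not part of the xformers API; the sketch also assumes the usual convention of 2 FLOPs per multiply-add and ignores the softmax.

```python
def attended_pairs(num_q: int, num_kv: int, causal: bool = False,
                   align: str = "top_left") -> int:
    """Count the (query, key) pairs that are not masked out.

    Hypothetical helper: `align` picks which corner of the
    num_q x num_kv score matrix the causal diagonal is anchored to.
    """
    if not causal:
        return num_q * num_kv
    if align not in ("top_left", "bottom_right"):
        raise ValueError(f"unknown alignment: {align!r}")
    # top_left:     query i sees keys 0..i
    # bottom_right: query i sees keys 0..i + (num_kv - num_q)
    offset = 0 if align == "top_left" else num_kv - num_q
    # Clamp each row to [0, num_kv] visible keys.
    return sum(min(max(i + 1 + offset, 0), num_kv) for i in range(num_q))


def attention_fwd_flops(batch: int, heads: int, num_q: int, num_kv: int,
                        head_dim: int, causal: bool = False,
                        align: str = "top_left") -> int:
    """Forward FLOPs for Q @ K^T and P @ V, softmax excluded."""
    pairs = attended_pairs(num_q, num_kv, causal, align)
    # Two matmuls, each costing 2 * head_dim FLOPs per unmasked pair.
    return 4 * batch * heads * pairs * head_dim


if __name__ == "__main__":
    for align in ("top_left", "bottom_right"):
        print(align, attended_pairs(2, 4, causal=True, align=align))
```

With num_q = 2 and num_kv = 4 the dense count is 8 pairs, so halving it gives 4, while the exact counts are 3 (top-left) and 7 (bottom-right). When num_q == num_kv both alignments agree, and halving the dense count is only off by the diagonal.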

kf-zhang commented 3 months ago

I'm trying to calculate MFU and understand how FLOPs are counted. Many papers report their system's efficiency using MFU, but few explain how to count the FLOPs.
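For reference, MFU (model FLOPs utilization) is usually taken as the achieved model FLOPs per second divided by the hardware's peak FLOPs per second. A minimal sketch, assuming the FLOPs per step have already been counted and the peak throughput is read from the vendor's spec sheet; the function name `mfu` and the numbers in the example are hypothetical:

```python
def mfu(model_flops_per_step: float, step_time_s: float,
        peak_flops_per_s: float) -> float:
    """Model FLOPs utilization as a fraction between 0 and 1.

    model_flops_per_step: FLOPs the model needs for one step
                          (forward + backward), counted analytically.
    step_time_s:          measured wall-clock time of one step.
    peak_flops_per_s:     theoretical peak of the hardware (assumed
                          to come from the vendor's spec sheet).
    """
    if step_time_s <= 0 or peak_flops_per_s <= 0:
        raise ValueError("step_time_s and peak_flops_per_s must be positive")
    achieved_flops_per_s = model_flops_per_step / step_time_s
    return achieved_flops_per_s / peak_flops_per_s


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(f"MFU = {mfu(1e12, 0.5, 4e12):.2%}")
```

The result depends directly on how the FLOPs per step are counted, which is why an inaccurate causal attention count shifts the reported MFU.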