Hi, thank you for your amazing work! I read the whole paper and found it really inspiring!

I'm a little confused about line 76 of `src/linfusion/attention.py`: it calls a function named `self.batch_to_head_dim`. The same thing happens in `src/distrifuser/modules/pp/attn.py` at line 247. Is this a typo for `head_to_batch_dim`, or is there a missing import at those two places?
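For context, here is my understanding of the two helpers, paraphrased as standalone functions for illustration (in diffusers' `Attention` class they are methods, and `num_heads` here stands in for `self.heads`). They reshape in opposite directions, which is why the call site looked like a possible mix-up to me:

```python
import torch

def head_to_batch_dim(tensor: torch.Tensor, num_heads: int) -> torch.Tensor:
    # Split heads out: (batch, seq_len, dim) -> (batch * num_heads, seq_len, dim // num_heads)
    batch, seq_len, dim = tensor.shape
    tensor = tensor.reshape(batch, seq_len, num_heads, dim // num_heads)
    return tensor.permute(0, 2, 1, 3).reshape(batch * num_heads, seq_len, dim // num_heads)

def batch_to_head_dim(tensor: torch.Tensor, num_heads: int) -> torch.Tensor:
    # Merge heads back: (batch * num_heads, seq_len, head_dim) -> (batch, seq_len, head_dim * num_heads)
    batch_heads, seq_len, head_dim = tensor.shape
    tensor = tensor.reshape(batch_heads // num_heads, num_heads, seq_len, head_dim)
    return tensor.permute(0, 2, 1, 3).reshape(batch_heads // num_heads, seq_len, head_dim * num_heads)

# Round trip: splitting heads out and merging them back recovers the original shape.
x = torch.randn(2, 77, 320)
assert batch_to_head_dim(head_to_batch_dim(x, 8), 8).shape == x.shape
```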