axelmagn opened 7 months ago
Hi, I'm curious whether it would be possible to add this model. Is there anything I can do to help speed it along?
Model description
BASED is an attention architecture that combines sliding-window attention with global linear attention to capture dependencies similar to a transformer's at subquadratic cost.
It outperforms similar subquadratic models such as Mamba.
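To make the hybrid concrete, here is a minimal NumPy sketch of the two components being combined: exact softmax attention restricted to a local window, plus causal linear attention for global context. This is an illustrative assumption, not the actual BASED implementation — the 50/50 mix and the positive feature map are placeholders (the paper uses a Taylor approximation of exp as the feature map, and fused kernels for speed).

```python
import numpy as np

def sliding_window_attention(q, k, v, window):
    # Exact causal softmax attention, restricted to the last `window` positions.
    T, d = q.shape
    out = np.zeros_like(v)
    for t in range(T):
        s = max(0, t - window + 1)
        scores = q[t] @ k[s:t + 1].T / np.sqrt(d)
        w = np.exp(scores - scores.max())
        w /= w.sum()
        out[t] = w @ v[s:t + 1]
    return out

def linear_attention(q, k, v, feature=lambda x: np.maximum(x, 0.0) + 1e-6):
    # Causal linear attention via running prefix sums: O(T * d^2) total,
    # constant-size recurrent state (kv, z) instead of a growing KV cache.
    # The ReLU feature map here is a stand-in; BASED uses a Taylor
    # approximation of exp.
    T, d = q.shape
    phi_q, phi_k = feature(q), feature(k)
    kv = np.zeros((d, v.shape[1]))  # running sum of outer(phi_k, v)
    z = np.zeros(d)                 # running sum of phi_k (normalizer)
    out = np.zeros_like(v)
    for t in range(T):
        kv += np.outer(phi_k[t], v[t])
        z += phi_k[t]
        out[t] = (phi_q[t] @ kv) / (phi_q[t] @ z + 1e-6)
    return out

def based_style_mix(q, k, v, window=4):
    # Hypothetical combination: precise local attention + cheap global context.
    return 0.5 * sliding_window_attention(q, k, v, window) \
         + 0.5 * linear_attention(q, k, v)

rng = np.random.default_rng(0)
T, d = 16, 8
q, k, v = rng.normal(size=(3, T, d))
out = based_style_mix(q, k, v)
print(out.shape)  # (16, 8)
```

The point of the split is that the sliding window handles precise local token interactions (where softmax attention shines), while the linear-attention term carries long-range information in a fixed-size state, keeping the whole layer subquadratic in sequence length.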
Open source status
Provide useful links for the implementation