Open huguangcheng opened 1 week ago
Hi, thank you! I think the Mamba kernels do not yet support bidirectional modeling, so it might be difficult to train Mamba-1 with them for the encoder-decoder prefix LM.
You can try running the JRT prompt using the code in the lm-eval-harness folder: https://github.com/HazyResearch/prefix-linear-attention/blob/main/lm-eval-harness/prompt_scripts/run_jrt_prompt_hf.sh
Hello, I think your prefix linear attention is great! How can I integrate it into my model code? I'm using a Mamba-baseline SSM model.