HazyResearch / prefix-linear-attention


How to use your prefix linear attention #1

Open huguangcheng opened 1 week ago

huguangcheng commented 1 week ago

Hello, I think your prefix linear attention is great! How can I integrate it into my model code? I am using a Mamba-based SSM model as my baseline.

simran-arora commented 4 days ago

Hi, thank you! I think the Mamba kernels do not yet support bidirectional modeling, so it might be difficult to train Mamba-1 with bidirectional encoding over the prefix for the encoder-decoder prefix LM.
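For reference, the prefix-LM setup just means tokens inside the prefix attend to each other bidirectionally while everything after stays causal. A minimal sketch of that kind of attention mask (written in plain PyTorch for illustration, not taken from this repo's kernels):

```python
import torch

def prefix_lm_mask(prefix_len: int, total_len: int) -> torch.Tensor:
    """Boolean attention mask for a prefix LM: positions inside the prefix
    attend to each other bidirectionally, later positions attend causally."""
    # Start from a standard causal (lower-triangular) mask.
    mask = torch.tril(torch.ones(total_len, total_len, dtype=torch.bool))
    # Allow full (bidirectional) attention within the prefix block.
    mask[:prefix_len, :prefix_len] = True
    return mask

# Example: a 4-token prefix followed by 3 decoded tokens.
print(prefix_lm_mask(prefix_len=4, total_len=7).int())
```

The difficulty is that Mamba's scan kernels are written for the purely causal case, so there is no drop-in way to get the bidirectional block above.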

You can try running the JRT prompt using the code in the lm-eval-harness folder: https://github.com/HazyResearch/prefix-linear-attention/blob/main/lm-eval-harness/prompt_scripts/run_jrt_prompt_hf.sh
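The core idea of JRT prompting is just to repeat the in-context information so the recurrent model reads it twice before answering; the script above handles this per task. A rough sketch of the idea with a generic Hugging Face causal LM (the checkpoint name here is only a placeholder):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; swap in whichever model you are evaluating.
model_name = "EleutherAI/pythia-160m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

context = "The capital of France is Paris."
question = "What is the capital of France?"

# JRT-style prompt: repeat the context so the model sees it twice
# before the question.
prompt = f"{context}\n{context}\n{question}"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

This only changes the prompt, so it works with your existing Mamba baseline without touching the kernels.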