ofirpress / attention_with_linear_biases

Code for the ALiBi method for transformer language models (ICLR 2022)

MIT License

496 stars 38 forks

issues

ALiBi during inference

#20 VarunGumma closed 2 months ago
1
Implementation about ALiBi

#19 DreamShibei closed 7 months ago
1
implementation detail about alibi_mask

#18 bugm opened 9 months ago
0
For a value of `12` I'm seeing a jump in the plotting of values, e.g. `0.7` shown below.

#17 razodactyl opened 1 year ago
0
What is the extrapolation method used in the paper?

#16 XinyuDu closed 1 year ago
1
Can you tell me how to use ALiBi while fine-tuning a LLaMA model?

#15 kiran1501 closed 1 year ago
1
Have you initialized the model with other model checkpoints during training?

#14 Victoriaheiheihei closed 1 year ago
1
could we apply Alibi with rotary position embedding?

#13 xiaoxiawu-microsoft closed 1 year ago
1
Is there any easy way to get a HF compatible version of your checkpoints?

#12 petroskarypis closed 1 year ago
2
How can I apply ALiBi Position Encoding into huggingface model?

#11 hjsg1010 closed 1 year ago
3
How to perform sliding window evaluation?

#10 chijames closed 2 years ago
2
The numerical value of ALiBi attn_mask

#9 chijames closed 2 years ago
3
ALiBi in Parallel Attention

#8 conceptofmind closed 2 years ago
2
Explanation regarding multiplying linear biases with q.k^T

#7 sayakpaul closed 2 years ago
4
Integration with `transformers`

#6 sayakpaul closed 2 years ago
1
Modifying ALiBi for Encoder-Attention or Cross-Attention

#5 ofirpress opened 2 years ago
29
ALiBi in self-Attention

#4 Ldoun closed 2 years ago
2
ALiBi on LongformerEncoderDecoder

#3 beaupranisaa closed 2 years ago
1
unrecognized argument --max-lr

#2 cifkao closed 2 years ago
3
Fix preprocess.py path

#1 cifkao closed 2 years ago
3