ofirpress / attention_with_linear_biases

Code for the ALiBi method for transformer language models (ICLR 2022)
MIT License

Can you tell me how to use ALiBi while fine-tuning a LLaMA model? #15

Closed kiran1501 closed 1 year ago

kiran1501 commented 1 year ago

❓ Questions and Help

Before asking:

  1. search the issues.
  2. search the docs.

#### What is your question?

How can I use ALiBi while fine-tuning a LLaMA model?

#### Code

#### What have you tried?

#### What's your environment?

- fairseq Version (e.g., 1.0 or master):
- PyTorch Version (e.g., 1.0):
- OS (e.g., Linux):
- How you installed fairseq (`pip`, source):
- Build command you used (if compiling from source):
- Python version:
- CUDA/cuDNN version:
- GPU models and configuration:
- Any other relevant information:
ofirpress commented 1 year ago

AFAIK it is not possible to apply ALiBi to a model that was trained with RoPE, as LLaMA was. Please check out MPT-7B if you want a model pretrained with ALiBi.
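
For readers who land here: below is a minimal sketch (not the exact code from this repo) of the per-head additive bias that ALiBi applies to the attention logits, assuming a power-of-two number of heads and a causal mask. It illustrates the incompatibility: RoPE rotates the query/key vectors themselves, so a RoPE-pretrained model such as LLaMA has no equivalent additive bias to resume fine-tuning from.

```python
import torch

def alibi_slopes(num_heads: int) -> torch.Tensor:
    # Geometric sequence of per-head slopes: 2^(-8/n), 2^(-16/n), ..., 2^(-8),
    # assuming num_heads is a power of two.
    start = 2.0 ** (-8.0 / num_heads)
    return torch.tensor([start ** (i + 1) for i in range(num_heads)])

def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    # distance[i, j] = j - i: zero on the diagonal, negative for past keys.
    positions = torch.arange(seq_len)
    distance = positions[None, :] - positions[:, None]      # (seq_len, seq_len)
    slopes = alibi_slopes(num_heads)[:, None, None]         # (num_heads, 1, 1)
    return slopes * distance[None, :, :]                    # (num_heads, seq_len, seq_len)

# Usage: the bias is simply added to the query-key attention logits before
# softmax, alongside the usual causal mask, e.g.
#   scores = scores + alibi_bias(num_heads, seq_len)
```

If you want to start from an ALiBi-pretrained checkpoint, MPT-7B can (as of its release) be loaded through Hugging Face `transformers` with `trust_remote_code=True`.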