meta-llama / llama-models

Utilities intended for use with Llama models.

RoPE scale factor for llama3.2 #166

Open NTT123 opened 2 weeks ago

NTT123 commented 2 weeks ago

The reference implementation uses a RoPE scale factor of 8, which matches the Hugging Face config files for the 3.1 models.

However, the new 3.2 models use a scale factor of 32 in their Hugging Face config (https://huggingface.co/meta-llama/Llama-3.2-1B/blob/main/config.json#L23). Could this mismatch cause issues when running 3.2 checkpoints through the reference implementation? The hardcoded value is here: https://github.com/meta-llama/llama-models/blob/4269717b2ea587627903bacbb75ccce1427ad914/models/llama3/reference_impl/model.py#L47
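For context, here is a minimal sketch of how the scale factor might enter the frequency scaling, and which RoPE frequencies would differ between 8 and 32. It is written in plain Python rather than copied from `model.py`. The names `scale_rope_freqs` and `base_freqs` are hypothetical, and the defaults (`low_freq_factor=1`, `high_freq_factor=4`, `old_context_len=8192`, `theta=500000`, `head_dim=64`) are assumptions for illustration that should be checked against the actual config.

```python
import math


def scale_rope_freqs(freqs, scale_factor, low_freq_factor=1.0,
                     high_freq_factor=4.0, old_context_len=8192):
    """Rescale RoPE frequencies by wavelength band (illustrative sketch).

    All default values are assumptions, not confirmed by this thread.
    """
    low_freq_wavelen = old_context_len / low_freq_factor
    high_freq_wavelen = old_context_len / high_freq_factor
    scaled = []
    for freq in freqs:
        wavelen = 2 * math.pi / freq
        if wavelen < high_freq_wavelen:
            # Short wavelengths (high frequencies) are left untouched.
            scaled.append(freq)
        elif wavelen > low_freq_wavelen:
            # Long wavelengths (low frequencies) are divided by the scale factor.
            scaled.append(freq / scale_factor)
        else:
            # In between, interpolate linearly between scaled and unscaled.
            smooth = (old_context_len / wavelen - low_freq_factor) / (
                high_freq_factor - low_freq_factor
            )
            scaled.append((1 - smooth) * freq / scale_factor + smooth * freq)
    return scaled


def base_freqs(head_dim=64, theta=500000.0):
    """Unscaled RoPE frequencies; head_dim and theta are assumed values."""
    return [1.0 / theta ** (i / head_dim) for i in range(0, head_dim, 2)]


freqs = base_freqs()
f8 = scale_rope_freqs(freqs, scale_factor=8)
f32 = scale_rope_freqs(freqs, scale_factor=32)

# Indices of the frequency bands where the two settings disagree.
changed = [i for i, (a, b) in enumerate(zip(f8, f32)) if a != b]
print(f"{len(changed)} of {len(freqs)} frequencies differ between 8 and 32")
```

Under these assumptions, only the low-frequency and transition bands depend on the scale factor, so a mismatch would mainly affect how positions are encoded at long range. That would be consistent with it being hard to notice on short prompts.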

karan-dalal commented 1 week ago

+1

karan-dalal commented 2 days ago

After running some long-context evals, 32 seems to be the correct scale factor for the 3.2 models, @NTT123.