Closed jquesnelle closed 4 months ago
This allows changing RoPE's theta hyperparameter for Llama models. For example, Llama 3 uses theta = 500000 instead of the default 10000.
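A minimal sketch of what the theta hyperparameter controls: RoPE derives its per-dimension rotation frequencies from theta, so raising it (as Llama 3 does with 500000) slows the frequency decay and effectively stretches the positional wavelengths. The function name and head dimension below are illustrative, not the PR's actual code.

```python
def rope_inv_freq(dim: int, theta: float = 10000.0) -> list[float]:
    # RoPE inverse frequencies: theta^(-2i/dim) for each rotary pair i.
    return [theta ** (-2 * i / dim) for i in range(dim // 2)]

# Default theta vs. the Llama 3 value from the PR description.
default_freqs = rope_inv_freq(128)                  # theta = 10000
llama3_freqs = rope_inv_freq(128, theta=500000.0)   # theta = 500000
```

With the larger theta, the highest-index (lowest-frequency) components rotate more slowly, which is why the value must be configurable per model rather than hardcoded.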
Thanks for the PR. Merged!