Closed copybara-service[bot] closed 2 months ago
Set ROPE base theta in AttentionConfig.
This is for Qwen models, which use the same architecture as Llama except for the RoPE base theta value.
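As a rough illustration of why the base theta needs to be configurable, the sketch below computes the standard RoPE inverse frequencies from a config value. `AttentionConfigSketch` and its `rotary_base` field are hypothetical stand-ins, not the actual `AttentionConfig` definition touched by this change, and the base values of 10,000 and 1,000,000 are illustrative assumptions; check each model's published config for the real value.

```python
from dataclasses import dataclass


@dataclass
class AttentionConfigSketch:
    """Hypothetical stand-in for AttentionConfig; field names are assumptions."""
    num_heads: int
    head_dim: int
    rotary_base: float = 10_000.0  # RoPE base theta (assumed default)


def rope_inv_freq(config: AttentionConfigSketch) -> list[float]:
    """Standard RoPE inverse frequencies: base ** (-2i / head_dim)."""
    half = config.head_dim // 2
    return [config.rotary_base ** (-2.0 * i / config.head_dim) for i in range(half)]


def rope_angles(config: AttentionConfigSketch, position: int) -> list[float]:
    """Rotation angle applied to each dimension pair at a given token position."""
    return [position * freq for freq in rope_inv_freq(config)]


# Same architecture, different base theta (illustrative values only).
llama_like = AttentionConfigSketch(num_heads=32, head_dim=128)
qwen_like = AttentionConfigSketch(num_heads=32, head_dim=128, rotary_base=1_000_000.0)
```

With everything else held equal, a larger base makes the higher-index dimension pairs rotate more slowly, which is why the value cannot be hard-coded when two models share the rest of the attention layout.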