Closed: AlpinDale closed this issue 1 month ago.
There are some models for long-context tasks like storywriting that it would be nice to use with a static RoPE scaling factor. +1 on this!
Hi, I'm getting an error related to RoPE scaling factors for larger-context models like Microsoft Phi-3-medium in exl2 format.
I think it's related to this. Maybe not much needs to be done here beyond implementing this code; I will test whether it breaks anything else. Here is the vLLM PR for this feature: https://github.com/vllm-project/vllm/pull/4638
vLLM has implemented rotary embedding scaling like this: https://github.com/vllm-project/vllm/pull/4298
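For context, the core idea behind a static linear RoPE scaling factor (the "position interpolation" style these PRs support) is just dividing token positions by the factor before computing rotation angles. A minimal sketch, not the actual vLLM or Aphrodite implementation (the function names here are illustrative):

```python
def rope_frequencies(dim: int, base: float = 10000.0) -> list[float]:
    """Per-pair inverse frequencies used by rotary position embeddings."""
    return [base ** (-2 * i / dim) for i in range(dim // 2)]

def rope_angles(position: int, dim: int, scaling_factor: float = 1.0) -> list[float]:
    """Rotation angles for one token position.

    Linear ("position interpolation") scaling divides the position by a
    static factor, squeezing a longer context into the position range the
    model was trained on.
    """
    scaled_pos = position / scaling_factor
    return [scaled_pos * f for f in rope_frequencies(dim)]

# With factor 2.0, position 4096 yields the same angles the model
# produced at position 2048 during training.
assert rope_angles(4096, 128, scaling_factor=2.0) == rope_angles(2048, 128)
```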
Added in v0.6.0.
Currently, we auto-scale using the `--max-model-len` argument. It may be more appropriate to have specific options for the scaling factor, etc.
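Auto-scaling from `--max-model-len` presumably amounts to deriving the factor from the ratio of the requested context length to the model's trained maximum, rather than accepting it directly. A hypothetical sketch of that derivation (the helper name and exact clamping are assumptions, not the engine's actual code):

```python
def auto_rope_factor(max_model_len: int, trained_max_len: int) -> float:
    """Derive a linear RoPE scaling factor from the requested context
    length, as auto-scaling from --max-model-len might do.

    Hypothetical helper for illustration; never scale below 1.0 so that
    contexts within the trained range are left untouched.
    """
    return max(1.0, max_model_len / trained_max_len)

# e.g. a model trained on a 4k context, served at 16k:
print(auto_rope_factor(16384, 4096))  # 4.0
```

A dedicated option (e.g. an explicit factor flag) would let users pick the factor independently of the served context length.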