Closed: AlpinDale closed this issue 1 month ago.
There are some models for long-context tasks like storywriting that it would be nice to use with a static RoPE scaling factor. +1 on this!
Hi, I'm getting an error related to RoPE scaling factors for larger-context models like Microsoft Phi-3-medium in exl2 format.
I think it's related to this. Maybe not much needs to be done here beyond implementing this code; I will test whether it breaks anything else. Here is the vLLM PR for this feature: https://github.com/vllm-project/vllm/pull/4638
vLLM has implemented rotary embedding scaling like this: https://github.com/vllm-project/vllm/pull/4298
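For context, the core idea behind a static linear RoPE scaling factor (the "position interpolation" style these PRs support) is just dividing token positions by the factor before computing rotation angles. A minimal sketch, not the actual vLLM or Aphrodite implementation (the function names here are illustrative):

```python
def rope_frequencies(dim: int, base: float = 10000.0) -> list[float]:
    """Per-pair inverse frequencies used by rotary position embeddings."""
    return [base ** (-2 * i / dim) for i in range(dim // 2)]

def rope_angles(position: int, dim: int, scaling_factor: float = 1.0) -> list[float]:
    """Rotation angles for one token position.

    Linear ("position interpolation") scaling divides the position by a
    static factor, squeezing a longer context into the position range the
    model was trained on.
    """
    scaled_pos = position / scaling_factor
    return [scaled_pos * f for f in rope_frequencies(dim)]

# With factor 2.0, position 4096 yields the same angles the model
# produced at position 2048 during training.
assert rope_angles(4096, 128, scaling_factor=2.0) == rope_angles(2048, 128)
```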
Added in v0.6.0.
Currently, we auto-scale using the `--max-model-len` argument. It may be more appropriate to have specific options for the scaling factor, etc.
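Auto-scaling from `--max-model-len` presumably amounts to deriving the factor from the ratio of the requested context length to the model's trained maximum, rather than accepting it directly. A hypothetical sketch of that derivation (the helper name and exact clamping are assumptions, not the engine's actual code):

```python
def auto_rope_factor(max_model_len: int, trained_max_len: int) -> float:
    """Derive a linear RoPE scaling factor from the requested context
    length, as auto-scaling from --max-model-len might do.

    Hypothetical helper for illustration; never scale below 1.0 so that
    contexts within the trained range are left untouched.
    """
    return max(1.0, max_model_len / trained_max_len)

# e.g. a model trained on a 4k context, served at 16k:
print(auto_rope_factor(16384, 4096))  # 4.0
```

A dedicated option (e.g. an explicit factor flag) would let users pick the factor independently of the served context length.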