IBM / text-generation-inference

IBM development fork of https://github.com/huggingface/text-generation-inference
Apache License 2.0

feat: support linear scaled rope for tgis_native llama #61

Closed joerunde closed 7 months ago

joerunde commented 7 months ago

Implements a new LinearScalingPositionRotaryEmbedding layer that applies linear scaling to position ids when computing rotary embeddings. Without this, models with a linear rope_scaling configuration would load without error but produce garbage output.

The changes are based on an inspection of the LlamaLinearScalingRotaryEmbedding implementation in Transformers. In essence, the position ids are divided by the scaling factor before the cosine and sine are computed.
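
For illustration, here is a minimal sketch of the idea in plain Python. It is not the code from this PR: the function name `rope_cos_sin`, its parameters, and the default base of 10000 are assumptions made for the example. It only shows that dividing each position id by the scaling factor before taking the cosine and sine stretches the positional range.

```python
import math


def rope_cos_sin(position_ids, dim, base=10000.0, scaling_factor=1.0):
    """Hypothetical helper: cos/sin tables for rotary embeddings.

    With scaling_factor > 1, position ids are compressed linearly, so a
    model can address longer sequences with the same angle range.
    """
    # One inverse frequency per pair of embedding dimensions.
    inv_freq = [1.0 / (base ** (i / dim)) for i in range(0, dim, 2)]

    cos_table, sin_table = [], []
    for pos in position_ids:
        # Linear scaling: the only change from unscaled rotary embeddings.
        scaled_pos = pos / scaling_factor
        angles = [scaled_pos * f for f in inv_freq]
        cos_table.append([math.cos(a) for a in angles])
        sin_table.append([math.sin(a) for a in angles])
    return cos_table, sin_table


if __name__ == "__main__":
    # Position 4 with a scaling factor of 2 gets the same angles as
    # position 2 without scaling.
    scaled_cos, _ = rope_cos_sin([4], dim=8, scaling_factor=2.0)
    plain_cos, _ = rope_cos_sin([2], dim=8)
    print(scaled_cos == plain_cos)
```

With a scaling factor of 1 the function reduces to unscaled rotary embeddings, which is why a model configured for linear scaling produces poor output when the factor is ignored: every position is mapped to angles the model was not trained on.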