xPos is an improved version of the original RoPE from the RoFormer paper (i.e. a modification of ggml_rope with the !is_neox flag). I'm unaware of published models using it yet, but it is important because it's the positional embedding employed by the RetNet paper (which supports O(1) inference, i.e. independent of context length, with non-degraded quality; a potential superior replacement for attention with KV caches).
For a quick Python comparison of the original RoPE, GPT-NeoX RoPE and xPos RoPE see: https://github.com/jploski/RotaryEmbedding
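To illustrate the idea, here is a minimal NumPy sketch of xPos in the interleaved (original-RoPE, !is_neox) layout. The function name and parameter defaults (scale_base=512, gamma=0.4, taken from the xPos paper's suggested values) are my own choices for illustration, not part of the ggml API; keys use the inverse scale (downscale=True) so the decay depends only on the relative distance between query and key positions.

```python
import numpy as np

def xpos_rope(x, pos, base=10000.0, scale_base=512.0, gamma=0.4, downscale=False):
    """Sketch of xPos rotary embedding, interleaved-pair layout.
    x: (seq_len, head_dim) with even head_dim; pos: (seq_len,) positions.
    downscale=True applies the inverse xPos scale (used for keys)."""
    d = x.shape[-1]
    half = d // 2
    freqs = base ** (-np.arange(half) / half)                 # standard RoPE frequencies
    angles = pos[:, None] * freqs[None, :]                    # (seq_len, half)
    # xPos per-pair decay factor zeta_i, raised to pos/scale_base
    zeta = (np.arange(half) / half + gamma) / (1.0 + gamma)
    power = pos[:, None] / scale_base
    scale = zeta[None, :] ** (-power if downscale else power)
    cos = np.cos(angles) * scale
    sin = np.sin(angles) * scale
    x1, x2 = x[:, 0::2], x[:, 1::2]                           # interleaved pairs
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

At position 0 the transform is the identity, and the query/key inner product is invariant under a common shift of both positions, which is the relative-position property xPos shares with plain RoPE (plus the added long-range decay).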
Based on the above, I will create a pull request with an initial (non-CUDA) implementation.