long8v / PTIR

Paper Today I Read

[112] RoFormer: Enhanced Transformer with Rotary Position Embedding #121

long8v opened 1 year ago

long8v commented 1 year ago
[figure]

paper, code

TL;DR

Details

Related Work: PEs

[figure]

clipping

[figure]
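If the "clipping" noted above refers to clipping the relative distance before looking up a learned relative position embedding (as in Shaw et al.-style relative PEs), here is a rough sketch of that idea; the function name and the `max_dist` parameter are my own, not from the paper.

```python
import numpy as np

def clipped_relative_index(i, j, max_dist=4):
    """Map the offset j - i into [-max_dist, max_dist] and shift it to
    [0, 2 * max_dist]; all farther positions share one embedding bucket."""
    return int(np.clip(j - i, -max_dist, max_dist)) + max_dist

# Offsets beyond the clip range collapse into the same bucket.
print(clipped_relative_index(0, 3))    # 7
print(clipped_relative_index(0, 10))   # 8 (clipped)
print(clipped_relative_index(0, 100))  # 8 (same bucket as offset 10)
```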

Proposed

[figures]

The figures illustrate the case d = 2.

[figures]

Once each query/key vector is rotated by (its position index × the base angle), computing the attention score automatically yields the relative position information. From the paper: "Specifically, incorporating the relative position embedding is straightforward: simply rotate the affine-transformed word embedding vector by amount of angle multiples of its position index and thus interprets the intuition behind Rotary Position Embedding."
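A minimal numerical sketch of the d = 2 case (my own illustration, not the authors' code): rotating a query at position m and a key at position n by m·θ and n·θ makes their dot product depend only on the offset m − n.

```python
import numpy as np

def rotate_2d(x, pos, theta=0.3):
    """Rotate a 2-D vector by pos * theta radians (RoPE with d = 2)."""
    a = pos * theta
    R = np.array([[np.cos(a), -np.sin(a)],
                  [np.sin(a),  np.cos(a)]])
    return R @ x

q = np.array([1.0, 0.5])
k = np.array([0.2, -0.7])

# Same relative offset (3) at different absolute positions -> same score.
s1 = rotate_2d(q, 5) @ rotate_2d(k, 2)
s2 = rotate_2d(q, 10) @ rotate_2d(k, 7)
print(np.isclose(s1, s2))  # True
```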

Extending to d dimensions:

[figure]
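A hedged sketch of the general case, assuming the usual pairing of dimensions with per-pair frequencies θ_i = 10000^(−2i/d) from the block-diagonal rotation matrix above; the function name `apply_rope` is mine.

```python
import numpy as np

def apply_rope(x, pos, base=10000.0):
    """Rotary embedding for a d-dim vector at position `pos`:
    dimension pair i is rotated by the angle pos * base**(-2i/d)."""
    d = x.shape[-1]
    theta = base ** (-2 * np.arange(d // 2) / d)  # per-pair frequencies
    cos, sin = np.cos(pos * theta), np.sin(pos * theta)
    x1, x2 = x[0::2], x[1::2]
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin
    out[1::2] = x1 * sin + x2 * cos
    return out

q, k = np.random.randn(8), np.random.randn(8)
# The attention score still depends only on the relative offset.
s1 = apply_rope(q, 3) @ apply_rope(k, 1)      # offset 2
s2 = apply_rope(q, 12) @ apply_rope(k, 10)    # offset 2
print(np.allclose(s1, s2))  # True
```

The same rotation is applied to queries and keys before the dot product, so no extra learned parameters are introduced.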

Result

[figures]
long8v commented 1 year ago

Related work

[figure]

A Length-Extrapolatable Transformer https://arxiv.org/pdf/2212.10554.pdf