-
Is fast RoPE exactly equivalent to LLaMA's `apply_rotary_pos_emb`? I constructed a test case and found that the results are not exactly equivalent. Is there anything wrong with my test case?
code:
---------…
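For reference (this is not the original test case): one common reason two RoPE implementations disagree elementwise is the pairing convention. The paper rotates adjacent pairs (x_0, x_1), (x_2, x_3), …, while LLaMA/HF-style `rotate_half` pairs dimension i with dimension i + d/2. A minimal pure-Python sketch of the two conventions, assuming the standard base of 10000 (function names here are illustrative, not from any library):

```python
import math

def rope_interleaved(x, pos, base=10000.0):
    """RoPE as in the paper: rotate adjacent pairs (x0,x1), (x2,x3), ..."""
    d = len(x)
    out = [0.0] * d
    for i in range(d // 2):
        theta = pos * base ** (-2 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        out[2 * i] = x[2 * i] * c - x[2 * i + 1] * s
        out[2 * i + 1] = x[2 * i] * s + x[2 * i + 1] * c
    return out

def rope_half_split(x, pos, base=10000.0):
    """RoPE in the LLaMA/HF rotate_half style: pair x_i with x_{i + d/2}."""
    d = len(x)
    half = d // 2
    out = [0.0] * d
    for i in range(half):
        theta = pos * base ** (-2 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        out[i] = x[i] * c - x[i + half] * s
        out[i + half] = x[i] * s + x[i + half] * c
    return out

x = [1.0, 2.0, 3.0, 4.0]
a = rope_interleaved(x, pos=1)
b = rope_half_split(x, pos=1)
print(a)
print(b)  # differs elementwise from a

# Both are orthogonal rotations, so both preserve the vector norm:
norm = lambda v: math.sqrt(sum(t * t for t in v))
print(abs(norm(a) - norm(x)) < 1e-9, abs(norm(b) - norm(x)) < 1e-9)
```

If the fused kernel and `apply_rotary_pos_emb` use different pairings (or the projection weights were not permuted to match), the outputs will differ elementwise even though both are valid rotary encodings.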
-
# URL
- https://arxiv.org/abs/2104.09864
# Affiliations
- Jianlin Su, N/A
- Yu Lu, N/A
- Shengfeng Pan, N/A
- Ahmed Murtadha, N/A
- Bo Wen, N/A
- Yunfeng Liu, N/A
# Abstract
- Position e…
-
### Description of the bug:
Hi @pkgoogle,
I have some questions about the compute graph with TinyLlama.
- I can't see the rotary position encoding in the compute graph; I can only see `tok embedding`.…
-
https://arxiv.org/abs/2104.09864
https://blog.eleuther.ai/rotary-embeddings/
-
I noticed that:
```
def get_rotary_matrix(context_window, embedding_dim):
    R = torch.zeros((context_window, embedding_dim, embedding_dim), requires_grad=False)
    for position in range(context…
```
nkkbr updated
5 months ago
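As a cross-check of the construction in the snippet above (a hedged sketch, not the original code: `rotary_matrix` and `apply_rope` are hypothetical helper names, and the loop body is reconstructed from the standard RoPE definition), the full block-diagonal rotation matrix and the cheaper elementwise form should agree:

```python
import math

def rotary_matrix(pos, d, base=10000.0):
    """Full d x d block-diagonal rotation matrix for one position."""
    R = [[0.0] * d for _ in range(d)]
    for i in range(d // 2):
        theta = pos * base ** (-2 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        R[2 * i][2 * i] = c
        R[2 * i][2 * i + 1] = -s
        R[2 * i + 1][2 * i] = s
        R[2 * i + 1][2 * i + 1] = c
    return R

def apply_rope(x, pos, base=10000.0):
    """Equivalent elementwise form: O(d) work instead of a d x d matmul."""
    d = len(x)
    out = [0.0] * d
    for i in range(d // 2):
        theta = pos * base ** (-2 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        out[2 * i] = x[2 * i] * c - x[2 * i + 1] * s
        out[2 * i + 1] = x[2 * i] * s + x[2 * i + 1] * c
    return out

x = [0.5, -1.0, 2.0, 3.0]
R = rotary_matrix(2, len(x))
via_matmul = [sum(R[r][c] * x[c] for c in range(len(x))) for r in range(len(x))]
via_elementwise = apply_rope(x, 2)
print(all(abs(u - v) < 1e-12 for u, v in zip(via_matmul, via_elementwise)))  # True
```

Because the rotation matrix is sparse (2×2 blocks on the diagonal), most implementations use the elementwise cos/sin form rather than materializing R.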
-
- https://arxiv.org/abs/2104.09864
- 2021
Positional encoding in the Transformer architecture provides supervision for modeling dependencies between elements at different positions in a sequence.
In this work, we investigate various methods of encoding positional information in Transformer-based language models and propose a novel implementation named Rotary Position Embedding (RoPE).
…
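For context, the key property behind RoPE can be written as follows (the standard formulation from the paper, restated here):

```latex
% Queries and keys at positions m, n are rotated before the dot product:
f_q(x_m, m) = R^d_{\Theta, m} W_q x_m, \qquad
f_k(x_n, n) = R^d_{\Theta, n} W_k x_n
% Since R_m^\top R_n = R_{n-m}, the attention score depends only on the
% relative offset m - n:
\langle f_q(x_m, m),\, f_k(x_n, n) \rangle
  = (W_q x_m)^{\top} R^d_{\Theta, n-m} W_k x_n
% with rotation frequencies
\theta_i = 10000^{-2(i-1)/d}, \quad i \in \{1, \dots, d/2\}
```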
e4exp updated
3 years ago
-
Hi, I tried to finetune the Llama2-7b-chat model using Megatron. I downloaded the HF checkpoint and converted it to a GPT Megatron checkpoint, referring to [https://github.com/NVIDIA/Megatron-LM/blob/fe1640a3cc48…
-
[paper](https://arxiv.org/pdf/2104.09864.pdf), [code](https://github.com/huggingface/transformers/blob/v4.28.1/src/transformers/models/roformer/modeling_roformer.py#L318-L343)
## TL;DR
- **I r…
-
### Your current environment
I used version 0.4.3, installed via pip, CUDA version 12.0, A100 GPU.
RuntimeError: t == DeviceType::CUDA INTERNAL ASSERT FAILED
### 🐛 Describe the bug
```
INFO 06-02 03…
```
-
### Your current environment
```text
Versions of relevant libraries:
[pip3] flashinfer==0.0.9+cu121torch2.3
[pip3] numpy==1.26.4
[pip3] nvidia-nccl-cu12==2.20.5
[pip3] sentence-transformers==3.0…
```