rotary-position-embedding Search Results

lucidrains/rotary-embedding-torch #30

RoPE embeddings

My conclusion about changing the positional encoding is that NoPE and ALiBi do not work well for encoder-only models because, unlike decoder-only models, they do not understand position at all (they are …

PRamoneda updated 3 weeks ago

NVIDIA/Megatron-LM #453

[BUG] RoPE ignores position IDs

**Describe the bug** If using `--reset-position-ids`, the RoPE implementation does not take this into account; it will still use the embeddings from position 0 to sequence length - 1. **To Reprodu…

janEbert updated 3 weeks ago
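
The behavior reported above can be illustrated with a minimal NumPy sketch. This is not Megatron-LM code: the function names `rope_angles` and `apply_rope`, the interleaved pairing of dimensions, and the base of 10000 are all assumptions chosen for illustration. It compares rotating with explicit position IDs that restart at a document boundary against rotating with positions 0 to sequence length - 1.

```python
import numpy as np

def rope_angles(position_ids, dim, base=10000.0):
    """Return cos/sin tables of shape (seq_len, dim // 2) for the given positions."""
    # One inverse frequency per pair of dimensions (hypothetical helper).
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    angles = np.outer(np.asarray(position_ids, dtype=np.float64), inv_freq)
    return np.cos(angles), np.sin(angles)

def apply_rope(x, position_ids, base=10000.0):
    """Rotate each pair (x[2i], x[2i+1]) of a row by the angle for that row's position."""
    cos, sin = rope_angles(position_ids, x.shape[-1], base)
    x_even, x_odd = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x_even * cos - x_odd * sin
    out[..., 1::2] = x_even * sin + x_odd * cos
    return out

# Two packed documents of length 3: positions restart at the boundary.
reset_ids = np.array([0, 1, 2, 0, 1, 2])
# What an implementation that ignores position IDs would effectively use.
naive_ids = np.arange(6)

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 8))

with_reset = apply_rope(x, reset_ids)
without_reset = apply_rope(x, naive_ids)
```

In this sketch the first document is rotated identically in both cases, while the second document differs because its tokens are treated as positions 3 to 5 instead of 0 to 2.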

huggingface/text-embeddings-inference #418

Support jinaai/jina-embeddings-v3

### Model description jina-embeddings-v3 is a multilingual multi-task text embedding model designed for a variety of NLP applications. Based on the [Jina-XLM-RoBERTa architecture](https://huggingface…

luonist updated 1 week ago

NVIDIA/TensorRT-LLM #2255

[bug] --use_paged_context_fmha enable broken

My model is ```json { "mlp_bias": false, "attn_bias": false, "rotary_base": 300000, "rotary_scaling": null, "residual_mlp": false, "disable_weight_only_quant_plugin": false, …

akhoroshev updated 1 month ago

unslothai/unsloth #837

Fast Rotary Position Embedding value different to transforme…

I'm testing unsloth rope and here is my script: ```python import torch from unsloth.kernels.rope_embedding import fast_rope_embedding from unsloth.models.llama import LlamaRotaryEmbedding as Uns…

fahadh4ilyas updated 3 months ago

tenstorrent/tt-metal #14107

Support ROPE op for Llama in decode mode

Today, ROPE in decode mode is implemented as a matmul, where rot_mat is precomputed on host based on the sin/cos for each user's position_id. What we want is for the `sin,cos: [max_seq_len=128k, he…

cglagovichTT updated 3 weeks ago

NVIDIA/TensorRT-LLM #2344

When I used convert_checkpoint.py to convert Gemma hf format…

System Info: CPU architecture (x86_64); CPU/Host memory size (64GB); GPU properties: GPU name (NVIDIA RTX4090), GPU memory size (24GB); Libraries: TensorRT-LLM branch or tag (v0.13.0); Versions of Tenso…

imilli updated 1 day ago

ollama/ollama #6922

Support for jinaai/jina-embeddings-v3 embedding model

jina-embeddings-v3 is a multilingual multi-task text embedding model designed for a variety of NLP applications. Based on the [Jina-XLM-RoBERTa architecture](https://huggingface.co/jinaai/xlm-roberta-…

sakthi-geek updated 4 days ago

vikhyat/moondream #123

Index out of bounds when used in Open Interpreter

Trying to have Open Interpreter describe images locally. Errors every time. I use a Mac with Apple silicon Not sure if the issue is with how Open Interpreter is passing images to moondream [file](h…

MikeBirdTech updated 1 month ago

vllm-project/vllm #3488

[Bug]: DynamicNTKScalingRotaryEmbedding implementation is di…

### Your current environment ```text The output of `python collect_env.py` ``` ### 🐛 Describe the bug There is a difference in the vLLM implementation of DynamicNTKScalingRotaryEmbedding from t…

killawhale2 updated 2 weeks ago

476 results for rotary-position-embedding
