The [server](https://github.com/ggerganov/llama.cpp/tree/master/examples/server) example has been growing in functionality and unfortunately I feel it is not very stable at the moment and there are so…
-
### Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
### Is there an existing answer for this in the FAQ?
-
This might be of interest:
https://huggingface.co/papers/2402.11131
-
Howdy. It seems that when I run a vLLM server and then attempt to interact with it via `HFClientVLLM`, I get an error message. Here is how to reproduce:
```bash
# Computer 1
pip install ray==2.20…
```
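For context on what the client is talking to: vLLM exposes an OpenAI-compatible HTTP API, so the request `HFClientVLLM` ultimately sends can be reproduced by hand. A minimal sketch of building such a request body is below; the `/v1/completions` path and the default port 8000 are vLLM's defaults, and the helper function name is my own, not part of either library.

```python
import json


def build_completion_request(model: str, prompt: str, max_tokens: int = 64) -> str:
    """Build the JSON body for a POST to vLLM's OpenAI-compatible
    /v1/completions endpoint (served at http://<host>:8000 by default)."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
    }
    return json.dumps(payload)


# Model name taken from the repro above.
body = build_completion_request("mistralai/Mistral-7B-Instruct-v0.3", "Hello")
print(json.loads(body)["model"])  # → mistralai/Mistral-7B-Instruct-v0.3
```

Sending this body with `requests.post` (header `Content-Type: application/json`) against the server on Computer 1 is a quick way to check whether the server side works before blaming the DSPy client.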
-
### Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
### Is there an existing answer for this in the FAQ?
-
At @onefact we have been using wasm, but this won't work for the encoder-only or encoder-decoder models I've built (e.g. http://arxiv.org/abs/1904.05342). That's because the wasm VM is for the CPU (ha…
-
### Your current environment
Using official Docker image.
### 🐛 Describe the bug
Using Docker image: vllm/vllm-openai:latest
Params:
```
--model=mistralai/Mistral-7B-Instruct-v0.3
--gpu-memo…
```
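For reference, a minimal invocation of that image with the one complete parameter from the report might look like the sketch below; the GPU flag, port mapping, and HF cache mount are illustrative assumptions on my part, and the truncated flags from the report are deliberately left out.

```shell
# Hypothetical sketch, not the reporter's exact command.
# Port mapping and cache mount are assumptions.
docker run --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -p 8000:8000 \
  vllm/vllm-openai:latest \
  --model=mistralai/Mistral-7B-Instruct-v0.3
```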
-
I ran into a series of issues trying to get vLLM stood up on a system with multiple MI210s. I figured I'd document my issues and workarounds so that someone could pick up the baton later, or at least …
-
### Your current environment
```text
The output of `python collect_env.py`
```
```
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build P…
```