-
I am using the `tritonserver:24.08-trtllm-python-py3` image for building and deploying the Llama-3.1-8B-Instruct engine.
In my attempt to serve the model with `tritonserver`, I got the following er…
-
### Your current environment
```text
PyTorch version: 2.3.1
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.4 LTS (x86_64)
GCC version:…
-
Hello, while trying vLLM inference I ran the inference_vllm.py script you provided without changing the code, but it fails with out of memory even with 50 GB of GPU memory. I don't understand why, or how much GPU memory is actually required. I am using vllm (0.4.2).
INFO 08-25 08:17:00 llm_engine.py:100] Initializing an LLM engine (v0.4.2) with config: mode…
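For context, a minimal sketch of how the memory footprint can be constrained when constructing a vLLM 0.4.2 engine; the model path, context length, and utilization values below are placeholders and are not the settings used in inference_vllm.py:
```python
from vllm import LLM, SamplingParams

# Hypothetical model path and limits; adjust to match inference_vllm.py.
llm = LLM(
    model="/path/to/model",       # placeholder checkpoint directory
    dtype="float16",              # half-precision weights
    gpu_memory_utilization=0.85,  # fraction of each GPU vLLM may reserve
    max_model_len=4096,           # shorter context -> smaller KV cache
    tensor_parallel_size=1,       # raise to shard the model across GPUs
)

sampling = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Hello, world"], sampling)
print(outputs[0].outputs[0].text)
```
Lowering `gpu_memory_utilization` or `max_model_len` is the usual first step when the KV-cache allocation alone exceeds the available VRAM.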
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch…
-
### Your current environment
The output of `python collect_env.py`
```text
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A…
-
When I run the pipeline
```
python run_exp.py --method_name 'naive' \
--split 'test' \
--dataset_name 'nq' \
--gpu_id '0,1,2,3'
```
I get t…
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch v…
-
### Your current environment
```text
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.10 (x86_64)
GCC version: (…
-
Whenever I try to watch a 60 fps video at 720p or higher on YouTube with hardware acceleration enabled, the video turns into a slideshow (the image freezes for about 4 to 5 seconds) while the audio play…
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch…