speculative-decoding Search Results

1000+ results for speculative-decoding

Best match


vllm-project/vllm #5111

[Bug]: vLLM embeddings example code doesn't work

### Your current environment ```text Collecting environment information... PyTorch version: 2.3.0+cu121 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to build PyTorch: N/A …

orionw updated 4 months ago
2
vllm-project/vllm #6126

[Bug]: RuntimeError: No suitable kernel. h_in=16 h_out=7392 …

### Your current environment ```text The output of `python collect_env.py` Collecting environment information... PyTorch version: 2.3.0+cu121 Is debug build: False CUDA used to build PyTorch: 12…

JJJJerry updated 2 months ago
11
ggerganov/llama.cpp #4226

lookahead-prompt : add example

Add an example implementing the "Prompt Lookup Decoding" technique: https://github.com/apoorvumang/prompt-lookup-decoding This should be a great exercise for people looking to become familiar wi…

ggerganov updated 9 months ago
5
linux-surface/linux-surface #51

Surface Laptop 3 AMD Version Touchscreen Not Working

Hi, it's me again. I don't know if you still remember me. I'm the guy who reported touchscreen fault 4 months ago with Surface Laptop 2. This time, my new Surface Laptop 3 AMD Ryzen version w…

swinzy updated 1 month ago
54
golang/go #19623

proposal: spec: change int to be arbitrary precision

An idea that has been kicking around for years, but never written down: The current definition of `int` (and correspondingly `uint`) is that it is either 32 or 64 bits. This causes a variety of pro…

robpike updated 1 month ago
215
dottxt-ai/outlines #638

Check why llama-cpp-python fails some tests for llama-cpp-py…

Some llama integration tests (e.g. `test_llamacpp_various_regexes`) fail for llama-cpp-python >= 0.2.38. Investigate this further.

dtiarks updated 7 months ago
3
ValveSoftware/Proton #5030

Halo Infinite (1240440)

# Compatibility Report - Name of the game with compatibility issues: Halo Infinite - Steam AppID of the game: 1240440 (Believed to be this ID) Note: Creating a preliminary post regarding this …

CDAGaming updated 4 weeks ago
829
vllm-project/vllm #5727

[Bug]: Two V100 server with a total of 16GPU running Distrib…

### Your current environment PyTorch version: 2.3.0+cu121 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to build PyTorch: N/A OS: Ubuntu 22.04.3 LTS (x86_64) GCC version: …

warlockedward updated 3 months ago
8
vllm-project/vllm #4381

[Bug]: Chunked prefill doesn't seem to work when --kv-cache-…

### Your current environment H100 (but I believe it happens in any machine) ### 🐛 Describe the bug ``` --enable-chunked-prefill --num-max-batched-tokens 2048 --kv-cache-dtype "fp8" ``` S…

rkooo567 updated 1 month ago
11
vllm-project/vllm #8819

[Bug]: Later version have degradation based on `vllm:time_to…

### Your current environment The output of `python collect_env.py` ```text GPU NVIDIA RTX 5880 ``` ### Model Input Dumps _No response_ ### 🐛 Describe the bug I've noticed a…

oandreeva-nv updated 5 days ago
4

