-
## Motivation
There is significant interest in vLLM supporting encoder/decoder models. Issues #187 and #180, for example, request encoder/decoder model support. As a result, encoder/decoder supp…
-
### Your current environment
The output of `python collect_env.py`
```text
root@newllm201:/workspace# vim collect.py
root@newllm201:/workspace# python3 collect.py
Collecting environment info…
-
### 🐛 Describe the bug
Flex attention with dynamic shapes stumbles when comparing relational expressions. I found two places where this error occurs.
One is in `flex_decoding.py`:
```
File "/usr/local/li…
-
### Your current environment
vllm==0.6.1
### Model Input Dumps
When I train Medusa, medusa0, medusa1, and medusa2 reach an accuracy of 0.95, so the training result is fine,
but when I try to deploy Medusa with vLLM, the deployment is…
-
Is the ExLlamaV2DynamicGeneratorAsync not working with speculative decoding? I hope it is something I did wrong, because I really want to use it.
```python
import sys, os
# sys.path.appe…
-
### Your current environment
```
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.4 LTS (x86_64)
GCC versio…
-
### Your current environment
Why is it important:
This is a prerequisite to the work on enabling torch.compile on vLLM; we need to be able to build vLLM with nightly so that we can iterate on chan…
-
### Describe the issue
Hi,
I'm having an issue trying to run Whisper on an A770 with XPU selected as the device, using the following environment, which works when CPU is set as the device. Any insight is ap…
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
WARNING 09-23 09:07:16 _custom_ops.py:18] Failed to import from vllm._C with …
-
### Your current environment
```text
The output of `python collect_env.py`
Collecting environment information...
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12…