-
### Issue
There was a recent thread/blog post about cursor.sh's 'fast apply' changes:
- https://x.com/amanrsanger/status/1790947733899203027
- https://cursor.sh/blog/instant-apply
- > Ou…
-
### Your current environment
PyTorch version: 2.2.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.4 LTS (x86_64)
GCC version: (U…
-
### System Info
TensorRT-LLM: v0.9.0
tensorrtllm_backend: v0.9.0
### Who can help?
@kaiyux
### Information
- [ ] The official example scripts
- [ ] My own modified scripts
### Tasks…
-
I have some questions about the structure of the custom mask for the lookahead and verify branches [as described in the blog](https://lmsys.org/blog/2023-11-21-lookahead-decoding/#lookahead-and-verify-in-the…
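For concreteness, here is a minimal sketch of how I read the combined mask layout (my own simplified illustration, not the exact mask from the blog), assuming a committed prefix of `prefix_len` tokens, a single lookahead branch of `lookahead_len` tokens, and `num_ngrams` verification n-grams of `ngram_len` tokens each; every speculative token sees the whole prefix, and each branch is causal only within itself:

```python
import torch

def build_branch_mask(prefix_len: int, lookahead_len: int,
                      num_ngrams: int, ngram_len: int) -> torch.Tensor:
    """Boolean attention mask (True = may attend). Simplified: the real
    lookahead mask also encodes the 2-D window of Jacobi trajectories."""
    total = prefix_len + lookahead_len + num_ngrams * ngram_len
    mask = torch.zeros(total, total, dtype=torch.bool)

    # Committed prefix: ordinary causal attention.
    mask[:prefix_len, :prefix_len] = torch.tril(
        torch.ones(prefix_len, prefix_len, dtype=torch.bool))

    # Every speculative token attends to the full prefix.
    mask[prefix_len:, :prefix_len] = True

    # Lookahead branch: causal within itself, blind to the verify branches.
    lo = prefix_len
    hi = lo + lookahead_len
    mask[lo:hi, lo:hi] = torch.tril(
        torch.ones(lookahead_len, lookahead_len, dtype=torch.bool))

    # Verification branches: each n-gram is causal within itself only.
    for g in range(num_ngrams):
        s = hi + g * ngram_len
        mask[s:s + ngram_len, s:s + ngram_len] = torch.tril(
            torch.ones(ngram_len, ngram_len, dtype=torch.bool))
    return mask
```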
-
### Your current environment
Why is it important:
This is a prerequisite to the work on enabling torch.compile on vllm; we need to be able to build vllm with nightly so that we can iterate on chan…
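As a reference point for what this unblocks, a minimal torch.compile usage looks like the following (a generic PyTorch 2.x sketch, not vllm code; `TinyMLP` is a made-up module):

```python
import torch

class TinyMLP(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.fc = torch.nn.Linear(16, 16)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.fc(x))

# torch.compile traces and optimizes the module; newer compiler features
# land first in nightly builds, which is why building against nightly matters.
model = torch.compile(TinyMLP())
print(model(torch.randn(4, 16)).shape)
```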
-
### System Info
transformers==4.39.1
python==3.8.17
torch==2.0.1+cpu
### Who can help?
@sanchit-gandhi
### Information
- [ ] The official example scripts
- [ ] My own modified scr…
-
Hi FlexFlow team,
I used the methods mentioned in #1099 to test the latency (GPU: RTX-4090), but I got a confusing result:
1) LLaMA-7B + 1 SSM (llama-160M), latency: 25.1 s
2) LLaMA-7B (without SSMs), la…
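For reference, a minimal sketch of how such an end-to-end latency number can be measured, with a hypothetical `generate` callable standing in for the actual FlexFlow inference entry point:

```python
import time

def measure_latency(generate, prompt: str, n_runs: int = 5) -> float:
    """Average end-to-end generation latency in seconds.
    `generate` is a placeholder for the real inference call."""
    generate(prompt)                      # warm-up run, excluded from timing
    start = time.perf_counter()
    for _ in range(n_runs):
        generate(prompt)
    return (time.perf_counter() - start) / n_runs
```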
-
### System Info
Python 3.10.11
transformers 4.40.0
torch 2.0.1
Linux version 4.15.0-55-generic x86_64
### Who can help?
@ArthurZucker @gante
### Information
- [ ] The official example scripts
…
-
### Your current environment
Running the vllm OpenAI docker container on a single A5000 GPU on Runpod.
Initialisation settings:
`--host 0.0.0.0 --model microsoft/Phi-3-small-8k-instruct --tensor-pa…
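For reference, requests go through the OpenAI-compatible API; a minimal client sketch, assuming the container exposes the default port 8000 on localhost:

```python
from openai import OpenAI

# Unless the server was started with an API key, any placeholder key works.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="microsoft/Phi-3-small-8k-instruct",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```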
-
To the best of my knowledge, speculative decoding does not change the decoding result when using greedy decoding. However, I noticed that the rouge2 metrics of 'base' and 'essg' may be different in th…
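To make the losslessness argument concrete, here is a minimal sketch of the greedy accept rule (my own illustration, not the repo's code): a draft token is kept only if it equals the target model's argmax at that position; on the first mismatch the target's token is emitted instead and the rest of the draft is discarded, so the committed sequence matches plain greedy decoding:

```python
import torch

def greedy_verify(draft_tokens, target_logits):
    """For each draft position i, target_logits[i] holds the target model's
    logits given the prefix plus draft_tokens[:i]."""
    accepted = []
    for tok, logits in zip(draft_tokens, target_logits):
        target_tok = int(torch.argmax(logits))
        if tok != target_tok:
            accepted.append(target_tok)  # correct the mismatch and stop
            break
        accepted.append(tok)
    return accepted
```

Given that rule, any rouge2 gap between 'base' and 'essg' under greedy decoding would point to something other than acceptance itself, e.g. sampling settings or numerical differences between batched and unbatched forward passes.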