-
Right now, all media loading is done in parallel, which isn't ideal and can result in unnecessary dropped frames (observed by @aubilenon).
In an ideal world:
- high priority: media frames that wil…
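The priority scheme described above could be sketched with a simple priority queue. This is an illustrative sketch only; `MediaLoader`, `HIGH`, and `LOW` are hypothetical names, not the project's actual API:

```python
import heapq

# Lower value = higher priority; these labels are assumptions for the sketch.
HIGH, LOW = 0, 1

class MediaLoader:
    """Drain high-priority frame loads first instead of starting
    every media load in parallel (which can drop frames)."""

    def __init__(self):
        self._queue = []
        self._counter = 0  # tie-breaker so equal priorities stay FIFO

    def submit(self, priority, task):
        heapq.heappush(self._queue, (priority, self._counter, task))
        self._counter += 1

    def run(self):
        # Pop tasks in priority order; high-priority loads always run
        # before any low-priority ones that were submitted earlier.
        results = []
        while self._queue:
            _, _, task = heapq.heappop(self._queue)
            results.append(task())
        return results
```

A real implementation would run tasks on a bounded worker pool rather than serially, but the ordering guarantee is the point of the sketch.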
-
### News
- Conferences
	- AAAI 2023: Washington DC (Feb 7-14)
- [Google Cloud partners with Anthropic to counter the MS + OpenAI pairing?](https://www.googlecloudpresscorner.com/2023-02-03-Anthropic-Forges-Partnership…
-
### System Info
Python 3.10.11
transformers 4.40.0
torch 2.0.1
Linux version 4.15.0-55-generic x86_64
### Who can help?
@ArthurZucker @gante
### Information
- [ ] The official example scripts
…
-
To the best of my knowledge, speculative decoding does not change the decoding result when using greedy decoding. However, I noticed that the rouge2 metrics of 'base' and 'essg' may be different in th…
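The greedy-equivalence argument can be made concrete with a minimal sketch of the verification step. This is a hypothetical helper for illustration, not the transformers implementation: with greedy decoding, a draft token is accepted only if it equals the target model's argmax at that position, and the first mismatch is replaced by the target's own choice, so the output token sequence is identical to target-only greedy decoding.

```python
import numpy as np

def greedy_speculative_step(draft_tokens, target_logits):
    """Verify draft tokens against the target model under greedy decoding.

    Accept each draft token while it matches the target's argmax; on the
    first mismatch, emit the target's own argmax token and stop.
    """
    accepted = []
    for i, tok in enumerate(draft_tokens):
        target_choice = int(np.argmax(target_logits[i]))
        if tok == target_choice:
            accepted.append(tok)
        else:
            accepted.append(target_choice)  # correction token from the target
            break
    return accepted
```

Because every emitted token is the target's argmax, any rouge2 difference between runs should come from something other than the greedy decoding rule itself (e.g. numerics or batching).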
-
### System Info
transformers==4.39.1
python==3.8.17
torch==2.0.1+cpu
### Who can help?
@sanchit-gandhi
### Information
- [ ] The official example scripts
- [ ] My own modified scr…
-
### Your current environment
```
Collecting environment information...
INFO 05-23 16:19:36 pynccl.py:58] Loading nccl from library librccl.so.1
/opt/conda/envs/py_3.9/lib/python3.9/site-packages/t…
-
### 🚀 The feature, motivation and pitch
Currently, vllm with Speculative Decoding requires that the draft model and target model have the same vocab size. However, the target model may have a large…
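One common workaround for mismatched vocab sizes, sketched below with a hypothetical helper (this is not vLLM's actual code), is to pad the draft model's logits up to the target vocab size with `-inf`, so the extra target-only tokens receive zero probability from the draft:

```python
import numpy as np

def pad_draft_logits(draft_logits, target_vocab_size):
    """Pad draft logits to the target vocab size with -inf so that
    token ids outside the draft vocab get zero probability after softmax."""
    draft_vocab = draft_logits.shape[-1]
    if draft_vocab >= target_vocab_size:
        # Draft vocab is already large enough; truncate to match.
        return draft_logits[..., :target_vocab_size]
    pad_width = target_vocab_size - draft_vocab
    pad = np.full(draft_logits.shape[:-1] + (pad_width,), -np.inf)
    return np.concatenate([draft_logits, pad], axis=-1)
```

This only makes the shapes compatible; it assumes the two tokenizers agree on the shared id range, which is the harder requirement in practice.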
-
After training, the output folder only contains files like `meta_model_0.pt`. If I try to use the vllm server to serve this model like this: `python -m vllm.entrypoints.openai.api_server --model finetuned_…
-
- I changed the [batch size](https://github.com/flexflow/FlexFlow/blob/inference/inference/models/opt.cc#L71) to 2.
- Then I used the command below to run opt-6.7b:
`../build/inference/spec…
-
The following program encodes that same ASCII string using a naive approach and using the actual `UTF8.encode()`. The naive approach is about 3 times faster. Could UTF8 be optimized to provide better pe…
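The report refers to a `UTF8.encode()`-style API; the Python sketch below (illustrative only, not the benchmark from the report) shows why an ASCII-only fast path can beat a general encoder: for code points below 128, each character is exactly one byte, so the multi-byte branching of full UTF-8 is unnecessary work.

```python
def naive_ascii_encode(s: str) -> bytes:
    """ASCII fast path: every code point below 128 maps to one byte,
    so no multi-byte sequence logic is needed."""
    return bytes(ord(c) for c in s)

def general_utf8_encode(s: str) -> bytes:
    """General UTF-8 encoding via the standard library, which must
    branch on code-point ranges to emit 1-4 byte sequences."""
    return s.encode("utf-8")

# For pure-ASCII input, both produce identical bytes; the general
# encoder just pays for checks the naive path skips.
```

An optimized encoder can get the best of both by scanning for the first non-ASCII character and memcpy-ing the ASCII prefix before falling back to the general path.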