speculative-decoding Search Results

800 results
for speculative-decoding

Best match

pytorch-labs/gpt-fast #107

Questions on Speculative Decoding in gpt-fast generate.py

I'm new to speculative decoding. When I was reading the speculative_decode code (https://github.com/pytorch-labs/gpt-fast/blob/main/generate.py#L88), I have a few questions. Could you please help an…

hxer7963 updated 2 months ago
2
NVIDIA/TensorRT-LLM #1290

v0.8.0 tag trtllm-build does not accept max_draft_len arg

### System Info TensorRT-LLM v0.8.0 branch https://github.com/NVIDIA/TensorRT-LLM/blob/v0.8.0/tensorrt_llm/commands/build.py versus main branch https://github.com/NVIDIA/TensorRT-LLM/blob/main/tenso…

ydm-amazon updated 3 weeks ago
2
OpenDevin/OpenDevin #1854

[Feature]: Implement and test speculative editing

**What problem or use case are you trying to solve?** File editing is not perfect with our current method using SWE-Agent style actions. **Do you have thoughts on the technical implementation?**…

neubig updated 3 days ago
5
sgl-project/sglang #157

Development Roadmap

## Function Calling - Frontend - Add `tools` argument in `sgl.gen`. See also guidance [tools](https://github.com/guidance-ai/guidance/blob/d1bbe1c698cbb201f89556d71193993e78c0686b/README.md?plai…

Ying1123 updated 23 hours ago
14
jzhang38/TinyLlama #186

Llama 3

Given that we have only Llama 3 70B and 8B, it would be useful to have a Tiny Llama based on the Llama 3 tokenizer so that we can use it as a drafting model for speculative decoding. Are there pla…

cduk updated 1 month ago
1
vllm-project/vllm #5825

[RFC]: Classifier-Free Guidance

### Motivation. I am one of the authors of the paper Stay On Topic with Classifier-Free Guidance ( https://openreview.net/forum?id=RiM3cl9MdK&noteId=s1BXLL1YZD ) who has been nominated as ICML'24 Spo…

Vermeille updated 2 days ago
1
vllm-project/vllm #5239

[Performance]: Speculative Performance almost same or lower

### Proposal to improve performance @LiuXiaoxuanPKU Good to see you again. Thank you for your work. I guess your working group releases SD a little by little. I'm wondering about current SD ver…

tolry418 updated 3 weeks ago
4
usersan/papers #50

SPEED: Speculative Pipelined Execution for Efficient Decodin…

## 0. Paper https://arxiv.org/abs/2310.12072 https://www.arxiv-vanity.com/papers/2310.12072/ [Coleman Hooper](https://arxiv.org/search/cs?searchtype=author&query=Hooper,+C), [Sehoon Kim](https://arx…

tera1k updated 8 months ago
2
QwenLM/Qwen2 #685

Potential use cases for Qwen-0.5B

What are some of the intended use cases for the 0.5B model? There are not many other similarly sized models, nor is there much hype around them. Though general audience seems to love th…

Tejaswgupta updated 5 days ago
2
karpathy/nanoGPT #479

Implement multi-token prediction option for models

Per the [recent paper from Meta](https://arxiv.org/abs/2404.19737), it appears that models that predict multiple future tokens can exhibit significantly greater sample efficiency than models trained o…

tmostak updated 2 weeks ago
2
