-
Hi, I am trying to implement speculative decoding from [Accelerating Large Language Model Decoding with Speculative Sampling](https://arxiv.org/abs/2302.01318), and below is the code snippet:
`…
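For context, the core accept/reject rule from that paper can be sketched in a few lines. The following is a minimal NumPy illustration (the function name, array shapes, and toy inputs are my own assumptions, not the snippet above): each drafted token `x` is accepted with probability `min(1, p(x)/q(x))`, and on the first rejection a corrective token is resampled from the normalized residual `max(0, p - q)`.

```python
import numpy as np

rng = np.random.default_rng(0)

def speculative_step(p_target, q_draft, draft_tokens):
    """Accept/reject drafted tokens as in Chen et al. (2023).

    p_target: (K+1, V) target-model probabilities at each position
    q_draft:  (K, V) draft-model probabilities used to sample draft_tokens
    Returns accepted tokens plus one corrective (or bonus) token.
    """
    out = []
    for i, x in enumerate(draft_tokens):
        p, q = p_target[i], q_draft[i]
        # Accept x with probability min(1, p(x) / q(x))
        if rng.random() < min(1.0, p[x] / q[x]):
            out.append(x)
        else:
            # Rejected: resample from the residual max(0, p - q), renormalized
            residual = np.maximum(p - q, 0.0)
            residual /= residual.sum()
            out.append(rng.choice(len(p), p=residual))
            return out
    # All drafts accepted: sample one bonus token from the final target dist
    out.append(rng.choice(len(p_target[-1]), p=p_target[-1]))
    return out
```

When the draft and target distributions are identical, every drafted token is accepted, so the step returns K accepted tokens plus one bonus token.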
-
I've been too busy, both at work and personally, to post updates... (sorry
-
### Your current environment
My environment setup involving two 8xH100 nodes is detailed in https://github.com/vllm-project/vllm/issues/6775; therefore, I will omit it here for brevity.
### 🐛 De…
-
### Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
### Is there an existing ans…
-
### Your current environment
vLLM: v0.5.4
```
llm = LLM(model="unsloth/Qwen2-7B-Instruct-bnb-4bit", dtype='bfloat16',
          gpu_memory_utilization=0.95, quantization="bitsandbytes", load_for…
-
```
http://static.electroteque.org.s3.amazonaws.com/download/apple-osmf.zip
Here is the refactored code, now packaged as a library, with a working example of the m3u8
parsing and multi-bitrate setup. I'm not s…
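As a rough illustration of what the m3u8 parsing involves, here is a minimal sketch of reading a master playlist's bitrate variants. The playlist text and function name are invented for illustration and are not taken from the linked zip: a master playlist lists one `#EXT-X-STREAM-INF` tag (carrying a `BANDWIDTH` attribute) per variant, followed by that variant's URI on the next line.

```python
import re

def parse_master_playlist(text):
    """Return [(bandwidth, url), ...] for each #EXT-X-STREAM-INF entry."""
    variants = []
    lines = text.strip().splitlines()
    for i, line in enumerate(lines):
        if line.startswith("#EXT-X-STREAM-INF"):
            m = re.search(r"BANDWIDTH=(\d+)", line)
            bandwidth = int(m.group(1)) if m else 0
            # The variant's URI is the line immediately after the tag
            variants.append((bandwidth, lines[i + 1]))
    # Sort lowest to highest bitrate for multi-bitrate switching
    return sorted(variants)

sample = """#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=400000
low/index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=800000
high/index.m3u8"""

print(parse_master_playlist(sample))
# [(400000, 'low/index.m3u8'), (800000, 'high/index.m3u8')]
```

Sorting by bandwidth gives the player an ordered ladder to switch through when adapting bitrate.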
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTor…
-
Converting the Qwen2 model using the command `python -c 'import xfastertransformer as xft; xft.Qwen2Convert().convert("/tmp/models/Qwen2-1.5B-Instruct", "/tmp/xf_models/qweb2_1.5b_xf")'`
An error is rep…
-
### What happened?
Hello, llama.cpp experts! Thank you for creating such an amazing LLM Inference system. 😁
**However, while using this system, I encountered unusual results when checking the spe…