-
First of all, thank you for the great work!
Is there any plan to support a paged KV cache in non-contiguous memory, for instance in flash_attn_with_kvcache?
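For context, a paged KV cache stores key/value entries in fixed-size physical blocks that need not be contiguous, with a block table mapping logical token positions to physical blocks. The following is a minimal pure-Python sketch of that indexing scheme; the class and names are illustrative, not flash-attn's actual interface.

```python
# Illustrative sketch of paged KV-cache indexing (in the spirit of
# vLLM's PagedAttention). Physical blocks are non-contiguous; a block
# table maps logical block indices to physical block indices.
# This is NOT the flash_attn_with_kvcache API, just the core idea.

BLOCK_SIZE = 4  # tokens per physical block

class PagedKVCache:
    def __init__(self):
        self.blocks = []        # physical blocks, each a list of (k, v) entries
        self.block_table = []   # logical block index -> physical block index
        self.seq_len = 0        # logical sequence length

    def append(self, k, v):
        """Append one token's (k, v), allocating a new block on a boundary."""
        if self.seq_len % BLOCK_SIZE == 0:
            self.block_table.append(len(self.blocks))
            self.blocks.append([])
        phys = self.block_table[self.seq_len // BLOCK_SIZE]
        self.blocks[phys].append((k, v))
        self.seq_len += 1

    def get(self, pos):
        """Resolve a logical position to its (k, v) via the block table."""
        phys = self.block_table[pos // BLOCK_SIZE]
        return self.blocks[phys][pos % BLOCK_SIZE]
```

An attention kernel with native paged-KV support performs this gather internally per query, which is why the cache blocks themselves never need to be contiguous in memory.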
-
**Describe the bug**
I am unable to profile my workload.
**Development Environment:**
- Linux Distribution: Docker Container running Ubuntu 22.04
- Omniperf Version: 2.0.1 (release)
- GPU: …
-
This subject has come up before in various forms:
https://github.com/bos/aeson/issues/227
https://github.com/bos/aeson/issues/181
The thing is that Aeson can't make a distinction between…
-
Hi, I encountered an out-of-workspace-memory error when trying to load the gemma-2-27b model using vllm with the flashinfer backend; the error appears to originate in flashinfer. I printed out the GPU mem…
-
### System Info
CUDA 12.3
Python 3.11.5
CentOS 7
3× NVIDIA P40 GPUs
### Who can help?
_No response_
### Information
- [X] The official example scripts
- [ ] My own mo…
-
I'm keen on adding [speculative decoding](https://arxiv.org/abs/2211.17192) to outlines.
Is this something that is being worked on? Otherwise I would be happy to submit a PR but I'd need some advic…
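As background on what the feature involves: in speculative decoding, a cheap draft model proposes a short run of tokens, which the target model then verifies in a single pass, accepting the longest matching prefix. Below is a toy greedy sketch with stand-in models; the function names and the toy next-token rules are assumptions for illustration, not the outlines API or the paper's exact rejection-sampling scheme.

```python
# Toy sketch of greedy speculative decoding. Both "models" here are
# trivial deterministic next-token functions, purely for illustration.

def draft_next(seq):
    # Cheap draft model (toy rule: next token is last token + 1 mod 10).
    return (seq[-1] + 1) % 10

def target_next(seq):
    # Expensive target model (toy rule: same, except it emits 0 after a 5).
    return 0 if seq[-1] == 5 else (seq[-1] + 1) % 10

def speculative_step(seq, k=4):
    """Draft k tokens greedily, then verify them against the target.

    Returns the accepted tokens. On a mismatch the target's own token is
    taken, so at least one token of progress is made per step; if every
    draft token is accepted, one bonus target token is appended.
    """
    draft, s = [], list(seq)
    for _ in range(k):
        t = draft_next(s)
        draft.append(t)
        s.append(t)

    accepted, s = [], list(seq)
    for t in draft:
        expected = target_next(s)
        if t == expected:
            accepted.append(t)
            s.append(t)
        else:
            accepted.append(expected)   # take the target's token and stop
            return accepted
    accepted.append(target_next(s))     # bonus token: all drafts accepted
    return accepted

print(speculative_step([3], k=4))  # → [4, 5, 0]
```

Here the first two draft tokens match the target and are accepted, the third diverges, and the target's token is substituted; in the real algorithm the accept/reject decision is probabilistic over the two models' distributions rather than an exact greedy match.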
-
### What happened?
Ever since I installed the new update, I can't right-click on any pages in Waterfox!
### Reproducible?
- [ ] I have checked that this issue cannot be reproduced on Mozilla Firef…
-
Great work!
I tried your [example](https://github.com/SafeAILab/EAGLE#:~:text=llama%2D2%2Dchat%5D-,With%20Code,-You%20can%20use) for llama-7b-chat and changed the tree structure in choices.py into …
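For readers unfamiliar with the format: draft trees in Medusa/EAGLE-style decoding are typically specified as a list of paths from the root, where each tuple gives the chain of top-k child indices. The values below are purely illustrative, not the actual contents of choices.py.

```python
# Illustrative shape of a draft-tree specification: each tuple is a path
# from the root, given as top-k child indices at each depth. Hypothetical
# values, not the repository's actual tree.
tree_choices = [
    (0,),       # top-1 child of the root
    (1,),       # top-2 child of the root
    (0, 0),     # top-1 child of the (0,) node
    (0, 1),
    (1, 0),
]

# Every non-root path's parent prefix must itself be in the tree.
valid = all(len(p) == 1 or p[:-1] in tree_choices for p in tree_choices)

# The tree's depth is the length of the longest path.
depth = max(len(p) for p in tree_choices)
print(valid, depth)  # → True 2
```

Changing the tree shape trades off draft breadth against depth, which is usually what tuning these structures is about.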
-
ipex-llm/python/dev/benchmark/all-in-one/run-srp.sh
Running Llama-2-7B-Chat-hf with bigdl_ipex_bf16 fails on an Intel(R) Xeon(R) w9-3475X
config.yaml
```yaml
repo_id:
# - 'THUDM/chatglm2-6b'
- 'meta-l…
-
### Before submitting a bug report
- [X] I updated to the latest version of Multi-Account Container and tested if I can reproduce the issue
- [X] I searched for existing reports to see if it hasn't a…