feifeibear
/
LLMSpeculativeSampling
Fast inference from large language models via speculative decoding
412
stars
46
forks
issues
Speculative sampling is even slower than the target model
#28
pppyb
opened
2 weeks ago
1
output logits not match. question about decoding when draft model and target model is the same.
#27
66RING
opened
2 months ago
4
KV-cache for Deepmind's speculative sampling
#26
briancpark
closed
4 months ago
1
How to use Tensor Core to accelerate Speculative Sampling?
#25
zhaoyang-star
closed
1 month ago
1
does it support batchsize > 1
#24
GuoYi0
opened
8 months ago
2
With a fine-tuned LLM, latency increases and there is no speedup
#23
hynnn
closed
2 months ago
2
a change in the output of the model
#22
wenxin-zhu
closed
9 months ago
0
Question about specifying the device
#21
pendulum445
closed
9 months ago
3
add share_gpt benchmarking results
#20
feifeibear
closed
9 months ago
0
fix random seed setting bugs and support multiple GPUs
#19
feifeibear
closed
9 months ago
0
multi-gpu inference
#18
cliangyu
closed
9 months ago
2
add profiler
#17
feifeibear
closed
9 months ago
0
llama 1B performance
#16
cliangyu
closed
9 months ago
4
fix global vars bug
#15
feifeibear
closed
9 months ago
0
correct benchmark results display
#14
feifeibear
closed
9 months ago
0
test llama model
#13
feifeibear
closed
9 months ago
0
test llama model
#12
feifeibear
closed
9 months ago
0
add server
#11
feifeibear
closed
9 months ago
0
add time benchmarking and organize the directory better
#10
feifeibear
closed
9 months ago
0
add requirements.txt
#9
feifeibear
closed
9 months ago
0
fix the random seed argument bug
#8
feifeibear
closed
9 months ago
0
Refactor the KV Cache logic
#7
feifeibear
closed
9 months ago
0
Add KVCache for the google's version
#6
feifeibear
closed
9 months ago
0
remove past_key_values usages, because it will lead to wrong answers
#5
feifeibear
closed
9 months ago
0
add KVCache optimization for AutoRegressive Sampling
#4
feifeibear
closed
10 months ago
0
use kv cache for approx model generates
#3
feifeibear
closed
10 months ago
0
Parallel question
#2
tszdanger
closed
10 months ago
3
Speculative sampling results do not match the target model's output
#1
lawo123
closed
9 months ago
5