feifeibear
/
LLMSpeculativeSampling
Fast inference from large language models via speculative decoding
Apache License 2.0
530
stars
51
forks
Use KV cache for approx model generation
#3
Closed
feifeibear
closed
1 year ago
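The issue asks for a KV cache in the approximation (draft) model's autoregressive generation step of speculative decoding. The point of a KV cache is that each new token only needs its own key/value projections computed; the prefix's projections are reused rather than recomputed. Below is a minimal toy sketch of that idea (hypothetical scalar "attention", not the repository's actual code) contrasting cache-free generation with cached generation:

```python
# Toy illustration (hypothetical, not this repo's implementation) of why a
# KV cache speeds up the draft model's autoregressive generation.

def attend(q, keys, values):
    # Toy dot-product attention over scalar keys/values.
    scores = [q * k for k in keys]
    total = sum(scores) or 1.0
    return sum(s * v for s, v in zip(scores, values)) / total

def generate_no_cache(tokens, steps):
    # Recomputes key/value projections for the whole prefix at every
    # step: O(n^2) projection work over the generation.
    out = list(tokens)
    for _ in range(steps):
        keys = [t + 1.0 for t in out]    # toy "key" projection
        values = [t * 2.0 for t in out]  # toy "value" projection
        out.append(attend(out[-1], keys, values))
    return out

def generate_with_cache(tokens, steps):
    # Keeps a KV cache and appends one entry per new token: O(n)
    # projection work, identical outputs.
    out = list(tokens)
    keys = [t + 1.0 for t in out]
    values = [t * 2.0 for t in out]
    for _ in range(steps):
        nxt = attend(out[-1], keys, values)
        out.append(nxt)
        keys.append(nxt + 1.0)
        values.append(nxt * 2.0)
    return out
```

Both functions produce the same sequence; the cached version just avoids re-projecting the prefix each step, which is where the speedup for the draft model comes from.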