feifeibear/LLMSpeculativeSampling
Fast inference from large language models via speculative decoding
415 stars, 46 forks
remove past_key_values usages, because it will lead to wrong answers #5
Closed by feifeibear 9 months ago