feifeibear / LLMSpeculativeSampling

Fast inference from large language models via speculative decoding
412 stars, 46 forks