romsto / Speculative-Decoding

Implementation of the paper "Fast Inference from Transformers via Speculative Decoding" (Leviathan et al., 2023).
MIT License
11 stars, 0 forks
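
For readers unfamiliar with the technique, the accept/reject rule at the core of speculative sampling can be sketched as follows. This is a minimal illustration, not code from this repository: the function name `speculative_step` and its arguments are hypothetical. The distributions are passed as plain lists of probabilities, whereas a real implementation would take them from the draft and target models' logits. Following the paper's notation, `p` is the target distribution and `q` is the draft distribution.

```python
import random


def speculative_step(p, q, rng):
    """One accept/reject step of speculative sampling (illustrative sketch).

    p:   target-model distribution over the vocabulary (list of probabilities)
    q:   draft-model distribution over the same vocabulary
    rng: a random.Random instance
    Returns (token, accepted).
    """
    vocab = range(len(p))
    # The draft model proposes a token x ~ q.
    x = rng.choices(vocab, weights=q, k=1)[0]
    # Accept the proposal with probability min(1, p[x] / q[x]).
    if rng.random() < min(1.0, p[x] / q[x]):
        return x, True
    # On rejection, resample from the residual distribution, which is
    # proportional to max(0, p - q). random.choices normalizes the weights.
    residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
    y = rng.choices(vocab, weights=residual, k=1)[0]
    return y, False
```

Under this rule the emitted token follows `p` regardless of how well `q` approximates it; a closer draft distribution only raises the acceptance rate. In the full method the draft model proposes several tokens at once and the target model scores them in a single forward pass, which this single-step sketch leaves out.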