dust-tt / llama-ssp

Experiments on speculative sampling with Llama models

Draft model for 7B model? #1

Open bryanhpchiang opened 1 year ago

bryanhpchiang commented 1 year ago

If we wanted to do speculative sampling on the 7B Llama model, do you have any recommendations for which (non-Llama) draft model to use? Thanks!

ekmett commented 1 year ago

The paper itself recommends considering something as simple as a bigram model, or even a scheme that looks for the most recent sequence of words in the prompt elsewhere in the context and carries on with the longest matching completion from there as the guess. The draft model itself incurs weight-retrieval overhead of its own, so you can speculate while you speculate.
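As a rough illustration of the second idea, here is a minimal sketch of a prompt-lookup draft proposer (not code from this repo; the function name, token-ID lists, and parameters are hypothetical): it matches the most recent n-gram against earlier context and copies the tokens that followed it as the speculative draft.

```python
from typing import List


def prompt_lookup_draft(tokens: List[int], max_ngram: int = 3, num_draft: int = 5) -> List[int]:
    """Propose draft tokens by finding the most recent earlier occurrence of the
    context's trailing n-gram and copying the tokens that followed it."""
    for n in range(max_ngram, 0, -1):
        if len(tokens) <= n:
            continue
        suffix = tokens[-n:]
        # Scan backwards for the nearest earlier occurrence of the trailing n-gram.
        for start in range(len(tokens) - n - 1, -1, -1):
            if tokens[start:start + n] == suffix:
                continuation = tokens[start + n:start + n + num_draft]
                if continuation:
                    return continuation
    return []  # No match: skip speculation for this step.
```

The drafted tokens would then be verified by the 7B model in one forward pass and accepted or rejected under the usual speculative sampling rule, just as with a learned draft model, but with essentially zero drafting cost.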