OptimalScale / LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
https://optimalscale.github.io/LMFlow/
Apache License 2.0

Experiments for speculative_decoding #680

Open taoxunqiang opened 10 months ago

taoxunqiang commented 10 months ago

"We tested the speculative inference using the first 100 inputs from alpaca test dataset as prompts. When model=gpt2-xl, draft_model=gpt2".

I want to test the speedup for my own model and draft model. Where can I find the scripts for this?

Thank you in advance.

wheresmyhair commented 10 months ago

"We tested the speculative inference using the first 100 inputs from alpaca test dataset as prompts. When model=gpt2-xl, draft_model=gpt2".

I want to test speedup for my own model and draft_model. Where can I found the scripts for this?

Thank you in advance.

Will work on this ASAP🏃‍♂️. Just for clarification, you want to test spec inference on that specific setting (first 100 inputs from alpaca test dataset as prompts) when model=your_model and draft_model=your_draft_model, or some other settings, or something else?

taoxunqiang commented 10 months ago

"We tested the speculative inference using the first 100 inputs from alpaca test dataset as prompts. When model=gpt2-xl, draft_model=gpt2". I want to test speedup for my own model and draft_model. Where can I found the scripts for this? Thank you in advance.

Will work on this ASAP🏃‍♂️. Just for clarification, you want to test spec inference on that specific setting (first 100 inputs from alpaca test dataset as prompts) when model=your_model and draft_model=your_draft_model, or some other settings, or something else?

Thanks for the reply.
I have no special requirements at the moment; the first setting you described is enough for me.
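While waiting for an official script, a rough way to measure the speedup yourself is to time plain generation and speculative (assisted) generation over the same prompts and compare token throughput. The sketch below is a generic timing harness, not LMFlow's implementation; the commented usage assumes Hugging Face transformers' `assistant_model` argument to `generate` (available in recent versions) and uses `gpt2-xl`/`gpt2` as placeholder model names — swap in your own model and draft model, and feed it the first 100 alpaca prompts.

```python
import time


def benchmark_generation(generate_fn, prompts):
    """Time a generation callable over a list of prompts.

    generate_fn(prompt) should run generation for one prompt and return
    the number of new tokens it produced.
    Returns (elapsed_seconds, tokens_per_second).
    """
    total_new_tokens = 0
    start = time.perf_counter()
    for prompt in prompts:
        total_new_tokens += generate_fn(prompt)
    elapsed = time.perf_counter() - start
    return elapsed, (total_new_tokens / elapsed if elapsed > 0 else float("inf"))


# Hypothetical usage with Hugging Face transformers (model names are
# placeholders; adapt to your own model / draft_model pair):
#
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("gpt2-xl")
#   target = AutoModelForCausalLM.from_pretrained("gpt2-xl")
#   draft = AutoModelForCausalLM.from_pretrained("gpt2")
#
#   def spec_generate(prompt):
#       ids = tok(prompt, return_tensors="pt").input_ids
#       out = target.generate(ids, assistant_model=draft, max_new_tokens=128)
#       return out.shape[-1] - ids.shape[-1]  # count of new tokens
#
#   def plain_generate(prompt):
#       ids = tok(prompt, return_tensors="pt").input_ids
#       out = target.generate(ids, max_new_tokens=128)
#       return out.shape[-1] - ids.shape[-1]
#
#   _, tps_plain = benchmark_generation(plain_generate, prompts)
#   _, tps_spec = benchmark_generation(spec_generate, prompts)
#   print(f"speedup: {tps_spec / tps_plain:.2f}x")
```

Note that with sampling enabled, outputs (and thus token counts) can differ between runs, so greedy decoding gives the cleanest apples-to-apples timing.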

research4pan commented 8 months ago

I am wondering if the problem has been resolved. If you need anything, please feel free to let us know 😄