OptimalScale / LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
https://optimalscale.github.io/LMFlow/
Apache License 2.0

Experiments for speculative_decoding #680

Open · taoxunqiang opened this issue 1 year ago

taoxunqiang commented 1 year ago

"We tested the speculative inference using the first 100 inputs from alpaca test dataset as prompts. When model=gpt2-xl, draft_model=gpt2".

I want to test the speedup for my own model and draft_model. Where can I find the scripts for this?

Thank you in advance.

wheresmyhair commented 1 year ago

"We tested the speculative inference using the first 100 inputs from alpaca test dataset as prompts. When model=gpt2-xl, draft_model=gpt2".

I want to test speedup for my own model and draft_model. Where can I found the scripts for this?

Thank you in advance.

Will work on this ASAP🏃‍♂️. Just for clarification, you want to test spec inference on that specific setting (first 100 inputs from alpaca test dataset as prompts) when model=your_model and draft_model=your_draft_model, or some other settings, or something else?

taoxunqiang commented 1 year ago

"We tested the speculative inference using the first 100 inputs from alpaca test dataset as prompts. When model=gpt2-xl, draft_model=gpt2". I want to test speedup for my own model and draft_model. Where can I found the scripts for this? Thank you in advance.

Will work on this ASAP🏃‍♂️. Just for clarification, you want to test spec inference on that specific setting (first 100 inputs from alpaca test dataset as prompts) when model=your_model and draft_model=your_draft_model, or some other settings, or something else?

Thanks for the reply.
No special needs at the moment; the first setting is enough for me.
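In the meantime, a rough speedup measurement like the one described (target model vs. target model plus draft model on a fixed prompt list) can be sketched with Hugging Face transformers' assisted-generation API. This is not LMFlow's own benchmark script; the model names follow the gpt2-xl/gpt2 setting from the issue, and the prompt list is a placeholder you would replace with the first 100 alpaca test inputs:

```python
import time

def timed_generate(model, tokenizer, prompts, assistant_model=None, max_new_tokens=64):
    """Total wall-clock seconds to greedily generate a completion per prompt."""
    total = 0.0
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        start = time.perf_counter()
        # assistant_model=None is plain decoding; passing a draft model
        # turns on assisted (speculative) generation in transformers.
        model.generate(**inputs, assistant_model=assistant_model,
                       do_sample=False, max_new_tokens=max_new_tokens)
        total += time.perf_counter() - start
    return total

if __name__ == "__main__":
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Setting from the issue: target gpt2-xl, draft gpt2. Swap in your own checkpoints.
    tokenizer = AutoTokenizer.from_pretrained("gpt2-xl")
    target = AutoModelForCausalLM.from_pretrained("gpt2-xl")
    draft = AutoModelForCausalLM.from_pretrained("gpt2")

    # Placeholder prompts; replace with the first 100 alpaca test inputs.
    prompts = ["What is speculative decoding?", "Explain draft models briefly."]

    baseline = timed_generate(target, tokenizer, prompts)
    speculative = timed_generate(target, tokenizer, prompts, assistant_model=draft)
    print(f"speedup: {baseline / speculative:.2f}x")
```

Greedy decoding (`do_sample=False`) keeps the two runs comparable; with sampling enabled, acceptance rates and timings vary run to run.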

research4pan commented 10 months ago

I am wondering if the problem has been resolved. If you need anything, please feel free to let us know 😄