Leo-yang-1020 opened 6 days ago
We'll support `do_sample` in July. Currently, the team is rushing for the June release.
`top_k=1` is equivalent to `do_sample=False`.
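A toy sketch of why that equivalence holds (this is an illustration, not lmdeploy's actual sampling code): with `top_k=1`, the candidate pool collapses to the single highest-logit token, so "sampling" becomes deterministic and matches greedy decoding.

```python
import math
import random

def sample_top_k(logits, k, rng=random.Random(0)):
    """Toy top-k sampling: keep the k highest logits, sample among them
    with softmax-proportional weights."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    weights = [math.exp(logits[i]) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]

def greedy(logits):
    """do_sample=False behaviour: always pick the argmax token."""
    return max(range(len(logits)), key=lambda i: logits[i])

logits = [0.1, 2.5, -1.0, 0.7]
# With k=1 the pool is just the argmax, so sampling equals greedy decoding.
assert sample_top_k(logits, k=1) == greedy(logits)
```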
Thanks a lot!
Should our API design align with Transformers or the OpenAI API?

ref:
- https://platform.openai.com/docs/api-reference/chat/create#chat-create-top_p
- https://platform.openai.com/docs/api-reference/chat/create#chat-create-temperature
The OpenAI API does not have the `top_k` and `do_sample` parameters.
For the `pipeline` API, we would like to align with `transformers`.
Motivation
When using lmdeploy for inference, we'd sometimes like to set `do_sample=False`, but according to the official documentation there is no such sampling config. Could this be added, just like AutoModel's generation config? e.g.:

```python
generation_config = dict(
    num_beams=1,
    max_new_tokens=512,
    do_sample=False,
)
```
Related resources
No response
Additional context
No response