lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Apache License 2.0

Repetition Penalty #2910

Open joeriakkerman opened 8 months ago

joeriakkerman commented 8 months ago

Hi there,

I've come to the conclusion that the field repetition_penalty, which can be found here, is not being used. However, this field is supported by the vllm module.

When I check out the endpoint that uses this request model, I don't see this field being mapped within get_gen_params. Is there a reason for this? I'd like to test this field, but it doesn't seem to be supported, and the value stays at the default of 1.0.
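For illustration, here is the kind of mapping I would expect, as a minimal sketch. The function and parameter names are my own assumptions modeled on the request fields described above, not the actual FastChat source:

```python
from typing import Any, Dict, Optional

# Hypothetical sketch of forwarding the request field to the worker.
# Names are assumptions based on the request model discussed above,
# not the actual get_gen_params implementation.
def build_gen_params(
    model_name: str,
    prompt: str,
    temperature: float = 0.7,
    top_p: float = 1.0,
    repetition_penalty: float = 1.0,  # the field this issue is about
    max_new_tokens: Optional[int] = 256,
) -> Dict[str, Any]:
    return {
        "model": model_name,
        "prompt": prompt,
        "temperature": temperature,
        "top_p": top_p,
        # Without a line like this, the worker falls back to its
        # default of 1.0, which matches the behavior I'm seeing.
        "repetition_penalty": repetition_penalty,
        "max_new_tokens": max_new_tokens,
    }
```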

Our own chat model keeps repeating itself over and over, so I want to check whether this field will help us out. Looking at the vllm code, the docs state the following:

repetition_penalty: Float that penalizes new tokens based on whether they appear in the prompt and the generated text so far. Values > 1 encourage the model to use new tokens, while values < 1 encourage the model to repeat tokens.

So I guess this field is just what we need, but I can't seem to give it any value other than 1.0. Is there a reason this field is not being mapped (anymore?)?
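For comparison, the parameter can be exercised against vLLM directly to confirm it behaves as documented, which would suggest the gap is in the FastChat mapping. A minimal sketch (the model name is only an example):

```python
from vllm import LLM, SamplingParams

# Exercise repetition_penalty against vLLM directly, bypassing FastChat.
# Per the docstring quoted above, values > 1.0 discourage repeated tokens.
llm = LLM(model="lmsys/vicuna-7b-v1.5")  # example model
params = SamplingParams(
    temperature=0.7,
    max_tokens=256,
    repetition_penalty=1.2,  # anything other than the default 1.0
)
outputs = llm.generate(["Tell me a short story about a robot."], params)
print(outputs[0].outputs[0].text)
```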

Thanks in advance!

nouf01 commented 7 months ago

Hi, did you find out how to make the model avoid repetition in its responses?

lin-xiaosheng commented 6 months ago

I've encountered the same problem. It seems that FastChat might not be correctly receiving the incoming repetition_penalty. Has anyone resolved this issue?
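One way to check whether the field is being dropped is to send it through the OpenAI-compatible endpoint and see whether the output changes. A minimal sketch, assuming the legacy openai 0.x client (which forwards extra keyword arguments as request fields) and a FastChat API server on localhost:8000; the model name is only an example:

```python
import openai

# Point the legacy (0.x) openai client at a local FastChat API server.
openai.api_base = "http://localhost:8000/v1"
openai.api_key = "EMPTY"  # FastChat accepts a placeholder key by default

for penalty in (1.0, 1.5):
    resp = openai.ChatCompletion.create(
        model="vicuna-7b-v1.5",  # example model name
        messages=[{"role": "user", "content": "Repeat the word 'hello' ten times."}],
        temperature=0.0,
        repetition_penalty=penalty,  # the field under discussion
    )
    # If both completions are identical, the field is likely being ignored.
    print(penalty, resp.choices[0].message.content)
```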

ssifood commented 5 months ago

Is this solved?