Description
ignore_eos_token is a commonly used additional parameter that helps standardize LLM benchmarks by forcing every request to generate a consistent output sequence length.
-Will this change the current API? How?
It will add ignore_eos_token as an additional optional field in the request body.
-Who will benefit from this enhancement?
Anyone who is running benchmarks or trying to gain a better understanding of performance.
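As a rough illustration of the optional field described above, here is a minimal Python sketch. Only the ignore_eos_token name comes from this request; the prompt and max_tokens fields, the build_request_body and should_stop helpers, and the stop rule itself are hypothetical and are not taken from any existing API.

```python
import json
from typing import Any, Dict, Optional


def build_request_body(
    prompt: str,
    max_tokens: int,
    ignore_eos_token: Optional[bool] = None,
) -> Dict[str, Any]:
    """Build a request body; field names other than ignore_eos_token are assumed."""
    body: Dict[str, Any] = {"prompt": prompt, "max_tokens": max_tokens}
    # The field is optional: leaving it out keeps the request shape unchanged.
    if ignore_eos_token is not None:
        body["ignore_eos_token"] = ignore_eos_token
    return body


def should_stop(
    token_id: int,
    eos_token_id: int,
    generated: int,
    max_tokens: int,
    ignore_eos_token: bool = False,
) -> bool:
    """Hypothetical stop rule: `generated` counts tokens produced so far."""
    # The token budget always ends generation.
    if generated >= max_tokens:
        return True
    # An EOS token ends generation only when the flag is not set.
    return token_id == eos_token_id and not ignore_eos_token


if __name__ == "__main__":
    print(json.dumps(build_request_body("Hello", 128, ignore_eos_token=True)))
```

Under this assumed stop rule, a request with the flag set always produces exactly max_tokens output tokens, which is what makes output lengths comparable across benchmark runs.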
References