triton-inference-server / tensorrtllm_backend

The Triton TensorRT-LLM Backend

How to set `ignore_eos` when benchmarking TensorRT-LLM #505

Closed zhyncs closed 2 months ago

zhyncs commented 3 months ago

System Info

As titled; this is a general question.

Who can help?

No response

Information

N/A

Tasks

N/A
Reproduction

N/A

Expected behavior

N/A

actual behavior

N/A

additional notes

N/A

zhyncs commented 3 months ago

For reference, vLLM: https://docs.vllm.ai/en/latest/dev/sampling_params.html

ignore_eos – Whether to ignore the EOS token and continue generating tokens after the EOS token is generated.

LMDeploy https://lmdeploy.readthedocs.io/en/latest/api/pipeline.html?highlight=ignore_eos

ignore_eos (bool) – Indicator to ignore the eos_token_id or not

TGI https://github.com/huggingface/text-generation-inference/blob/11ea9ce002e796cc59714950b557b4021cbebc58/proto/v3/generate.proto#L111-L113

DeepSpeed-MII https://github.com/microsoft/DeepSpeed-MII

ignore_eos: bool (Defaults to False) Setting to True prevents generation from ending when the EOS token is encountered.
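For comparison, the Triton TensorRT-LLM backend does not appear to expose a dedicated `ignore_eos` input on its shipped `ensemble` model; a workaround sometimes used for benchmarking is to override `end_id` with a token id the model can never emit, so the EOS check never fires and decoding runs for the full `max_tokens`. Below is a minimal sketch, assuming a server launched with this repo's default `ensemble` model and Triton's HTTP generate endpoint on `localhost:8000`; the `NEVER_EMITTED_TOKEN_ID` value is an assumption and should be a model-specific id outside anything the model actually generates:

```python
# Hedged sketch: approximate ignore_eos by overriding end_id so the
# EOS comparison never matches and generation always runs max_tokens.
# Assumes the default "ensemble" model from this backend, served at
# localhost:8000 with Triton's HTTP generate endpoint.
import requests

# Assumption: a token id the model will never emit (model-specific);
# an unused id at or beyond the vocabulary boundary also works.
NEVER_EMITTED_TOKEN_ID = 2**31 - 1  # fits the INT32 end_id input

payload = {
    "text_input": "What is the capital of France?",
    "max_tokens": 128,                 # fixed benchmark output length
    "min_length": 128,                 # optional second guard: floor on length
    "bad_words": "",
    "stop_words": "",
    "end_id": NEVER_EMITTED_TOKEN_ID,  # EOS check never fires
}

resp = requests.post(
    "http://localhost:8000/v2/models/ensemble/generate",
    json=payload,
)
resp.raise_for_status()
print(resp.json()["text_output"])
```

Before trusting benchmark numbers, it is worth checking on a single request that the output token count really equals `max_tokens`; if a particular build still stops early, `min_length` (which should keep EOS from ending generation before that many tokens) is the fallback knob.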

zhyncs commented 2 months ago

Since there has been no response for a long time, this issue will be closed for now. It can be reopened later if necessary. Thanks.