triton-inference-server / tensorrtllm_backend

The Triton TensorRT-LLM Backend
Apache License 2.0

Accumulation of tokens while beam_width > 1 #513

Open wxsms opened 3 months ago

wxsms commented 3 months ago

System Info

tensorrt_llm==0.11.0.dev2024061800

Who can help?

@ncomly-nvidia

Information

Tasks

Reproduction

Deploy a model with beam_width > 1 and the trtllm backend, then request the BLS model via the generate_stream endpoint with stream: true.
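A minimal sketch of such a request body, assuming the default BLS model name tensorrt_llm_bls and the standard Triton generate_stream route (the endpoint path and field names may differ in your deployment):

```python
import json

# Hypothetical reproduction payload: beam_width > 1 together with
# streaming is what triggers the reported error.
payload = {
    "text_input": "Hello, world",
    "max_tokens": 16,
    "stream": True,    # use the generate_stream endpoint
    "beam_width": 2,   # any value > 1 reproduces the issue
}
body = json.dumps(payload)
print(body)

# Send with, e.g.:
#   curl -X POST localhost:8000/v2/models/tensorrt_llm_bls/generate_stream \
#        -d "$BODY"
```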

Expected behavior

It should be possible to set accumulate_tokens to True while beam_width > 1.

actual behavior

An error is thrown: Accumulation of tokens is only implemented for beam width = 1

additional notes

Maybe all we need to do is enhance the BLS script?