NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM

tensorrtllm backend fails when kv cache is disabled #2443

Open · ShuaiShao93 opened this issue 1 week ago

ShuaiShao93 commented 1 week ago

System Info

x86_64, NVIDIA L4 GPU, Debian 11

Who can help?

No response


Reproduction

To Reproduce

  1. Build a TensorRT-LLM engine with trtllm-build --kv_cache_type=disabled.
  2. Load the model in Triton with batching_strategy:inflight_fused_batching and verbose logging enabled.
  3. Run inference with several parallel sessions (a sketch of these commands follows the list).
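
For concreteness, a minimal sketch of the three steps above. The checkpoint/engine paths, the fill_template.py keys, and the "ensemble" model name follow the usual tensorrtllm_backend example layout and are placeholders, not the exact setup used here:

```bash
# 1. Build the engine with the KV cache disabled
#    (checkpoint and output paths are placeholders).
trtllm-build \
  --checkpoint_dir ./ckpt \
  --output_dir ./engine_dir \
  --kv_cache_type=disabled \
  --max_batch_size 8

# 2. Point the tensorrt_llm model config at the engine, select in-flight
#    fused batching, and start Triton with verbose logging.
python3 tools/fill_template.py -i triton_model_repo/tensorrt_llm/config.pbtxt \
  engine_dir:./engine_dir,batching_strategy:inflight_fused_batching,triton_max_batch_size:8,decoupled_mode:False

tritonserver --model-repository=triton_model_repo --log-verbose=1

# 3. Fire several requests in parallel against the generate endpoint
#    ("ensemble" is the model name in the example repo layout).
for i in $(seq 1 8); do
  curl -s -X POST localhost:8000/v2/models/ensemble/generate \
    -d '{"text_input": "Hello", "max_tokens": 16}' &
done
wait
```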

Expected behavior

Triton should work correctly when the KV cache is disabled.

actual behavior

The following error is logged:

model_instance_state.cc:1117] "Failed updating TRT LLM statistics: Internal - Failed to find Max KV cache blocks in metrics."

and the TRT-LLM batch size is always 1, even with parallel requests in flight.
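
One way to inspect what the backend is reporting is Triton's Prometheus metrics endpoint (a sketch assuming the default metrics port 8002; the grep patterns are only illustrative):

```bash
# Dump the backend's batching- and KV-cache-related gauges, if any.
# With --kv_cache_type=disabled, the KV cache block metrics that the
# statistics update complains about are expected to be absent.
curl -s localhost:8002/metrics | grep -iE "kv_cache|batch"
```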

additional notes

N/A