triton-inference-server / tensorrtllm_backend

The Triton TensorRT-LLM Backend
Apache License 2.0

Encountered an error in forward function: std::bad_cast #435

Open wangqy1216 opened 2 months ago

wangqy1216 commented 2 months ago

System Info

Who can help?

@juney-nvidia @kaiyux

Information

Tasks

Reproduction

  1. Build the image via Docker.
  2. Run the Triton server successfully.
  3. Send a request to the Triton server as in the example:
    curl -X POST localhost:9000/v2/models/ensemble/generate -d '{"text_input": "What is machine learning?", "max_tokens": 20, "bad_words": "", "stop_words": ""}'
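For anyone reproducing this from Python instead of curl, a minimal sketch of the same request follows. The endpoint path and payload fields are taken from the curl command above; the host, port, and helper name are assumptions for illustration only.

```python
import json
import urllib.request

def build_generate_request(host="localhost", port=9000,
                           text="What is machine learning?", max_tokens=20):
    """Build the same POST request as the curl reproduction step.

    Hypothetical helper; host/port mirror the issue's example (port 9000).
    """
    url = f"http://{host}:{port}/v2/models/ensemble/generate"
    payload = {
        "text_input": text,
        "max_tokens": max_tokens,
        "bad_words": "",
        "stop_words": "",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To actually send it (requires a running server):
# with urllib.request.urlopen(build_generate_request()) as resp:
#     print(resp.read().decode())
```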

Expected behavior

Triton server should process the request correctly.

Actual behavior

[TensorRT-LLM][ERROR] Encountered an error in forward function: std::bad_cast
[TensorRT-LLM][ERROR] Encountered error for requestId 1804289384: Encountered an error in forward function: std::bad_cast
[TensorRT-LLM][WARNING] Step function failed, continuing.

additional notes

I also see this ERROR message while starting the server:

[TensorRT-LLM][ERROR] 3: [engine.cpp::getProfileObliviousBindingIndex::1533] Error Code 3: Internal Error (getTensorDataType given invalid tensor name: kv_cache_block_offsets)
aspctu commented 2 months ago

+1 here. This is happening for me with a medusa model.