triton-inference-server / tensorrtllm_backend

The Triton TensorRT-LLM Backend
Apache License 2.0

Encountered an error in forward function: std::bad_cast #435

Open wangqy1216 opened 2 months ago

wangqy1216 commented 2 months ago

System Info

Who can help?

@juney-nvidia @kaiyux

Information

Tasks

Reproduction

  1. Build the image via Docker.
  2. Run the Triton server successfully.
  3. Send a request to the Triton server as in the example:
    curl -X POST localhost:9000/v2/models/ensemble/generate -d '{"text_input": "What is machine learning?", "max_tokens": 20, "bad_words": "", "stop_words": ""}'
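For anyone reproducing this from Python instead of curl, a minimal sketch of the same request follows. The endpoint path and payload fields are taken from the curl command above; the host, port, and helper name are assumptions for illustration only.

```python
import json
import urllib.request

def build_generate_request(host="localhost", port=9000,
                           text="What is machine learning?", max_tokens=20):
    """Build the same POST request as the curl reproduction step.

    Hypothetical helper; host/port mirror the issue's example (port 9000).
    """
    url = f"http://{host}:{port}/v2/models/ensemble/generate"
    payload = {
        "text_input": text,
        "max_tokens": max_tokens,
        "bad_words": "",
        "stop_words": "",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To actually send it (requires a running server):
# with urllib.request.urlopen(build_generate_request()) as resp:
#     print(resp.read().decode())
```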

Expected behavior

Triton server should process the request correctly.

Actual behavior

[TensorRT-LLM][ERROR] Encountered an error in forward function: std::bad_cast
[TensorRT-LLM][ERROR] Encountered error for requestId 1804289384: Encountered an error in forward function: std::bad_cast
[TensorRT-LLM][WARNING] Step function failed, continuing.

additional notes

I also see this ERROR message while starting the server:

[TensorRT-LLM][ERROR] 3: [engine.cpp::getProfileObliviousBindingIndex::1533] Error Code 3: Internal Error (getTensorDataType given invalid tensor name: kv_cache_block_offsets)
aspctu commented 2 months ago

+1 here. This is happening for me with a medusa model.