triton-inference-server / tensorrtllm_backend

The Triton TensorRT-LLM Backend

Deployment failed for BERT #440

Open vivekjoshi556 opened 4 months ago

vivekjoshi556 commented 4 months ago

I have a BERT model that I am trying to deploy with Triton Inference Server using the TensorRT-LLM backend, but I am getting the following error:

- Docker Image: 24.03
- TensorRT-LLM: v0.8.0

Error:

+-------+---------+----------------------------------------------------------------------------+
| Model | Version | Status                                                                     |
+-------+---------+----------------------------------------------------------------------------+
| bert  | 1       | UNAVAILABLE: Internal: unexpected error when creating modelInstanceState: |
|       |         | [json.exception.out_of_range.403] key 'num_layers' not found              |
+-------+---------+----------------------------------------------------------------------------+
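The failing lookup, [json.exception.out_of_range.403], looks like the backend throwing while it parses the engine's config.json, so one thing worth checking is whether that file actually contains a num_layers entry. A minimal sketch of that check, assuming the engine and its config.json live under triton_model_repo/bert/1/ (adjust the path to wherever your engine directory is):

```python
import json

# Placeholder path: point this at the config.json written next to the
# built TensorRT-LLM engine.
engine_config = "triton_model_repo/bert/1/config.json"

with open(engine_config) as f:
    cfg = json.load(f)

# Assumption: in TensorRT-LLM v0.8.0 the build metadata typically sits
# under "builder_config". Listing its keys shows whether 'num_layers'
# made it into the engine config at all.
builder_cfg = cfg.get("builder_config", {})
print(sorted(builder_cfg.keys()))
print("has num_layers:", "num_layers" in builder_cfg)
```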

I followed the guide exactly, so I am not sure whether the problem is in TensorRT-LLM itself or in the backend.

byshiue commented 4 months ago

Could you share which guide you followed?