triton-inference-server / tensorrtllm_backend

The Triton TensorRT-LLM Backend
Apache License 2.0

Seg fault after loading models in official example #425

Open LeatherDeerAU opened 2 months ago

LeatherDeerAU commented 2 months ago

System Info

arch - x86-64
gpu - RTX 3070
docker image - nvcr.io/nvidia/tritonserver:24.01-trtllm-python-py3
tensorRT-LLM-backend tag - 0.7.2
tensorRT-LLM tag - 0.7.1 (80bc07510ac4ddf13c0d76ad295cdb2b75614618)

Who can help?

@juney-nvidia

Information

Tasks

Reproduction

  1. use the nvcr.io/nvidia/tritonserver:24.01-trtllm-python-py3 image
  2. clone the tensorRT-LLM backend at the 0.7.2 tag
  3. build the TensorRT-LLM wheel inside the container from the TensorRT-LLM submodule in the backend repo
  4. try to launch the models from the official example (see the command sketch below)
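For reference, roughly the commands used, as a minimal sketch assuming the default paths in the 24.01 container and the all_models/inflight_batcher_llm example model repository (mount and engine paths are placeholders; exact flags may differ for your setup):

```bash
# Start the 24.01 Triton TRT-LLM container (mount path is a placeholder)
docker run --rm -it --gpus all --shm-size=2g \
  -v "$PWD":/workspace -w /workspace \
  nvcr.io/nvidia/tritonserver:24.01-trtllm-python-py3 bash

# Inside the container: clone the backend at the 0.7.2 tag with submodules
git clone -b v0.7.2 --recursive \
  https://github.com/triton-inference-server/tensorrtllm_backend.git
cd tensorrtllm_backend

# Build and install the TensorRT-LLM wheel from the bundled submodule
# (--trt_root assumes TensorRT's default location inside the container)
cd tensorrt_llm
python3 scripts/build_wheel.py --trt_root /usr/local/tensorrt
pip install build/tensorrt_llm-*.whl
cd ..

# Launch Triton with the example model repository
# (engines must already be built and referenced by the config.pbtxt files)
python3 scripts/launch_triton_server.py \
  --world_size 1 --model_repo all_models/inflight_batcher_llm
```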

Expected behavior

Models load successfully.

actual behavior

Triton crashes with a segmentation fault after loading the models; logs attached: sig_fault_logs.txt

additional notes

Which backend tag should I use with Triton container version 24.01?

Related topics (?):
https://github.com/triton-inference-server/tensorrtllm_backend/issues/273
https://github.com/NVIDIA/TensorRT-LLM/issues/782
https://github.com/triton-inference-server/tensorrtllm_backend/issues/88

LeatherDeerAU commented 2 months ago

Example model config.pbtxt files attached: postproccesing.txt, preprocessing.txt, tensorrt-llm.txt
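In case it helps reproduce, these configs were populated from the templates shipped in the repo; a hedged sketch of the usual fill step (tokenizer and engine paths are placeholders, and the exact template keys can differ between backend versions, so check the ${...} placeholders in your templates):

```bash
# Fill in the shipped config.pbtxt templates before launching Triton.
# Keys below (tokenizer_dir, tokenizer_type, decoupled_mode, engine_dir)
# are examples only; verify them against your backend version's templates.
python3 tools/fill_template.py -i all_models/inflight_batcher_llm/preprocessing/config.pbtxt \
  "tokenizer_dir:/workspace/tokenizer,tokenizer_type:auto"
python3 tools/fill_template.py -i all_models/inflight_batcher_llm/postprocessing/config.pbtxt \
  "tokenizer_dir:/workspace/tokenizer,tokenizer_type:auto"
python3 tools/fill_template.py -i all_models/inflight_batcher_llm/tensorrt_llm/config.pbtxt \
  "decoupled_mode:False,engine_dir:/workspace/engines"
```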

byshiue commented 2 months ago

Could you share all scripts you use?