amazingkmy opened 11 months ago
I think the MPI rank is not working properly; it seems to be stuck at 0.
I got the same result when I replaced the command with the following:
mpirun -n 2 --allow-run-as-root tritonserver --model-repo=/tensorrtllm_backend/triton_model_repo --disable-auto-complete-config
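As a quick sanity check, here is a minimal sketch (assuming mpi4py is available inside the container, which the report does not state) to confirm that mpirun is actually spawning distinct ranks before suspecting tritonserver:

```bash
# Hypothetical rank check: each spawned process prints its own MPI rank.
# Assumes mpi4py is installed inside the tritonserver container.
mpirun -n 2 --allow-run-as-root \
  python3 -c "from mpi4py import MPI; print('rank', MPI.COMM_WORLD.Get_rank(), 'of', MPI.COMM_WORLD.Get_size())"
```

If the two processes print distinct ranks, MPI itself is working, and the single-GPU behaviour is more likely caused by how the engine was built.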
Do you want to ask about multi-node or multi-GPU? From your description, you are testing on multi-GPU, so I am a little confused.
Besides, can you share the error log of your second test with the following command?
mpirun -n 2 --allow-run-as-root tritonserver --model-repo=/tensorrtllm_backend/triton_model_repo --disable-auto-complete-config
@byshiue I want to ask about 'multi-GPU'.
Can you share the error log of your second test with the following command?
mpirun -n 2 --allow-run-as-root tritonserver --model-repo=/tensorrtllm_backend/triton_model_repo --disable-auto-complete-config
Does Triton Inference Server support multi-node?
I built llama-7b for tensorrtllm_backend and ran Triton Inference Server. I have 4 GPUs, but Triton Inference Server loads only 1 GPU.
Image: nvcr.io/nvidia/tritonserver:23.10-trtllm-python-py3
Build (llama2)
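For reference, a minimal sketch of a 4-GPU tensor-parallel build with the TensorRT-LLM llama example; the paths are placeholders and the plugin flags follow the 0.x examples that shipped around the 23.10 container, so they may not match the reporter's exact command:

```bash
# Sketch only: build the llama engine with tensor parallelism over 4 GPUs.
# /models/llama-7b-hf and the output directory are assumed placeholder paths.
python3 examples/llama/build.py \
  --model_dir /models/llama-7b-hf \
  --dtype float16 \
  --use_gpt_attention_plugin float16 \
  --use_gemm_plugin float16 \
  --world_size 4 \
  --tp_size 4 \
  --output_dir /tensorrtllm_backend/triton_model_repo/tensorrt_llm/1
```

An engine built with world_size/tp_size of 1 will only ever be placed on a single GPU, regardless of how the server is launched.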
Run
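A minimal sketch of a multi-GPU launch using the launcher script bundled with tensorrtllm_backend (script path and flag names assumed from the repository; the world size must match the tensor-parallel size the engine was built with):

```bash
# Sketch only: start one tritonserver rank per GPU via the bundled launcher.
# --world_size must equal tp_size * pp_size used at engine build time.
python3 /tensorrtllm_backend/scripts/launch_triton_server.py \
  --world_size 4 \
  --model_repo /tensorrtllm_backend/triton_model_repo
```

This is equivalent to invoking mpirun with 4 ranks of tritonserver directly, as in the command above but with -n 4.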