Open jianqiylz opened 6 months ago
@juney-nvidia Hello, could you please take a look at this issue? If you need any further information, please let me know.
@byshiue @juney-nvidia pls
Any updates?
Hi @byshiue @juney-nvidia , did you get the chance to look into this?
@jianqiylz, just wanted to check in: were you able to resolve this?
Hello, when trying to run tritonserver on a setup with 4 nodes, I hit a failure that seems to suggest a mismatch between the number of GPUs per node and the tensor parallel (TP) * pipeline parallel (PP) sizes. The error message is as follows:
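For context, the failure looks like a parallel-layout validation step. The sketch below is a guess at the shape of that check, not the actual TensorRT-LLM or tritonserver source; the function name and message wording are assumptions.

```python
# Illustrative sketch (assumed, not the real library code) of the kind of
# consistency check that appears to be failing: the MPI world size must
# equal tp_size * pp_size for the parallel layout to be valid.

def check_parallel_config(world_size, tp_size, pp_size):
    """Raise if the requested TP/PP layout cannot cover world_size ranks."""
    expected = tp_size * pp_size
    if world_size != expected:
        raise ValueError(
            f"world_size ({world_size}) must equal tp_size * pp_size "
            f"({tp_size} * {pp_size} = {expected})")

# On 4 nodes with 4 GPUs each, world_size is 16, so TP=8, PP=2 passes,
# while TP=8, PP=1 (8 ranks) would trip the check.
check_parallel_config(16, 8, 2)
```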
Environment Information
Steps to Reproduce
Built the Docker image using the following commands:
Compiled the TensorRT engine within a container launched from the above image:
Before running the tests, I applied a workaround in the mapping.py file to align it with my hardware:
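The workaround was elided from the report, but it plausibly involves the per-node GPU count that mapping.py assumes. The class below is a simplified stand-in, not the actual tensorrt_llm/mapping.py source: the field names mirror the library, the body and default values are assumptions.

```python
# Simplified sketch (assumed, not the real tensorrt_llm.Mapping) showing
# why a hard-coded gpus_per_node matters: the local device index is
# derived from the global rank modulo the GPUs available on each node.

class Mapping:
    def __init__(self, world_size=1, rank=0, gpus_per_node=8,
                 tp_size=1, pp_size=1):
        if world_size != tp_size * pp_size:
            raise ValueError("world_size must equal tp_size * pp_size")
        self.world_size = world_size
        self.rank = rank
        # Workaround idea: pass the node's real GPU count (e.g. 4)
        # instead of relying on a default of 8.
        self.gpus_per_node = gpus_per_node
        self.tp_size = tp_size
        self.pp_size = pp_size

    @property
    def local_rank(self):
        # Device index on the current node; wrong gpus_per_node here
        # makes ranks address GPUs that do not exist on the node.
        return self.rank % self.gpus_per_node

# With 4 nodes x 4 GPUs: rank 5 should land on device 1 of node 1.
m = Mapping(world_size=16, rank=5, gpus_per_node=4, tp_size=8, pp_size=2)
```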
Tested the multi-node execution with `run.py`. The following command worked fine and inference results were obtained successfully:
Encountered the issue when starting the Triton server with the command below:
From what I can tell, tritonserver may not yet support running across multiple nodes. If my understanding is correct, could you please consider adding support for this functionality, or provide guidance on how I can work around this limitation?
I sincerely appreciate any assistance you can offer on this matter.