Open teith opened 11 months ago
+1
In addition to the previously mentioned issue in poll mode, a similar problem occurs in explicit mode. When using the Python `tritonclient` to call `load_model` on a faulty model, interaction with the model repository freezes in the same way: `load_model` fails with `TimeoutError: timed out`. After that, the faulty model cannot be unloaded with `unload_model`, and new models, even correct ones, cannot be loaded.
+1
I was able to reproduce this issue on my side. Please note that we don't officially support macOS, but this is reproducible on Linux as well. I've created a ticket for the team. [Bug: 5944]
Please fix it 🙏🏻
Description
Encountered a critical issue with Triton Inference Server in poll mode where the server becomes unresponsive when loading a Python model with errors. Specifically, if a Python model has an import error (e.g., `ModuleNotFoundError: No module named 'transformers'`), Triton logs this error and then stops processing any further interactions with the model repository, making it impossible to unload this model or load new models.
Triton Information
Version: tritonserver:23.11-py3 (using the Triton container)

System Information
Device: MacBook Pro 14" M2 Pro
OS: macOS Sonoma 14.1
To Reproduce
Steps to reproduce the behavior:
1. Start Triton Inference Server with `--model-repository=/ops/model_repository --model-control-mode=poll --repository-poll-secs=1 --exit-on-error false`.
2. Add a Python model to `model_repository` that imports a missing module (e.g., `transformers`).
3. Observe the error in the server logs and the subsequent inability to interact further with the model repository.
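The failing model can be reproduced with a minimal Python-backend `model.py` whose top-level import fails. The snippet below only simulates what happens when the backend imports such a file: `BROKEN_MODEL_PY` and `try_import` are illustrative names, and `no_such_transformers` is a placeholder for any dependency missing from the container (e.g. `transformers`):

```python
# Contents of a broken Triton Python-backend model.py (as a string so we
# can simulate the import here without a running server).
BROKEN_MODEL_PY = """
import no_such_transformers  # placeholder for a missing dependency

class TritonPythonModel:
    def initialize(self, args):
        pass

    def execute(self, requests):
        return []
"""

def try_import(source):
    """Simulate the backend loading model.py; return the import error, if any."""
    try:
        exec(compile(source, "model.py", "exec"), {})
        return None
    except ModuleNotFoundError as err:
        return err

error = try_import(BROKEN_MODEL_PY)
print(type(error).__name__)  # ModuleNotFoundError
```

Placing a model like this in the polled repository triggers the logged error, after which the repository stops responding to further load/unload operations.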
Expected behavior
The server should handle such errors gracefully: log them but retain the ability to load and unload other models. The polling mechanism should continue to function and pick up model-repository updates without a complete server halt.