ShuaiShao93 opened this issue 6 months ago
Hi @ShuaiShao93, thanks a lot for reaching out. Can you provide the following details?
- What type of model/backend?
Ensemble pipeline with Python & ONNX backends
- Can you reproduce this behavior with other types of models/backends? Or is it specific to this one?
Sorry, I didn't get a chance to test more.
- Not sure how you are getting the unloaded log. Are you making an unload request?
No, I just made load requests simultaneously from two clients, and I saw the unloaded logs
I am unable to reproduce this. When I try to load a model simultaneously, it only gets loaded once.
@ShuaiShao93 I guess this is expected behavior in the case of explicit model control. If you want to validate that a particular model is loaded before sending the load request, you can always hit the /index endpoint to get the list of loaded models.
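For illustration, here is a minimal client-side sketch of that check, assuming the repository index endpoint (POST /v2/repository/index in Triton's HTTP API) returns a JSON array of objects with "name" and "state" fields; the helper name is hypothetical:

```python
import json

def is_model_ready(index_json: str, model_name: str) -> bool:
    """Return True if model_name appears with state READY in the JSON
    body returned by Triton's POST /v2/repository/index endpoint.

    Assumed response shape (per the repository index extension):
    [{"name": "...", "version": "...", "state": "READY"}, ...]
    """
    models = json.loads(index_json)
    return any(
        m.get("name") == model_name and m.get("state") == "READY"
        for m in models
    )

# A sample index response, for illustration only:
sample = '[{"name": "MODEL", "version": "1", "state": "READY"}]'
print(is_model_ready(sample, "MODEL"))  # True
print(is_model_ready(sample, "other"))  # False
```

A client would fetch the index first and skip the load request when the model is already READY; note this check-then-load pattern is itself racy across two clients, which is the scenario in this issue.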
Description
When I use two clients to send
/v2/repository/models/MODEL/load
requests to the same server at the same time, the model is loaded twice.

Triton Information
What version of Triton are you using? 23.11
Are you using the Triton container or did you build it yourself? Container nvcr.io/nvidia/tritonserver:23.11-py3
To Reproduce
Start a server in explicit mode, and load no model.
Open two terminals, run
curl -X POST "http://localhost:8000/v2/repository/models/MODEL/load" -d "{}"
at the same time. You can see logs indicating the model was loaded twice.

Expected behavior
The model should be loaded only once, and the log
successfully unloaded MODEL
should appear before
successfully loaded MODEL
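The two-terminal reproduction can also be sketched in Python. This is a hedged sketch, not the reporter's script: `fire_concurrently` is a hypothetical helper that releases N threads at once via a barrier; against a live server one would pass a callable wrapping the curl POST above.

```python
import threading

def fire_concurrently(request_fn, n_clients=2):
    """Invoke request_fn from n_clients threads at (nearly) the same
    time, mimicking two terminals issuing the load request at once.
    A barrier releases all threads together to maximize the overlap."""
    barrier = threading.Barrier(n_clients)
    results = []
    lock = threading.Lock()

    def worker():
        barrier.wait()            # all threads start their request together
        r = request_fn()
        with lock:                # results list is shared across threads
            results.append(r)

    threads = [threading.Thread(target=worker) for _ in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

# Against a live server one would pass something like (assumed, untested here):
#   lambda: requests.post(
#       "http://localhost:8000/v2/repository/models/MODEL/load", json={})
# A stand-in callable shows both requests really are issued:
calls = []
fire_concurrently(lambda: calls.append(1), n_clients=2)
print(len(calls))  # 2
```

If the server handled concurrent loads idempotently, both requests would succeed with the model loaded once; the duplicate unloaded/loaded logs in this issue suggest the second request triggered a reload instead.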