When running Triton server with the --strict-readiness flag set to true, the /v2/health/ready endpoint is expected to return an error code if any model is unloaded. However, after unloading a model via the /v2/repository/models/model/unload endpoint, /v2/health/ready still returns 200 OK. The server is therefore still reporting itself as ready even though the model has been unloaded, which contradicts the behavior described in the documentation.
Triton Server Version: 28.03
Deployment Method: KServe
KServe Version: 13.0.1
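
To make the report easier to reproduce, here is a minimal sketch of the two calls described above. It uses only the Python standard library. The helper names (http_status, ready_after_unload) are hypothetical, and the host, port, and model name are assumptions that need to be adjusted for the actual deployment. It only records the status codes and does not claim what Triton should return.

```python
import urllib.error
import urllib.request


def http_status(method: str, url: str, timeout: float = 5.0) -> int:
    """Send a request with an empty body and return the HTTP status code."""
    # An empty byte string is sent for POST so the request carries a body.
    data = b"" if method == "POST" else None
    request = urllib.request.Request(url, data=data, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code itself is what we want.
        return err.code


def ready_after_unload(base_url: str, model_name: str, send=http_status) -> dict:
    """Unload a model, then query readiness; return both status codes."""
    base = base_url.rstrip("/")
    unload = send("POST", f"{base}/v2/repository/models/{model_name}/unload")
    ready = send("GET", f"{base}/v2/health/ready")
    return {"unload": unload, "ready": ready}
```

Calling ready_after_unload("http://localhost:8000", "model") against a running server (address and model name assumed) returns a dictionary with both status codes. In the situation described above, the "ready" entry comes back as 200 where an error code was expected. The send parameter exists only so the HTTP call can be swapped out, for example in a test.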