Open dhaval24 opened 1 year ago
Thanks for your feedback. I've created an enhancement ticket for this request.
Also, was the intent that the server return a gRPC-specific message? Can you show an example? This might be difficult because Triton supports both HTTP and gRPC.
Apologies for the delayed response. The expectation was that the gRPC response code here should be RESOURCE_EXHAUSTED (see https://grpc.github.io/grpc/core/md_doc_statuscodes.html), which should translate to a 429 HTTP response code.
The correct error code + message should be sufficient. UNAVAILABLE maps to 503, which is not really correct here, since the server endpoint is in fact available.
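For reference, here is a minimal sketch of the gRPC-to-HTTP translation an upstream HTTP layer could apply once the correct status code is surfaced. The values follow the conventional gRPC/HTTP pairing; the `GRPC_TO_HTTP` table and `to_http_status` helper are illustrative names, not part of Triton:

```python
import grpc

# Illustrative subset of the conventional gRPC -> HTTP status mapping.
GRPC_TO_HTTP = {
    grpc.StatusCode.RESOURCE_EXHAUSTED: 429,  # what an exhausted queue/resource should return
    grpc.StatusCode.UNAVAILABLE: 503,         # what is returned today, misleading here
    grpc.StatusCode.INVALID_ARGUMENT: 400,
    grpc.StatusCode.NOT_FOUND: 404,
    grpc.StatusCode.INTERNAL: 500,
}

def to_http_status(code: grpc.StatusCode) -> int:
    """Translate a gRPC status code into an HTTP status code, defaulting to 500."""
    return GRPC_TO_HTTP.get(code, 500)
```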
Description
Currently Triton server doesn't capture the full serialized gRPC error message in the message field. Proto: https://github.com/triton-inference-server/common/blob/1df32b982a6ed11ead3271a55b04bf6e7abc1cf9/protobuf/grpc_service.proto#L831
The error messages should include the gRPC error code so that they can be reconstructed in upstream applications for correct error handling.
Errors such as the one below need to have the error code correctly available to upstream consumers. https://github.com/triton-inference-server/server/blob/b0fb26a0f480950f214ffa1ae1847ca8c5930235/src/core/scheduler_utils.cc#L105-L107
Additionally, the above error should be reported as RESOURCE_EXHAUSTED rather than UNAVAILABLE.
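To illustrate the upstream handling this would enable, below is a rough sketch of a Python gRPC client branching on the status code of a failed inference call. The `stub` and `request` objects are assumed to be a GRPCInferenceService stub and ModelInferRequest generated from grpc_service.proto, and `QueueFullError` is a hypothetical application-level type, not Triton client library code:

```python
import grpc

class QueueFullError(RuntimeError):
    """Hypothetical app-level error for a retriable, exhausted-resource failure."""

def infer(stub, request):
    # `stub` and `request` are placeholders for a GRPCInferenceService stub
    # and a ModelInferRequest generated from grpc_service.proto; channel
    # setup is omitted in this sketch.
    try:
        return stub.ModelInfer(request)
    except grpc.RpcError as err:
        if err.code() == grpc.StatusCode.RESOURCE_EXHAUSTED:
            # Map the exhausted-resource case to an error the application
            # can retry with backoff, preserving the server's message.
            raise QueueFullError(err.details()) from err
        # UNAVAILABLE, INTERNAL, etc. indicate genuinely different failures
        # and should not be conflated with an exhausted resource.
        raise
```

Today the client only sees UNAVAILABLE plus a free-form message, so this kind of branching has to fall back to fragile string matching on the error text.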
Triton Information
23.04
Are you using the Triton container or did you build it yourself? Triton container
To Reproduce
Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).
Expected behavior
The expected behavior is a fully serialized gRPC error message, including the error code, that can be reconstructed into a gRPC error correctly.