triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

get query_params with python backend #2486

Open chenyangMl opened 3 years ago

chenyangMl commented 3 years ago

Description: Thanks for this remarkable work. I deploy a model that takes a variable in addition to the input tensors, so I want to send this variable via query_params with each infer request.

But I cannot find a function for this in "triton_python_backend_utils". The Triton client sends query_params; how do I get the query_params on the Triton server?

Triton Information: nvcr.io/nvidia/tritonserver:20.12-py3

To Reproduce

client test code:
import tritonclient.http as httpclient

with httpclient.InferenceServerClient(self.url) as triton_client:
    response = triton_client.infer(self.model_name,
                                   inputs,
                                   request_id=str(1),
                                   query_params={"test": 1},
                                   outputs=outputs)

server test code (model.py execute):
responses = []
for request in requests:
    # how to get query_params of request?

Expected behavior: the Triton client passes query_params with infer, and the Python model can read those query_params on the Triton server.

deadeyegoodwin commented 3 years ago

The HTTP query params are not available to a Python model. Can you describe what per-inference-request information you want to pass into the Python model?

oeway commented 3 years ago

Can we re-open this issue? I am also looking for this feature. Basically, I need to pass some configuration with each inference request to condition my model. I could create additional inputs, but that is a bit of overkill and I would have to encode all my settings into a tensor, as in the sketch below.
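For reference, here is a minimal sketch of the extra-input workaround described above: per-request settings are serialized to JSON and sent as a BYTES input. The input name "settings", the "threshold" key, and the model.py snippet are illustrative assumptions (not a Triton API for query params), and the model's config.pbtxt would need to declare the extra BYTES input.

# Client side: serialize the per-request settings to JSON and send them
# as an extra BYTES input (input name "settings" is an assumption).
import json
import numpy as np
import tritonclient.http as httpclient

settings = {"threshold": 0.5}
settings_arr = np.array([json.dumps(settings).encode("utf-8")], dtype=np.object_)
settings_input = httpclient.InferInput("settings", [1], "BYTES")
settings_input.set_data_from_numpy(settings_arr)
inputs.append(settings_input)  # "inputs" is the list from the repro code above

# Server side (model.py, Python backend): decode the settings from the tensor.
import json
import triton_python_backend_utils as pb_utils

class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            settings_tensor = pb_utils.get_input_tensor_by_name(request, "settings")
            settings = json.loads(settings_tensor.as_numpy()[0].decode("utf-8"))
            # ... use `settings` to condition the model and build the response ...
        return responses

This keeps the per-request configuration inside the normal input tensors, at the cost of declaring one extra input and doing the JSON encode/decode on both sides.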

seongminp commented 2 years ago

Any updates on this?

morestart commented 5 months ago

any news?