triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html

Handle raw binary request in python #7741

Open remiruzn opened 4 weeks ago

remiruzn commented 4 weeks ago

I'm facing a problem. The documentation describes how to send raw binary data here, but I don't understand how I can receive that data in my TritonPythonModel class.
Example of sending an image:

import requests
url = "http://localhost:8000/v2/models/inference_pipeline_bytearray/infer"
# Read the binary data from a file
with open("image.jpg", "rb") as f:
    binary_data = f.read()

headers = {
    "Content-Type": "application/octet-stream",
    "Inference-Header-Content-Length": "0",
    "Content-Length": str(len(binary_data)),
}

response = requests.post(url, headers=headers, data=binary_data)
print(response.text)
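My understanding from the binary tensor data extension docs is that "Inference-Header-Content-Length: 0" tells the server there is no JSON inference header, so the entire request body is treated as the raw content of the model's single input. As far as I can tell, this only works when the input metadata can be deduced, for example a model with a single BYTES input of shape [1].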

How can I receive it in my TritonPythonModel?

class TritonPythonModel:
    .....
    def execute(self, requests):
        responses = []
        for request in requests:
            binary_data = ?
        .....
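Based on my reading of the Python backend docs, I would expect something like the following minimal sketch, assuming the raw body arrives as the model's single input tensor. The names "INPUT" and "OUTPUT" are placeholders for whatever config.pbtxt declares:

import numpy as np
import triton_python_backend_utils as pb_utils

class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            # Assumption: the raw request body is delivered as the model's
            # single input tensor. "INPUT" is a placeholder; use the input
            # name declared in config.pbtxt.
            in_tensor = pb_utils.get_input_tensor_by_name(request, "INPUT")
            # For a BYTES input with shape [1], as_numpy() returns an object
            # array whose first element holds the raw bytes of the body.
            binary_data = in_tensor.as_numpy()[0]
            # ... decode the image and run inference here ...
            # Placeholder output: echo back the payload size so the sketch
            # returns one response per request, as the backend requires.
            out = pb_utils.Tensor("OUTPUT", np.array([len(binary_data)], dtype=np.int32))
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses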