Open adrtsang opened 1 month ago
Hi @adrtsang, could you please provide a minimal reproducer?
It's interesting that the problem goes away when simply reversing the order of inputs and requested_output_names:
inference_request = pb_utils.InferenceRequest(
    model_name=self.model_name,
    requested_output_names=['output'],
    inputs=[input_tensor],
)
Description
Implementing BLS in the Python backend to send an in-flight inference request to another model, using c_python_backend_utils.InferenceRequest() and passing in a list of c_python_backend_utils.Tensor objects as inputs, raises an error in pb_stub.cc as follows:
E0920 19:06:18.877435 1 pb_stub.cc:721] "Failed to process the request(s) for model 'whs_inference_model_0_0', message: TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:\n 1. c_python_backend_utils.InferenceRequest(request_id: str = '', correlation_id: object = 0, inputs: List[triton::backend::python::PbTensor], requested_output_names: List[str], model_name: str, model_version: int = -1, flags: int = 0, timeout: int = 0, preferred_memory: c_python_backend_utils.PreferredMemory = <c_python_backend_utils.PreferredMemory object at 0x7f109a7f4230>, trace: c_python_backend_utils.InferenceTrace = <c_python_backend_utils.InferenceTrace object at 0x7f109a7f41f0>, parameters: object = None)\n\nInvoked with: kwargs: model_name='whs_model', inputs=[<c_python_backend_utils.Tensor object at 0x7f109a76f1b0>], requested_output_names=['output'], request_id=1\n\nAt:\n /models/whs_inference_model/1/model.py(229): execute\n"
Triton Information
tritonserver:24.07
Are you using the Triton container or did you build it yourself? I built a container based on nvcr.io/nvidia/tritonserver:24.07-py3
To Reproduce
Here's a snippet of my model.py:
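A minimal sketch of the call in execute(), with the model name, output name, and keyword arguments mirroring the error message above (the input name 'input' and the surrounding boilerplate are assumed; the actual snippet may differ):

import triton_python_backend_utils as pb_utils

class TritonPythonModel:
    def initialize(self, args):
        self.model_name = 'whs_model'

    def execute(self, requests):
        responses = []
        for request in requests:
            # Re-wrap the incoming data as a pb_utils.Tensor
            # (a c_python_backend_utils.Tensor object)
            in_tensor = pb_utils.get_input_tensor_by_name(request, 'input')
            input_tensor = pb_utils.Tensor('input', in_tensor.as_numpy())

            # This constructor call raises the TypeError quoted above
            inference_request = pb_utils.InferenceRequest(
                request_id=1,
                model_name=self.model_name,
                inputs=[input_tensor],
                requested_output_names=['output'],
            )
            ...
        return responses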
The issue is that the tensor created by pb_utils.Tensor is a c_python_backend_utils.Tensor object, but the inputs argument of InferenceRequest() appears to expect a list of triton::backend::python::PbTensor objects. However, passing the same c_python_backend_utils.Tensor object to InferenceResponse() works fine. This seems like a bug in pb_stub.cc.
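For comparison, a sketch of the same kind of tensor being passed to InferenceResponse(), which is accepted without complaint (the NumPy array here is only illustrative):

import numpy as np
import triton_python_backend_utils as pb_utils

# The same c_python_backend_utils.Tensor type is accepted here
out_tensor = pb_utils.Tensor('output', np.zeros((1,), dtype=np.float32))
inference_response = pb_utils.InferenceResponse(output_tensors=[out_tensor])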
Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).
This model.py runs in the inference stage of an ensemble pipeline; the pipeline is designed to perform pre-processing -> inference -> post-processing.
Expected behavior
It's expected that pb_utils.InferenceRequest() will accept a list of c_python_backend_utils.Tensor objects as inputs.
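For reference, a sketch of the flow this would enable once the request constructs, following the documented BLS pattern (a synchronous exec() and an output tensor named 'output' are assumed):

inference_request = pb_utils.InferenceRequest(
    model_name=self.model_name,
    inputs=[input_tensor],
    requested_output_names=['output'],
)
inference_response = inference_request.exec()
if inference_response.has_error():
    raise pb_utils.TritonModelException(inference_response.error().message())
output_tensor = pb_utils.get_output_tensor_by_name(inference_response, 'output')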