triton-inference-server / server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
BSD 3-Clause "New" or "Revised" License

DLPack tensor is not contiguous. Only contiguous DLPack tensors that are stored in C-Order are supported #5816

Closed zhukai242 closed 1 year ago

zhukai242 commented 1 year ago

Description I use a model.onnx converted from yolov5s.pt (v6.1), deployed in Triton Server under the model name `yolo`, with an NMS step added in an ensemble.

tritonclient.utils.InferenceServerException: [400] in ensemble 'simple_yolov5_ensemble', Failed to process the request(s) for model instance 'nms_0', message: TritonModelException: DLPack tensor is not contiguous. Only contiguous DLPack tensors that are stored in C-Order are supported

Triton Information The Docker image tag is 23.04-py3.

Are you using the Triton container or did you build it yourself? container

To Reproduce I use a model.onnx converted from yolov5s.pt (v6.1) under the model name `yolo`, and add NMS in an ensemble. Here are some of my files. Ensemble config.pbtxt:

```
name: "simple_yolov5_ensemble"
platform: "ensemble"
max_batch_size: 8
input [
  {
    name: "ENSEMBLE_INPUT_0"
    data_type: TYPE_FP32
    dims: [ 1, 3, 640, 640 ]
  }
]
output [
  {
    name: "ENSEMBLE_OUTPUT_0"
    data_type: TYPE_FP32
    dims: [ 300, 6 ]
  }
]
ensemble_scheduling {
  step [
    {
      model_name: "yolo"
      model_version: 1
      input_map: {
        key: "images"
        value: "ENSEMBLE_INPUT_0"
      }
      output_map: {
        key: "output"
        value: "FILTER_BBOXES"
      }
    },
    {
      model_name: "nms"
      model_version: 1
      input_map: {
        key: "candidate_boxes"
        value: "FILTER_BBOXES"
      }
      output_map: {
        key: "BBOXES"
        value: "ENSEMBLE_OUTPUT_0"
      }
    }
  ]
}
```

The failing line in the nms model.py (model.py.txt) is:

```python
out_tensor = pb_utils.Tensor.from_dlpack('BBOXES', to_dlpack(bboxes))
```
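For context, a PyTorch tensor becomes non-contiguous after view-producing operations such as transposing or column slicing, which is exactly the kind of thing an NMS postprocess does to box tensors. A minimal sketch (the tensor names here are hypothetical, not from the attached model.py):

```python
import torch

# A (300, 8) tensor of raw detections; slicing out the first 6 columns
# returns a view that shares storage but is no longer laid out in C-order.
raw = torch.rand(300, 8)
bboxes = raw[:, :6]
print(bboxes.is_contiguous())  # False: row stride is still 8, not 6
```

Exporting such a view via `to_dlpack` is what triggers the "DLPack tensor is not contiguous" error from the Python backend.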

Here is the client code:

```python
inputs.append(httpclient.InferInput('ENSEMBLE_INPUT_0', img.shape, "FP32"))
inputs[0].set_data_from_numpy(img, binary_data=False)
# Output result matrix
outputs = []
outputs.append(httpclient.InferRequestedOutput('ENSEMBLE_OUTPUT_0', binary_data=False))  # get the output vector
results = triton_client.infer('simple_yolov5_ensemble', inputs=inputs, outputs=outputs)
output_data0 = results.as_numpy('ENSEMBLE_OUTPUT_0')
cast = time.time() - start
```
zhukai242 commented 1 year ago

@dyastremsky @Tabrizian

Tabrizian commented 1 year ago

I don't know what is being done in utils.postprocess, but adding `.contiguous()` before returning the tensors would usually solve this issue.

zhukai242 commented 1 year ago

@Tabrizian Thanks. This is the utils.py where I need to add `.contiguous()`: utils.py.txt

Tabrizian commented 1 year ago

@zhukai242 Could you please try again on the 23.05 release? We had some fixes that could be related to this issue.

dyastremsky commented 1 year ago

Closing issue due to inactivity. Please let us know if you need to reopen the issue for follow-up.