open-mmlab / mmsegmentation

OpenMMLab Semantic Segmentation Toolbox and Benchmark.
https://mmsegmentation.readthedocs.io/en/main/
Apache License 2.0

Triton Server Inference Grid Sampler Error #1372

Open sarperkilic opened 2 years ago

sarperkilic commented 2 years ago

Hello,

I have trained a PointRend model with mmsegmentation.

I can run inference with this model directly through JIT (TorchScript). However, when I send an inference request to Triton Inference Server, I get an error.

The model file extension is .pt.

The model deploys successfully, but when I send an inference request to it I get the following error:

tritonclient.utils.InferenceServerException: PyTorch execute failure: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/mmseg/models/segmentors/cascade_encoder_decoder.py", line 93, in forward
    grid = torch.unsqueeze(point_coords, 2)
    grid0 = torch.sub(torch.mul(grid, CONSTANTS.c1), CONSTANTS.c0, alpha=1)
    output1 = torch.grid_sampler(_18, grid0, 0, 0, False)
              ~~~~~~~~~~~~~~~~~~ <--- HERE
    fine_grained_point_feats = torch.squeeze(output1, 3)
    grid1 = torch.unsqueeze(point_coords, 2)

pytorch2torchscript.py(122): pytorch2libtorch
pytorch2torchscript.py(187): <module>
RuntimeError: grid_sampler(): expected input and grid to be on same device, but input is on cuda:0 and grid is on cpu

I am also sharing my client script below:

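import cv2
import numpy as np
import tritonclient.http as httpclient
from tritonclient.utils import triton_to_np_dtype

# `transforms` is the preprocessing pipeline used by this script; its definition is not shown here.
triton_client = httpclient.InferenceServerClient(url='localhost:8000', verbose=False)
model_name = "segmentation_model"
img = './roi.png'

# Read the image (OpenCV returns BGR data in HWC layout) and reorder the axes.
img = cv2.imread(img, cv2.IMREAD_COLOR)
chw_image_data = np.transpose(img, (2, 0, 1))
img_tensor = transforms(chw_image_data)
img_tensor.cuda()  # the result is not assigned, so img_tensor itself stays on the CPU
chw_image_data = img_tensor.numpy()
chw_image_data = np.transpose(chw_image_data, (1, 2, 0))
chw_image_data = chw_image_data[np.newaxis, ...]  # add the batch dimension
npdtype = triton_to_np_dtype("FP32")
typed = chw_image_data.astype(npdtype)

# Describe the request input and attach the NumPy data.
inputs = [
    httpclient.InferInput("input__0", typed.shape, "FP32")
]
inputs[0].set_data_from_numpy(typed)

outputs = [
    httpclient.InferRequestedOutput("output__0")
]

response = triton_client.infer(model_name,
                               inputs,
                               request_id=str(1),
                               outputs=outputs)

result = response.get_response()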

What should I change to fix this error?

Thanks

RunningLeon commented 2 years ago

@sarperkilic Hi, you could try putting all input tensors of grid_sample on the same device when converting the PyTorch model to TorchScript.
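As a rough illustration of that suggestion, the sketch below moves the point coordinates onto the feature map's device before calling grid_sample. It follows the unsqueeze, rescale, grid_sampler, squeeze sequence visible in the traceback. The function name `point_sample_same_device` and the tensor shapes are assumptions made for this example. It is not the actual mmsegmentation or mmcv code. Since traced models may record the device of tensors created during tracing, it may also help to run the conversion with the model and example input already on the GPU that Triton will use.

```python
import torch
import torch.nn.functional as F


def point_sample_same_device(feats, point_coords, align_corners=False):
    """Sample `feats` at normalized point coordinates.

    Hypothetical helper for illustration only.
    feats:        (N, C, H, W) feature map, on any device.
    point_coords: (N, P, 2) coordinates in [0, 1], possibly on another device.
    Returns a (N, C, P) tensor on the same device as `feats`.
    """
    # Move the coordinates to the feature map's device and dtype so that
    # grid_sample never sees a cuda input together with a cpu grid.
    point_coords = point_coords.to(device=feats.device, dtype=feats.dtype)

    # Rescale from [0, 1] to the [-1, 1] range grid_sample expects,
    # and add a dummy width axis: (N, P, 2) -> (N, P, 1, 2).
    grid = 2.0 * point_coords.unsqueeze(2) - 1.0

    out = F.grid_sample(
        feats,
        grid,
        mode="bilinear",
        padding_mode="zeros",
        align_corners=align_corners,
    )  # (N, C, P, 1)
    return out.squeeze(3)


if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    feats = torch.randn(1, 3, 8, 8, device=device)
    coords = torch.rand(1, 5, 2)  # deliberately created on the CPU
    sampled = point_sample_same_device(feats, coords)
    print(sampled.shape, sampled.device)
```

Using `feats.device` rather than a hard-coded `"cuda:0"` keeps the same code usable on CPU-only machines and on whichever GPU the server assigns to the model instance.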