NVIDIA-AI-IOT / torch2trt

An easy to use PyTorch to TensorRT converter
MIT License
The output of TensorRT is still inconsistent with PyTorch after using contiguous() #385

Closed: hyt1407 closed this issue 4 years ago

hyt1407 commented 4 years ago

Hi, when the batch size of x is greater than 1, the TensorRT model does not produce valid output.

import torch
from torchvision import models
from torch2trt import torch2trt
x = torch.ones((2, 3, 112, 112)).cuda()
model = models.resnet18(pretrained=True).eval().cuda()
model_trt = torch2trt(model, [x])

x is contiguous:

x = x.contiguous()
print(x.is_contiguous())
# True

However, the outputs of model and model_trt are still inconsistent:

y = model(x)
y_trt = model_trt(x)
print(y)
print(y_trt)
# tensor([[ 1.0878, -0.8497, -0.3348,  ..., -1.3466, -0.1705, -0.5223],
#         [ 1.0878, -0.8497, -0.3348,  ..., -1.3466, -0.1705, -0.5223]],
#        device='cuda:0', grad_fn=<AddmmBackward>)
# tensor([[1.1792e-30, 4.5643e-41, 0.0000e+00,  ..., 0.0000e+00, 0.0000e+00,
#          0.0000e+00],
#         [0.0000e+00, 0.0000e+00, 0.0000e+00,  ..., 0.0000e+00, 0.0000e+00,
#          0.0000e+00]], device='cuda:0')

Confusingly, when I set the shape of x to (1, 3, 112, 112), the outputs of the two models differ only slightly:

x = torch.randn(1,3,112,112).cuda()
y = model(x)
y_trt = model_trt(x)
print(torch.max(torch.abs(y-y_trt))) 
# tensor(8.2254e-06, device='cuda:0', grad_fn=<MaxBackward1>)

Environment: TensorRT 6.0.1, CUDA 10.0, cuDNN 7.6, PyTorch 1.4.0.

Thank you.

jaybdub commented 4 years ago

Hi hyt1407,

Thanks for reaching out!

To use larger batch sizes you need to specify the max_batch_size parameter. For example,

model_trt = torch2trt(..., max_batch_size=2)

Apologies for any confusion in the API: the batch size is not inferred from the input data shape.

Please let me know if this helps or you run into issues.

Best, John
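
For readers hitting the same problem, the advice above might be applied roughly as follows. This is only a sketch: `check_batch_size` and `convert_for_batch` are hypothetical helper names, not part of torch2trt, and the only library call assumed is `torch2trt(model, [x], max_batch_size=...)` as shown in the reply. The idea is to pass the batch dimension of the example input explicitly at conversion time, and to fail loudly before inference if a later input is larger than the engine was built for.

```python
def check_batch_size(input_shape, max_batch_size):
    """Raise ValueError if the leading (batch) dimension exceeds max_batch_size.

    Hypothetical helper, not a torch2trt function.
    """
    batch = input_shape[0]
    if batch > max_batch_size:
        raise ValueError(
            f"batch size {batch} exceeds max_batch_size={max_batch_size}; "
            "rebuild the engine with a larger max_batch_size"
        )
    return batch


def convert_for_batch(model, example_input):
    """Convert `model`, passing the example input's batch size explicitly.

    Hypothetical helper; it only wraps the call shown in the reply above.
    """
    # Imported lazily so check_batch_size stays usable without torch2trt or a GPU.
    from torch2trt import torch2trt

    # The batch size is not inferred from the input shape, so set it by hand.
    max_batch_size = example_input.shape[0]
    model_trt = torch2trt(model, [example_input], max_batch_size=max_batch_size)
    return model_trt, max_batch_size


# Usage with the setup from the original post (requires CUDA and torch2trt):
#   x = torch.ones((2, 3, 112, 112)).cuda()
#   model_trt, limit = convert_for_batch(model, x)
#   check_batch_size(x.shape, limit)
#   y_trt = model_trt(x)
```

The explicit check is a design choice for this sketch: in the report above, an oversized batch produced near-zero garbage output rather than an error, so a guard before inference makes the mismatch visible.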

hyt1407 commented 4 years ago

@jaybdub It works! Thank you!