After training a Q/DQ-inserted fake-quantized ResNet, we called `model.half()` and then `torch.jit.trace()` to generate this TorchScript model. However, it failed to compile.
## Expected behavior
The model compiles without error, and the output of the compiled model matches the output of the original model.
## Environment
> Build information about Torch-TensorRT can be found by turning on debug messages
- Torch-TensorRT Version (e.g. 1.0.0): 1.3.0a0+975f6387
- PyTorch Version (e.g. 1.0): 1.13.0.dev20220921+cu116
- CPU Architecture:
- OS (e.g., Linux):
- How you installed PyTorch (`conda`, `pip`, `libtorch`, source):
- Build command you used (if compiling from source):
- Are you using local sources or building from archives:
- Python version: 3.9
- CUDA version: 11.6
- GPU models and configuration:
- Any other relevant information:
## Additional context
## To Reproduce
Steps to reproduce the behavior:

    import torch
    import torch_tensorrt

    # resnet50_model is the traced FP16 TorchScript model described above.
    trt_model_int8 = torch_tensorrt.ts.compile(
        resnet50_model,
        inputs = [
            torch_tensorrt.Input(
                min_shape = tuple([1, 3, 224, 224]),
                opt_shape = tuple([16, 3, 224, 224]),
                max_shape = tuple([32, 3, 224, 224]),
                dtype=torch.float16
            )
        ],
        enabled_precisions = torch.int8,
        workspace_size = 1 << 32,
    )

    # Save the compiled module and load it back.
    trt_model_int8.save("trt_64.pt")
    trt_model_int8 = torch.jit.load("trt_64.pt")

    # Run both models on the same input batch and compare the outputs.
    trt_output = trt_model_int8(imgs)
    resnet50_outputs = resnet50_model(imgs)
    print(trt_output)
    print(resnet50_outputs)
    print(trt_output[1].dtype)
    diff = abs(trt_output - resnet50_outputs)
    print('diff:', diff.mean())
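For the "result matches" check, a mean absolute difference alone can hide a few large outliers. Below is a minimal sketch of a fuller comparison, written in plain Python so it runs without a GPU. The helper name `summarize_diff` and the default tolerances `rtol` and `atol` are hypothetical choices for illustration, not part of Torch-TensorRT, and would need tuning for FP16/INT8 outputs.

```python
from typing import Dict, Sequence, Union


def summarize_diff(
    actual: Sequence[float],
    expected: Sequence[float],
    rtol: float = 1e-2,  # assumed relative tolerance, not a library default
    atol: float = 1e-2,  # assumed absolute tolerance, not a library default
) -> Dict[str, Union[float, bool]]:
    """Compare two flat sequences of floats element by element."""
    if len(actual) != len(expected):
        raise ValueError("actual and expected must have the same length")
    if len(actual) == 0:
        raise ValueError("cannot compare empty sequences")

    abs_diffs = [abs(a - e) for a, e in zip(actual, expected)]

    # Element-wise closeness rule: |a - e| <= atol + rtol * |e|
    all_close = all(
        d <= atol + rtol * abs(e) for d, e in zip(abs_diffs, expected)
    )

    return {
        "mean_abs_diff": sum(abs_diffs) / len(abs_diffs),
        "max_abs_diff": max(abs_diffs),
        "all_close": all_close,
    }
```

Assuming `trt_output` and `resnet50_outputs` are single tensors of the same shape, they could be passed in as `summarize_diff(trt_output.float().cpu().flatten().tolist(), resnet50_outputs.float().cpu().flatten().tolist())`.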