NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

trt10.5 pytorch-quantization has compile bug #4197

Open lix19937 opened 1 month ago

lix19937 commented 1 month ago

Description

pytorch-quantization in TensorRT 10.5 has a compile bug: it redefines two ATen dispatch macros.

https://github.com/NVIDIA/TensorRT/blob/release/10.5/tools/pytorch-quantization/src/tensor_quant_gpu.cu#L28-L37 defines two macros, AT_DISPATCH_CASE_FLOATING_TYPES and AT_DISPATCH_FLOATING_TYPES:

#define AT_DISPATCH_CASE_FLOATING_TYPES(...)   \
  AT_DISPATCH_CASE(at::ScalarType::Double, __VA_ARGS__)  \
  AT_DISPATCH_CASE(at::ScalarType::Float, __VA_ARGS__)  \
  AT_DISPATCH_CASE(at::ScalarType::Half, __VA_ARGS__)   \
  AT_DISPATCH_CASE(at::ScalarType::BFloat16, __VA_ARGS__)

#define AT_DISPATCH_FLOATING_TYPES(TYPE, NAME, ...) \
  AT_DISPATCH_SWITCH(                                        \
      TYPE, NAME, AT_DISPATCH_CASE_FLOATING_TYPES(__VA_ARGS__))

However, at https://github.com/NVIDIA/TensorRT/blob/release/10.5/tools/pytorch-quantization/src/tensor_quant_gpu.cu#L18, #include <ATen/ATen.h> pulls in #include <ATen/Dispatch.h>, which already defines both macros.
I checked torch 1.13 and torch 2.4.1; the same conflict occurs with both.

So the two macros are defined twice. @moraxu The fix is to add

#undef AT_DISPATCH_CASE_FLOATING_TYPES
#undef AT_DISPATCH_FLOATING_TYPES

before the #define lines in tensor_quant_gpu.cu. (Note that #undef takes only the macro name, with no parameter list.)

Environment

TensorRT Version: 10.5

NVIDIA GPU: RTX 2000

NVIDIA Driver Version:

CUDA Version: 11.8

CUDNN Version: 9.1

Operating System:

Python Version (if applicable): 3.8

PyTorch Version (if applicable): 1.13 or 2.4.1

yuanyao-nv commented 1 month ago

pytorch-quantization development has been discontinued in favor of Model Optimizer since TRT 10.2. Please try that if possible. cc @nzmora-nvidia for opinion on the raised issue.