open-mmlab / mmdeploy

OpenMMLab Model Deployment Framework
https://mmdeploy.readthedocs.io/en/latest/
Apache License 2.0

[Bug] docker install: assert engine is not None, 'Failed to create TensorRT engine' #1590

Closed 1 year ago

chyao7 commented 1 year ago

Describe the bug

Writing Calibration Cache for calibrator: TRT-8204-EntropyCalibration2
[12/29/2022-10:43:19] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer* 124) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[12/29/2022-10:43:19] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer* 125) [Softmax]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[12/29/2022-10:43:21] [TRT] [W] TensorRT was linked against cuBLAS/cuBLASLt 11.6.5 but loaded cuBLAS/cuBLASLt 11.5.1
[12/29/2022-10:43:21] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 1939, GPU 991 (MiB)
[12/29/2022-10:43:21] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 1939, GPU 999 (MiB)
[12/29/2022-10:43:21] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
[12/29/2022-10:43:24] [TRT] [E] 1: Unexpected exception None
1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
Process Process-4:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/opt/conda/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/root/workspace/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "/root/workspace/mmdeploy/mmdeploy/apis/utils/utils.py", line 95, in to_backend
    return backend_mgr.to_backend(
  File "/root/workspace/mmdeploy/mmdeploy/backend/tensorrt/backend_manager.py", line 129, in to_backend
    onnx2tensorrt(
  File "/root/workspace/mmdeploy/mmdeploy/backend/tensorrt/onnx2tensorrt.py", line 79, in onnx2tensorrt
    from_onnx(
  File "/root/workspace/mmdeploy/mmdeploy/backend/tensorrt/utils.py", line 234, in from_onnx
    assert engine is not None, 'Failed to create TensorRT engine'
AssertionError: Failed to create TensorRT engine
2022-12-29 10:43:26,369 - mmdeploy - ERROR - mmdeploy.apis.utils.utils.to_backend with Call id: 2
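Note that the AssertionError at the bottom of the traceback is only a symptom: `from_onnx` simply checks the build result after the TensorRT builder returns, and the builder returned None because of the earlier `[TRT] [E] 1: Unexpected exception` raised during int8 calibration. A minimal sketch of that guard pattern, with `finalize_engine` as a hypothetical stand-in for the check in mmdeploy's `from_onnx` (no TensorRT required):

```python
def finalize_engine(engine):
    """Mimic the guard in mmdeploy/backend/tensorrt/utils.py: the builder's
    return value is checked, and the conversion aborts when the builder
    handed back None instead of an engine."""
    assert engine is not None, 'Failed to create TensorRT engine'
    return engine


# The TensorRT builder returns None on any internal error (here, the
# failing int8 calibration pass), which is what fires the assertion.
try:
    finalize_engine(None)
except AssertionError as exc:
    print(exc)  # Failed to create TensorRT engine
```

So to debug this, the useful signal is the TensorRT error logged just before the traceback, not the assertion message itself.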

Reproduction

Converting ResNet-50 to a TensorRT engine with int8 quantization via the deploy script.
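For reference, an int8 conversion like this is typically launched as below. The issue does not say which deploy config, model config, or checkpoint were used, so every path here is a placeholder to be adapted:

```shell
# Hypothetical invocation -- all paths are placeholders, not the ones
# from this report. The first positional argument is the int8 TensorRT
# deploy config, followed by the model config, checkpoint, and a sample
# image used for calibration.
python tools/deploy.py \
    configs/mmcls/classification_tensorrt-int8_static-224x224.py \
    /path/to/resnet50_config.py \
    /path/to/resnet50_checkpoint.pth \
    /path/to/calibration_image.jpg \
    --work-dir work_dir/resnet50_int8 \
    --device cuda:0
```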

Environment

2022-12-29 10:46:25,063 - mmdeploy - INFO - **********Environmental information**********
2022-12-29 10:46:25,392 - mmdeploy - INFO - sys.platform: linux
2022-12-29 10:46:25,393 - mmdeploy - INFO - Python: 3.8.15 (default, Nov 24 2022, 15:19:38) [GCC 11.2.0]
2022-12-29 10:46:25,393 - mmdeploy - INFO - CUDA available: True
2022-12-29 10:46:25,393 - mmdeploy - INFO - GPU 0,1,2,3,4,5,6,7: Tesla T4
2022-12-29 10:46:25,393 - mmdeploy - INFO - CUDA_HOME: /usr/local/cuda
2022-12-29 10:46:25,393 - mmdeploy - INFO - NVCC: Cuda compilation tools, release 11.6, V11.6.124
2022-12-29 10:46:25,393 - mmdeploy - INFO - GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
2022-12-29 10:46:25,393 - mmdeploy - INFO - PyTorch: 1.10.0
2022-12-29 10:46:25,393 - mmdeploy - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX512
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.2
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

2022-12-29 10:46:25,393 - mmdeploy - INFO - TorchVision: 0.11.0
2022-12-29 10:46:25,393 - mmdeploy - INFO - OpenCV: 4.6.0
2022-12-29 10:46:25,394 - mmdeploy - INFO - MMCV: 1.5.3
2022-12-29 10:46:25,394 - mmdeploy - INFO - MMCV Compiler: GCC 7.3
2022-12-29 10:46:25,394 - mmdeploy - INFO - MMCV CUDA Compiler: 11.3
2022-12-29 10:46:25,394 - mmdeploy - INFO - MMDeploy: 0.11.0+85b7b96
2022-12-29 10:46:25,394 - mmdeploy - INFO - 

2022-12-29 10:46:25,394 - mmdeploy - INFO - **********Backend information**********
2022-12-29 10:46:25,469 - mmdeploy - INFO - tensorrt:   8.2.4.2
2022-12-29 10:46:25,470 - mmdeploy - INFO - tensorrt custom ops:        Available
2022-12-29 10:46:25,501 - mmdeploy - INFO - ONNXRuntime:        None
2022-12-29 10:46:25,501 - mmdeploy - INFO - ONNXRuntime-gpu:    1.8.1
2022-12-29 10:46:25,501 - mmdeploy - INFO - ONNXRuntime custom ops:     Available
2022-12-29 10:46:25,502 - mmdeploy - INFO - pplnn:      None
2022-12-29 10:46:25,503 - mmdeploy - INFO - ncnn:       None
2022-12-29 10:46:25,505 - mmdeploy - INFO - snpe:       None
2022-12-29 10:46:25,505 - mmdeploy - INFO - openvino:   None
2022-12-29 10:46:25,506 - mmdeploy - INFO - torchscript:        1.10.0
2022-12-29 10:46:25,507 - mmdeploy - INFO - torchscript custom ops:     NotAvailable
2022-12-29 10:46:25,544 - mmdeploy - INFO - rknn-toolkit:       None
2022-12-29 10:46:25,545 - mmdeploy - INFO - rknn2-toolkit:      None
2022-12-29 10:46:25,546 - mmdeploy - INFO - ascend:     None
2022-12-29 10:46:25,546 - mmdeploy - INFO - coreml:     None
2022-12-29 10:46:25,547 - mmdeploy - INFO - tvm:        None
2022-12-29 10:46:25,547 - mmdeploy - INFO - 

2022-12-29 10:46:25,547 - mmdeploy - INFO - **********Codebase information**********
2022-12-29 10:46:25,549 - mmdeploy - INFO - mmdet:      None
2022-12-29 10:46:25,549 - mmdeploy - INFO - mmseg:      None
2022-12-29 10:46:25,549 - mmdeploy - INFO - mmcls:      0.25.0
2022-12-29 10:46:25,549 - mmdeploy - INFO - mmocr:      None
2022-12-29 10:46:25,549 - mmdeploy - INFO - mmedit:     None
2022-12-29 10:46:25,549 - mmdeploy - INFO - mmdet3d:    None
2022-12-29 10:46:25,549 - mmdeploy - INFO - mmpose:     None
2022-12-29 10:46:25,549 - mmdeploy - INFO - mmrotate:   None
2022-12-29 10:46:25,549 - mmdeploy - INFO - mmaction:   None

Error traceback

No response

RunningLeon commented 1 year ago

@chyao7 Hi, could you post the model config and checkpoint you are using?

github-actions[bot] commented 1 year ago

This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 5 days if the stale label is not removed or if there is no further response.

github-actions[bot] commented 1 year ago

This issue is closed because it has been stale for 5 days. Please open a new issue if you have similar issues or you have any new updates now.