WongKinYiu / yolov7

Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
GNU General Public License v3.0

ONNX -> TRT #1355

Closed. Jimaras08 closed this issue 1 year ago.

Jimaras08 commented 1 year ago

Hi,

  1. Exporting .pt to .onnx works:

    python export.py --weights yolov7-tiny.pt --grid --simplify
    Import onnx_graphsurgeon failure: No module named 'onnx_graphsurgeon'
    Namespace(batch_size=1, conf_thres=0.25, device='cpu', dynamic=False, dynamic_batch=False, end2end=False, fp16=False, grid=True, img_size=[640, 640], include_nms=False, int8=False, iou_thres=0.45, max_wh=None, simplify=True, topk_all=100, weights='yolov7-tiny.pt')
    YOLOR 🚀 2022-12-9 torch 1.12.0+cu102 CPU
    
    Fusing layers... 
    Model Summary: 200 layers, 6219709 parameters, 6219709 gradients
    /anaconda/envs/azureml_py38/lib/python3.8/site-packages/torch/functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  ../aten/src/ATen/native/TensorShape.cpp:2895.)
    return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
    
    Starting TorchScript export with torch 1.12.0+cu102...
    /mnt/batch/tasks/shared/LS_root/mounts/clusters/inferencing/code/Users/Jimaras08/yolov7/models/yolo.py:52: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if self.grid[i].shape[2:4] != x[i].shape[2:4]:
    TorchScript export success, saved as yolov7-tiny.torchscript.pt
    Using TensorFlow backend.
    /anaconda/envs/azureml_py38/lib/python3.8/site-packages/caffe2/__init__.py:5: UserWarning: Caffe2 support is not fully enabled in this PyTorch build. Please enable Caffe2 by building PyTorch from source with `BUILD_CAFFE2=1` flag.
    warnings.warn("Caffe2 support is not fully enabled in this PyTorch build. "
    CoreML export failure: module 'coremltools' has no attribute '__version__'
    
    Starting TorchScript-Lite export with torch 1.12.0+cu102...
    /mnt/batch/tasks/shared/LS_root/mounts/clusters/inferencing/code/Users/Jimaras08/yolov7/models/yolo.py:52: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if self.grid[i].shape[2:4] != x[i].shape[2:4]:
    TorchScript-Lite export success, saved as yolov7-tiny.torchscript.ptl
    
    Starting ONNX export with onnx 1.12.0...
    /mnt/batch/tasks/shared/LS_root/mounts/clusters/inferencing/code/Users/Jimaras08/yolov7/models/yolo.py:582: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if augment:
    /mnt/batch/tasks/shared/LS_root/mounts/clusters/inferencing/code/Users/Jimaras08/yolov7/models/yolo.py:614: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if profile:
    /mnt/batch/tasks/shared/LS_root/mounts/clusters/inferencing/code/Users/Jimaras08/yolov7/models/yolo.py:629: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if profile:
    
    Starting to simplify ONNX...
    ONNX export success, saved as yolov7-tiny.onnx
    
    Export complete (10.24s). Visualize with https://github.com/lutzroeder/netron.
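
    As a sanity check before the TensorRT step, the exported ONNX file can be run directly with onnxruntime (a minimal sketch, assuming onnxruntime is installed; the input name 'images' and the output shape match the network description in step 2 below):

    import numpy as np
    import onnxruntime as ort

    # Load the exported model on CPU and push a dummy frame through it.
    sess = ort.InferenceSession("yolov7-tiny.onnx", providers=["CPUExecutionProvider"])
    dummy = np.zeros((1, 3, 640, 640), dtype=np.float32)
    (out,) = sess.run(None, {"images": dummy})
    print(out.shape)  # expect (1, 25200, 85) for this --grid export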
  2. Exporting .onnx to .trt doesn't work:

    python ../tensorrt-python/export.py -o yolov7-tiny.onnx -e yolov7-tiny.trt
    Namespace(calib_batch_size=8, calib_cache='./calibration.cache', calib_input=None, calib_num_images=5000, conf_thres=0.4, end2end=False, engine='yolov7-tiny.trt', iou_thres=0.5, max_det=100, onnx='yolov7-tiny.onnx', precision='fp16', verbose=False, workspace=1)
    [01/02/2023-17:20:07] [TRT] [I] [MemUsageChange] Init CUDA: CPU +147, GPU +0, now: CPU 166, GPU 112 (MiB)
    [01/02/2023-17:20:08] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +1, GPU +0, now: CPU 186, GPU 112 (MiB)
    [01/02/2023-17:20:08] [TRT] [W] onnx2trt_utils.cpp:369: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
    Network Description
    Input 'images' with shape (1, 3, 640, 640) and dtype DataType.FLOAT
    Output 'output' with shape (1, 25200, 85) and dtype DataType.FLOAT
    Building fp16 Engine in /mnt/batch/tasks/shared/LS_root/mounts/clusters/inferencing/code/Users/Jimaras08/yolov7/yolov7-tiny.trt
    FP16 is not supported natively on this platform/device
    [01/02/2023-17:20:09] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +164, GPU +63, now: CPU 377, GPU 175 (MiB)
    [01/02/2023-17:20:09] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +87, GPU +33, now: CPU 464, GPU 208 (MiB)
    [01/02/2023-17:20:09] [TRT] [W] TensorRT was linked against cuDNN 8.4.1 but loaded cuDNN 8.4.0
    [01/02/2023-17:20:09] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
    [01/02/2023-17:20:19] [TRT] [W] GPU error during getBestTactic: Conv_5 : an illegal memory access was encountered
    [01/02/2023-17:20:19] [TRT] [E] 1: [virtualMemoryBuffer.cpp::~StdVirtualMemoryBufferImpl::104] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
    [01/02/2023-17:20:19] [TRT] [E] 10: [optimizer.cpp::computeCosts::3626] Error Code 10: Internal Error (Could not find any implementation for node Conv_5.)
    [01/02/2023-17:20:19] [TRT] [E] 2: [builder.cpp::buildSerializedNetwork::636] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
    Traceback (most recent call last):
    File "../tensorrt-python/export.py", line 290, in <module>
      main(args)
    File "../tensorrt-python/export.py", line 251, in main
      builder.create_engine(args.engine, args.precision, args.calib_input, args.calib_cache, args.calib_num_images,
    File "../tensorrt-python/export.py", line 244, in create_engine
      with self.builder.build_serialized_network(self.network, self.config) as engine, open(engine_path, "wb") as f:
    AttributeError: __enter__

    I'm working in Azure ML on a Standard_NC24 (4 GPUs), where each GPU is one half of a K80 card.
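
    Incidentally, the final AttributeError: __enter__ is a secondary symptom: build_serialized_network() returns None when the build fails, and the script then tries to use None as a context manager. A defensive sketch of that step (hypothetical, not the actual tensorrt-python code) would surface the real build error instead:

    # build_serialized_network() returns None on failure, which is what
    # raises AttributeError: __enter__ at the original line 244.
    plan = self.builder.build_serialized_network(self.network, self.config)
    if plan is None:
        raise RuntimeError("Engine build failed; see the [TRT] [E] messages above")
    with open(engine_path, "wb") as f:
        f.write(plan)  # IHostMemory supports the buffer protocol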

torch=1.12.0
torchvision=0.13.0
nvidia-pyindex=1.0.9
nvidia-tensorrt=8.4.3.1

Any help would be greatly appreciated.

Thank you!

Jimaras08 commented 1 year ago

Probably an issue with the K80, see https://github.com/NVIDIA/TensorRT/issues/1816 or https://github.com/NVIDIA/TensorRT/issues/2039. Switching from the K80 to a P40 worked like a charm. FYI: exporting an engine in one environment and running inference in another is unlikely to work!
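
For context: the K80 is a Kepler GPU (compute capability 3.7), which the linked issues report as problematic with recent TensorRT releases, while the P40 is Pascal (6.1). A quick way to check which device an engine is actually being built on (a small sketch using torch, which is already in this environment):

    import torch

    # K80 reports (3, 7) (Kepler); P40 reports (6, 1) (Pascal).
    major, minor = torch.cuda.get_device_capability(0)
    print(torch.cuda.get_device_name(0), f"compute capability {major}.{minor}")

This also explains the portability caveat: a TensorRT engine is specialized for the exact GPU and TensorRT version it was built with, so an engine built in one environment generally cannot be deserialized in another.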

JunboLi-CN commented 1 year ago

I had a similar issue and solved it with an older TensorRT version. You can try nvidia-tensorrt==8.4.1.5; that may fix it.
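
If you try that, it is worth confirming which TensorRT build the Python environment actually resolves to after pinning (a minimal check):

    import tensorrt as trt

    # Should print 8.4.1.5 once the pinned wheel is the one being imported.
    print(trt.__version__)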

BarryGUN commented 4 months ago

Your GPU may not have enough free memory to support the conversion.
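
If that is the case, the workspace=1 (GiB) in the Namespace output above is the first thing to raise; assuming the script exposes it as a --workspace flag, something like:

    python ../tensorrt-python/export.py -o yolov7-tiny.onnx -e yolov7-tiny.trt --workspace 4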