Linaom1214 / TensorRT-For-YOLO-Series

tensorrt for yolo series (YOLOv10,YOLOv9,YOLOv8,YOLOv7,YOLOv6,YOLOX,YOLOv5), nms plugin support

ONNX -> TRT #77

Closed by Jimaras08 1 year ago

Jimaras08 commented 1 year ago

Hi,

  1. Exporting .pt to .onnx works:

    python export.py --weights yolov7-tiny.pt --grid --simplify
    Import onnx_graphsurgeon failure: No module named 'onnx_graphsurgeon'
    Namespace(batch_size=1, conf_thres=0.25, device='cpu', dynamic=False, dynamic_batch=False, end2end=False, fp16=False, grid=True, img_size=[640, 640], include_nms=False, int8=False, iou_thres=0.45, max_wh=None, simplify=True, topk_all=100, weights='yolov7-tiny.pt')
    YOLOR 🚀 2022-12-9 torch 1.12.0+cu102 CPU
    
    Fusing layers... 
    Model Summary: 200 layers, 6219709 parameters, 6219709 gradients
    /anaconda/envs/azureml_py38/lib/python3.8/site-packages/torch/functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  ../aten/src/ATen/native/TensorShape.cpp:2895.)
    return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
    
    Starting TorchScript export with torch 1.12.0+cu102...
    /mnt/batch/tasks/shared/LS_root/mounts/clusters/inferencing/code/Users/Jimaras08/yolov7/models/yolo.py:52: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if self.grid[i].shape[2:4] != x[i].shape[2:4]:
    TorchScript export success, saved as yolov7-tiny.torchscript.pt
    Using TensorFlow backend.
    /anaconda/envs/azureml_py38/lib/python3.8/site-packages/caffe2/__init__.py:5: UserWarning: Caffe2 support is not fully enabled in this PyTorch build. Please enable Caffe2 by building PyTorch from source with `BUILD_CAFFE2=1` flag.
    warnings.warn("Caffe2 support is not fully enabled in this PyTorch build. "
    CoreML export failure: module 'coremltools' has no attribute '__version__'
    
    Starting TorchScript-Lite export with torch 1.12.0+cu102...
    /mnt/batch/tasks/shared/LS_root/mounts/clusters/inferencing/code/Users/Jimaras08/yolov7/models/yolo.py:52: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if self.grid[i].shape[2:4] != x[i].shape[2:4]:
    TorchScript-Lite export success, saved as yolov7-tiny.torchscript.ptl
    
    Starting ONNX export with onnx 1.12.0...
    /mnt/batch/tasks/shared/LS_root/mounts/clusters/inferencing/code/Users/Jimaras08/yolov7/models/yolo.py:582: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if augment:
    /mnt/batch/tasks/shared/LS_root/mounts/clusters/inferencing/code/Users/Jimaras08/yolov7/models/yolo.py:614: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if profile:
    /mnt/batch/tasks/shared/LS_root/mounts/clusters/inferencing/code/Users/Jimaras08/yolov7/models/yolo.py:629: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if profile:
    
    Starting to simplify ONNX...
    ONNX export success, saved as yolov7-tiny.onnx
    
    Export complete (10.24s). Visualize with https://github.com/lutzroeder/netron.
  2. Exporting .onnx to .trt doesn't:

    python ../tensorrt-python/export.py -o yolov7-tiny.onnx -e yolov7-tiny.trt
    Namespace(calib_batch_size=8, calib_cache='./calibration.cache', calib_input=None, calib_num_images=5000, conf_thres=0.4, end2end=False, engine='yolov7-tiny.trt', iou_thres=0.5, max_det=100, onnx='yolov7-tiny.onnx', precision='fp16', verbose=False, workspace=1)
    [01/02/2023-17:20:07] [TRT] [I] [MemUsageChange] Init CUDA: CPU +147, GPU +0, now: CPU 166, GPU 112 (MiB)
    [01/02/2023-17:20:08] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +1, GPU +0, now: CPU 186, GPU 112 (MiB)
    [01/02/2023-17:20:08] [TRT] [W] onnx2trt_utils.cpp:369: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
    Network Description
    Input 'images' with shape (1, 3, 640, 640) and dtype DataType.FLOAT
    Output 'output' with shape (1, 25200, 85) and dtype DataType.FLOAT
    Building fp16 Engine in /mnt/batch/tasks/shared/LS_root/mounts/clusters/inferencing/code/Users/Jimaras08/yolov7/yolov7-tiny.trt
    FP16 is not supported natively on this platform/device
    [01/02/2023-17:20:09] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +164, GPU +63, now: CPU 377, GPU 175 (MiB)
    [01/02/2023-17:20:09] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +87, GPU +33, now: CPU 464, GPU 208 (MiB)
    [01/02/2023-17:20:09] [TRT] [W] TensorRT was linked against cuDNN 8.4.1 but loaded cuDNN 8.4.0
    [01/02/2023-17:20:09] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
    [01/02/2023-17:20:19] [TRT] [W] GPU error during getBestTactic: Conv_5 : an illegal memory access was encountered
    [01/02/2023-17:20:19] [TRT] [E] 1: [virtualMemoryBuffer.cpp::~StdVirtualMemoryBufferImpl::104] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
    [01/02/2023-17:20:19] [TRT] [E] 10: [optimizer.cpp::computeCosts::3626] Error Code 10: Internal Error (Could not find any implementation for node Conv_5.)
    [01/02/2023-17:20:19] [TRT] [E] 2: [builder.cpp::buildSerializedNetwork::636] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
    Traceback (most recent call last):
      File "../tensorrt-python/export.py", line 290, in <module>
        main(args)
      File "../tensorrt-python/export.py", line 251, in main
        builder.create_engine(args.engine, args.precision, args.calib_input, args.calib_cache, args.calib_num_images,
      File "../tensorrt-python/export.py", line 244, in create_engine
        with self.builder.build_serialized_network(self.network, self.config) as engine, open(engine_path, "wb") as f:
    AttributeError: __enter__

    I'm working in Azure ML on a Standard_NC24 instance (4 GPUs, where each GPU is one half of a K80 card).

torch=1.12.0
torchvision=0.13.0
nvidia-pyindex=1.0.9
nvidia-tensorrt=8.4.3.1

Any help would be greatly appreciated.

Thank you!

Linaom1214 commented 1 year ago

@Jimaras08

Look at this line in the error report:

[01/02/2023-17:20:19] [TRT] [E] 10: [optimizer.cpp::computeCosts::3626] Error Code 10: Internal Error (Could not find any implementation for node Conv_5.)

You may need to update your PyTorch version. I also suggest using the latest code in my repo.
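
A note on the last line of the traceback: `AttributeError: __enter__` is a follow-on symptom rather than the root cause. When the build fails, `build_serialized_network` returns `None`, and using `None` in a `with` statement raises this error on Python 3.8. Below is a minimal sketch of a guard that surfaces the build failure directly. `write_engine` is a hypothetical helper for illustration, not a function from this repo:

```python
def write_engine(builder, network, config, engine_path):
    """Build a serialized engine and write it to disk.

    `builder`, `network`, and `config` are expected to be the TensorRT
    builder, network definition, and builder config already set up by the
    export script. Raises RuntimeError if the build returns None.
    """
    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        # The real cause is in the [TRT] [E] log lines printed before this point.
        raise RuntimeError(
            "TensorRT engine build failed; check the [TRT] [E] messages above"
        )
    with open(engine_path, "wb") as f:
        f.write(serialized)
    return engine_path
```

With a guard like this, the run above would stop with the `RuntimeError` right after the `[TRT] [E]` lines instead of the unrelated-looking `AttributeError`.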

Jimaras08 commented 1 year ago

Probably an issue with the K80; see https://github.com/NVIDIA/TensorRT/issues/1816 or https://github.com/NVIDIA/TensorRT/issues/2039. Switching from a K80 to a P40 worked like a charm. FYI: building the engine in one environment and running inference in a different one is unlikely to work!
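
Since a serialized engine is generally tied to the TensorRT version and GPU it was built with, one way to catch an environment mismatch early is to store a small fingerprint next to the engine and compare it before loading. This is a rough sketch, assuming `tensorrt` and a CUDA-enabled `torch` are importable as in the setup above. `current_env`, `env_mismatches`, and the JSON sidecar file are hypothetical illustrations, not part of this repo:

```python
import json


def current_env():
    """Collect the facts an engine build depends on.

    Assumes tensorrt and a CUDA build of torch are installed; device 0 is used.
    """
    import tensorrt as trt
    import torch

    return {
        "tensorrt": trt.__version__,
        "gpu": torch.cuda.get_device_name(0),
        # Stored as a list so it survives a JSON round trip unchanged.
        "compute_capability": list(torch.cuda.get_device_capability(0)),
    }


def env_mismatches(built, current):
    """Return the sorted keys whose values differ between build and load time."""
    keys = set(built) | set(current)
    return sorted(k for k in keys if built.get(k) != current.get(k))


def save_fingerprint(engine_path, env):
    """Write the fingerprint to a JSON sidecar next to the engine file."""
    with open(engine_path + ".env.json", "w") as f:
        json.dump(env, f)


def load_fingerprint(engine_path):
    """Read back the fingerprint saved at build time."""
    with open(engine_path + ".env.json") as f:
        return json.load(f)


# At build time:  save_fingerprint("yolov7-tiny.trt", current_env())
# At load time:   diff = env_mismatches(load_fingerprint("yolov7-tiny.trt"), current_env())
#                 A non-empty diff suggests rebuilding the engine on this machine.
```

Comparing plain dictionaries keeps the check independent of TensorRT itself, so it can run before any engine is deserialized.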