WongKinYiu / yolov7

Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
GNU General Public License v3.0

ONNX -> TRT #1355

Closed. Jimaras08 closed this issue 1 year ago.

Jimaras08 commented 1 year ago

Hi,

  1. Exporting .pt to .onnx works:

    python export.py --weights yolov7-tiny.pt --grid --simplify
    Import onnx_graphsurgeon failure: No module named 'onnx_graphsurgeon'
    Namespace(batch_size=1, conf_thres=0.25, device='cpu', dynamic=False, dynamic_batch=False, end2end=False, fp16=False, grid=True, img_size=[640, 640], include_nms=False, int8=False, iou_thres=0.45, max_wh=None, simplify=True, topk_all=100, weights='yolov7-tiny.pt')
    YOLOR 🚀 2022-12-9 torch 1.12.0+cu102 CPU
    
    Fusing layers... 
    Model Summary: 200 layers, 6219709 parameters, 6219709 gradients
    /anaconda/envs/azureml_py38/lib/python3.8/site-packages/torch/functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  ../aten/src/ATen/native/TensorShape.cpp:2895.)
    return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
    
    Starting TorchScript export with torch 1.12.0+cu102...
    /mnt/batch/tasks/shared/LS_root/mounts/clusters/inferencing/code/Users/Jimaras08/yolov7/models/yolo.py:52: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if self.grid[i].shape[2:4] != x[i].shape[2:4]:
    TorchScript export success, saved as yolov7-tiny.torchscript.pt
    Using TensorFlow backend.
    /anaconda/envs/azureml_py38/lib/python3.8/site-packages/caffe2/__init__.py:5: UserWarning: Caffe2 support is not fully enabled in this PyTorch build. Please enable Caffe2 by building PyTorch from source with `BUILD_CAFFE2=1` flag.
    warnings.warn("Caffe2 support is not fully enabled in this PyTorch build. "
    CoreML export failure: module 'coremltools' has no attribute '__version__'
    
    Starting TorchScript-Lite export with torch 1.12.0+cu102...
    /mnt/batch/tasks/shared/LS_root/mounts/clusters/inferencing/code/Users/Jimaras08/yolov7/models/yolo.py:52: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if self.grid[i].shape[2:4] != x[i].shape[2:4]:
    TorchScript-Lite export success, saved as yolov7-tiny.torchscript.ptl
    
    Starting ONNX export with onnx 1.12.0...
    /mnt/batch/tasks/shared/LS_root/mounts/clusters/inferencing/code/Users/Jimaras08/yolov7/models/yolo.py:582: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if augment:
    /mnt/batch/tasks/shared/LS_root/mounts/clusters/inferencing/code/Users/Jimaras08/yolov7/models/yolo.py:614: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if profile:
    /mnt/batch/tasks/shared/LS_root/mounts/clusters/inferencing/code/Users/Jimaras08/yolov7/models/yolo.py:629: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
    if profile:
    
    Starting to simplify ONNX...
    ONNX export success, saved as yolov7-tiny.onnx
    
    Export complete (10.24s). Visualize with https://github.com/lutzroeder/netron.
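
    As a sanity check before the TensorRT step, the exported ONNX file can be run directly with onnxruntime (a minimal sketch, assuming onnxruntime is installed; the input name 'images' and the output shape match the network description in step 2 below):

    import numpy as np
    import onnxruntime as ort

    # Load the exported model on CPU and push a dummy frame through it.
    sess = ort.InferenceSession("yolov7-tiny.onnx", providers=["CPUExecutionProvider"])
    dummy = np.zeros((1, 3, 640, 640), dtype=np.float32)
    (out,) = sess.run(None, {"images": dummy})
    print(out.shape)  # expect (1, 25200, 85) for this --grid export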
  2. Exporting .onnx to .trt doesn't work:

    python ../tensorrt-python/export.py -o yolov7-tiny.onnx -e yolov7-tiny.trt
    Namespace(calib_batch_size=8, calib_cache='./calibration.cache', calib_input=None, calib_num_images=5000, conf_thres=0.4, end2end=False, engine='yolov7-tiny.trt', iou_thres=0.5, max_det=100, onnx='yolov7-tiny.onnx', precision='fp16', verbose=False, workspace=1)
    [01/02/2023-17:20:07] [TRT] [I] [MemUsageChange] Init CUDA: CPU +147, GPU +0, now: CPU 166, GPU 112 (MiB)
    [01/02/2023-17:20:08] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +1, GPU +0, now: CPU 186, GPU 112 (MiB)
    [01/02/2023-17:20:08] [TRT] [W] onnx2trt_utils.cpp:369: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
    Network Description
    Input 'images' with shape (1, 3, 640, 640) and dtype DataType.FLOAT
    Output 'output' with shape (1, 25200, 85) and dtype DataType.FLOAT
    Building fp16 Engine in /mnt/batch/tasks/shared/LS_root/mounts/clusters/inferencing/code/Users/Jimaras08/yolov7/yolov7-tiny.trt
    FP16 is not supported natively on this platform/device
    [01/02/2023-17:20:09] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +164, GPU +63, now: CPU 377, GPU 175 (MiB)
    [01/02/2023-17:20:09] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +87, GPU +33, now: CPU 464, GPU 208 (MiB)
    [01/02/2023-17:20:09] [TRT] [W] TensorRT was linked against cuDNN 8.4.1 but loaded cuDNN 8.4.0
    [01/02/2023-17:20:09] [TRT] [I] Local timing cache in use. Profiling results in this builder pass will not be stored.
    [01/02/2023-17:20:19] [TRT] [W] GPU error during getBestTactic: Conv_5 : an illegal memory access was encountered
    [01/02/2023-17:20:19] [TRT] [E] 1: [virtualMemoryBuffer.cpp::~StdVirtualMemoryBufferImpl::104] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
    [01/02/2023-17:20:19] [TRT] [E] 10: [optimizer.cpp::computeCosts::3626] Error Code 10: Internal Error (Could not find any implementation for node Conv_5.)
    [01/02/2023-17:20:19] [TRT] [E] 2: [builder.cpp::buildSerializedNetwork::636] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
    Traceback (most recent call last):
    File "../tensorrt-python/export.py", line 290, in <module>
      main(args)
    File "../tensorrt-python/export.py", line 251, in main
      builder.create_engine(args.engine, args.precision, args.calib_input, args.calib_cache, args.calib_num_images,
    File "../tensorrt-python/export.py", line 244, in create_engine
      with self.builder.build_serialized_network(self.network, self.config) as engine, open(engine_path, "wb") as f:
    AttributeError: __enter__

    I'm working in Azure ML on a Standard_NC24 (4 GPUs), where each GPU is one half of a K80 card.
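
    Incidentally, the final AttributeError: __enter__ is a secondary symptom: build_serialized_network() returns None when the build fails, and the script then tries to use None as a context manager. A defensive sketch of that step (hypothetical, not the actual tensorrt-python code) would surface the real build error instead:

    # build_serialized_network() returns None on failure, which is what
    # raises AttributeError: __enter__ at the original line 244.
    plan = self.builder.build_serialized_network(self.network, self.config)
    if plan is None:
        raise RuntimeError("Engine build failed; see the [TRT] [E] messages above")
    with open(engine_path, "wb") as f:
        f.write(plan)  # IHostMemory supports the buffer protocol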

torch=1.12.0
torchvision=0.13.0
nvidia-pyindex=1.0.9
nvidia-tensorrt=8.4.3.1

Any help would be greatly appreciated.

Thank you!

Jimaras08 commented 1 year ago

Probably an issue with the K80, see https://github.com/NVIDIA/TensorRT/issues/1816 or https://github.com/NVIDIA/TensorRT/issues/2039. Switching from the K80 to a P40 worked like a charm. FYI: exporting an engine in one environment and running inference in another is unlikely to work!
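
For context: the K80 is a Kepler GPU (compute capability 3.7), which the linked issues report as problematic with recent TensorRT releases, while the P40 is Pascal (6.1). A quick way to check which device an engine is actually being built on (a small sketch using torch, which is already in this environment):

    import torch

    # K80 reports (3, 7) (Kepler); P40 reports (6, 1) (Pascal).
    major, minor = torch.cuda.get_device_capability(0)
    print(torch.cuda.get_device_name(0), f"compute capability {major}.{minor}")

This also explains the portability caveat: a TensorRT engine is specialized for the exact GPU and TensorRT version it was built with, so an engine built in one environment generally cannot be deserialized in another.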

JunboLi-CN commented 1 year ago

I had a similar issue and solved it with an older TensorRT version. You can try nvidia-tensorrt==8.4.1.5; that may fix it.
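
If you try that, it is worth confirming which TensorRT build the Python environment actually resolves to after pinning (a minimal check):

    import tensorrt as trt

    # Should print 8.4.1.5 once the pinned wheel is the one being imported.
    print(trt.__version__)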

BarryGUN commented 4 months ago

Your GPU may not have enough free memory to support the conversion.
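
If that is the case, the workspace=1 (GiB) in the Namespace output above is the first thing to raise; assuming the script exposes it as a --workspace flag, something like:

    python ../tensorrt-python/export.py -o yolov7-tiny.onnx -e yolov7-tiny.trt --workspace 4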