Smorodov / Multitarget-tracker

Multiple Object Tracker, Based on Hungarian algorithm + Kalman filter.
Apache License 2.0
2.2k stars · 652 forks

Errors occurred when I use the yolov7.onnx #411

Closed sunsunnyshine closed 1 year ago

sunsunnyshine commented 1 year ago

```
File does not exist : ../../data/yolov7.onnx-kFLOAT-batch1.engine
[03/25/2023-10:48:58] [I] [TRT] [MemUsageChange] Init CUDA: CPU +98, GPU +0, now: CPU 8650, GPU 991 (MiB)
[03/25/2023-10:49:02] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +139, GPU +22, now: CPU 9264, GPU 1013 (MiB)
[03/25/2023-10:49:02] [I] Parsing ONNX file: ../../data/yolov7.onnx
[03/25/2023-10:49:02] [I] [TRT] ----------------------------------------------------------------
[03/25/2023-10:49:02] [I] [TRT] Input filename: ../../data/yolov7.onnx
[03/25/2023-10:49:02] [I] [TRT] ONNX IR version: 0.0.7
[03/25/2023-10:49:02] [I] [TRT] Opset version: 12
[03/25/2023-10:49:02] [I] [TRT] Producer name: pytorch
[03/25/2023-10:49:02] [I] [TRT] Producer version: 2.0.0
[03/25/2023-10:49:02] [I] [TRT] Domain:
[03/25/2023-10:49:02] [I] [TRT] Model version: 0
[03/25/2023-10:49:02] [I] [TRT] Doc string:
[03/25/2023-10:49:02] [I] [TRT] ----------------------------------------------------------------
[03/25/2023-10:49:03] [W] [TRT] onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[03/25/2023-10:49:03] [W] [TRT] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
[03/25/2023-10:49:03] [W] [TRT] Tensor DataType is determined at build time for tensors not marked as input or output.
workspaceSize = 8589672448, dlaManagedSRAMSize = 0, dlaLocalDRAMSize = 1073741824, dlaGlobalDRAMSize = 536870912
[03/25/2023-10:49:03] [I] Building TensorRT engine: ../../data/yolov7.onnx-kFLOAT-batch1.engine
[03/25/2023-10:49:03] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +10, GPU +12, now: CPU 9090, GPU 1025 (MiB)
[03/25/2023-10:49:03] [I] [TRT] [MemUsageChange] Init cuDNN: CPU -1, GPU +8, now: CPU 9089, GPU 1033 (MiB)
[03/25/2023-10:49:03] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[03/25/2023-10:50:10] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size will enable more tactics, please check verbose output for requested sizes.
[03/25/2023-10:51:49] [I] [TRT] [GraphReduction] The approximate region cut reduction algorithm is called.
[03/25/2023-10:53:10] [I] [TRT] Total Activation Memory: 5057725440
[03/25/2023-10:53:10] [I] [TRT] Detected 1 inputs and 1 output network tensors.
[03/25/2023-10:53:10] [W] [TRT] Profile kMAX values are not self-consistent. Assertion profile != nullptr failed. need profile
[03/25/2023-10:53:10] [E] [TRT] 4: [memoryComputation.cpp::nvinfer1::builder::computeEngineAuxMemorySizes::203] Error Code 4: Internal Error (Profile kOPT values are not self-consistent. Assertion profile != nullptr failed. need profile)
[03/25/2023-10:53:11] [E] [TRT] 2: [builder.cpp::nvinfer1::builder::Builder::buildSerializedNetwork::751] Error Code 2: Internal Error (Assertion engine != nullptr failed.)
```

sunsunnyshine commented 1 year ago

When I used the provided yolov6s.onnx, it worked normally. Is there some problem in the process of converting yolov7.pt to yolov7.onnx? But I just used the officially provided export.py code. I'm confused and would appreciate any advice.

sunsunnyshine commented 1 year ago

When using yolov6:

```
File does not exist : ../../data/yolov6s.onnx-kFLOAT-batch1.engine
[03/25/2023-11:05:06] [I] [TRT] [MemUsageChange] Init CUDA: CPU +84, GPU +0, now: CPU 7965, GPU 991 (MiB)
[03/25/2023-11:05:08] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +121, GPU +22, now: CPU 8835, GPU 1013 (MiB)
[03/25/2023-11:05:08] [I] Parsing ONNX file: ../../data/yolov6s.onnx
[03/25/2023-11:05:08] [I] [TRT] ----------------------------------------------------------------
[03/25/2023-11:05:08] [I] [TRT] Input filename: ../../data/yolov6s.onnx
[03/25/2023-11:05:08] [I] [TRT] ONNX IR version: 0.0.6
[03/25/2023-11:05:08] [I] [TRT] Opset version: 12
[03/25/2023-11:05:08] [I] [TRT] Producer name: pytorch
[03/25/2023-11:05:08] [I] [TRT] Producer version: 1.8
[03/25/2023-11:05:08] [I] [TRT] Domain:
[03/25/2023-11:05:08] [I] [TRT] Model version: 0
[03/25/2023-11:05:08] [I] [TRT] Doc string:
[03/25/2023-11:05:08] [I] [TRT] ----------------------------------------------------------------
[03/25/2023-11:05:08] [W] [TRT] onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[03/25/2023-11:05:08] [W] [TRT] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
[03/25/2023-11:05:08] [W] [TRT] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
[03/25/2023-11:05:08] [W] [TRT] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
workspaceSize = 8589672448, dlaManagedSRAMSize = 0, dlaLocalDRAMSize = 1073741824, dlaGlobalDRAMSize = 536870912
[03/25/2023-11:05:08] [I] Building TensorRT engine: ../../data/yolov6s.onnx-kFLOAT-batch1.engine
[03/25/2023-11:05:09] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +15, GPU +12, now: CPU 8574, GPU 1025 (MiB)
[03/25/2023-11:05:09] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +4, GPU +8, now: CPU 8578, GPU 1033 (MiB)
[03/25/2023-11:05:09] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[03/25/2023-11:05:28] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size will enable more tactics, please check verbose output for requested sizes.
[03/25/2023-11:06:42] [I] [TRT] Total Activation Memory: 4413468672
[03/25/2023-11:06:42] [I] [TRT] Detected 1 inputs and 4 output network tensors.
[03/25/2023-11:06:43] [I] [TRT] Total Host Persistent Memory: 86416
[03/25/2023-11:06:43] [I] [TRT] Total Device Persistent Memory: 983552
[03/25/2023-11:06:43] [I] [TRT] Total Scratch Memory: 2048000
[03/25/2023-11:06:43] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 30 MiB, GPU 2209 MiB
[03/25/2023-11:06:43] [I] [TRT] [BlockAssignment] Started assigning block shifts. This will take 103 steps to complete.
[03/25/2023-11:06:43] [I] [TRT] [BlockAssignment] Algorithm ShiftNTopDown took 6.618ms to assign 9 blocks to 103 nodes requiring 24377856 bytes.
[03/25/2023-11:06:43] [I] [TRT] Total Activation Memory: 24377856
[03/25/2023-11:06:43] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +6, GPU +102, now: CPU 6, GPU 102 (MiB)
[03/25/2023-11:06:43] [I] [TRT] Loaded engine size: 102 MiB
[03/25/2023-11:06:43] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +102, now: CPU 0, GPU 102 (MiB)
[03/25/2023-11:06:43] [I] TRT Engine file saved to: ../../data/yolov6s.onnx-kFLOAT-batch1.engine
4
Bindings: 2
0: name: image_arrays, size: 1x3x640x640
1: name: outputs, size: 1x8400x85
hasImplicitBatchDimension: 0, mBatchSize = 0
```

sunsunnyshine commented 1 year ago

I've figured it out. The reason is that a dimension becomes unknown when I keep NMS in the model before converting to ONNX. I chose to delete the NMS from the model, convert to ONNX, and then add NMS back with onnx_graphsurgeon. The specific command is as follows:

```
python export.py --weights yolov7.pt --grid --simplify --topk-all 100 --iou-thres 0.65 --conf-thres 0.35 --img-size 640 640 --max-wh 640 --include-nms
```

But I'm still confused about why NMS didn't work inside the model.
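For reference, the suppression that the exported NMS node performs can be sketched in plain Python. This is a minimal, dependency-free illustration of greedy NMS using the thresholds from the command above (IoU 0.65, confidence 0.35, top-k 100); it is not the TensorRT plugin's actual code, and the box data and helper names are hypothetical.

```python
def iou(a, b):
    # Intersection-over-union of two [x1, y1, x2, y2] boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_thres=0.65, conf_thres=0.35, max_out=100):
    # Greedy NMS: visit boxes in descending score order, keep a box only
    # if it does not overlap an already-kept box above iou_thres.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if scores[i] < conf_thres:
            continue
        if all(iou(boxes[i], boxes[j]) < iou_thres for j in keep):
            keep.append(i)
        if len(keep) == max_out:
            break
    return keep
```

The fixed `max_out` cap is presumably what lets the builder keep static output shapes instead of the unknown dimension that broke the engine build.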


Nuzhny007 commented 1 year ago

Hi! I'm making the export with the same command:

```
python export.py --weights yolov7.pt --grid --end2end --simplify --topk-all 100 --iou-thres 0.65 --conf-thres 0.35 --img-size 640 640 --include-nms
```

And a part of the output:

```
Note: Producer node(s) of first tensor:
[EfficientNMS_TRT_358 (EfficientNMS_TRT)
    Inputs: [
        Variable (TRT::EfficientNMS_TRT_602): (shape=[1, 25200, 4], dtype=float32)
        Variable (TRT::EfficientNMS_TRT_613): (shape=[1, 25200, 13], dtype=float32)
    ]
    Outputs: [
        Variable (num_dets): (shape=None, dtype=int32)
        Variable (det_boxes): (shape=None, dtype=float32)
        Variable (det_scores): (shape=None, dtype=float32)
        Variable (det_classes): (shape=None, dtype=int32)
    ]
Attributes: OrderedDict([('background_class', [-1]), ('box_coding', [1]), ('iou_threshold', 0.6499999761581421), ('max_output_boxes', 100), ('plugin_version', '1'), ('score_activation', 0), ('score_threshold', 0.3499999940395355)])
Domain: TRT]
```

And the resulting model has these output tensors:

```
Bindings: 5
0: name: images, size: 1x3x640x640
1: name: num_dets, size: 1x1
2: name: det_boxes, size: 1x100x4
3: name: det_scores, size: 1x100
4: name: det_classes, size: 1x100
```

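As a sketch of how fixed-shape EfficientNMS_TRT outputs like these are typically consumed on the host: only the first `num_dets` entries of the 100-slot buffers are valid, the rest is padding. Plain Python lists stand in for buffers copied back from the GPU; the helper name is hypothetical and not part of this repository.

```python
def unpack_detections(num_dets, det_boxes, det_scores, det_classes):
    # num_dets is the 1x1 binding holding the count of valid detections;
    # the remaining slots of the 100-entry buffers are padding to ignore.
    n = num_dets[0]
    return [
        {"box": det_boxes[i], "score": det_scores[i], "cls": det_classes[i]}
        for i in range(n)
    ]
```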
sunsunnyshine commented 1 year ago

Thanks! I think it should be related to the third-party library versions or the GPU version. Another schoolmate tried and got the same error, just using the official code provided by yolov7.

sunsunnyshine commented 1 year ago

Oh! I remember that there were some errors when exporting:

```
CoreML export failure: Core ML only supports tensors with rank <= 5. Layer "model.105.anchor_grid", with type "const", outputs a rank 6 tensor.
```

Nuzhny007 commented 1 year ago

I'm also using the official yolov7 repository. The CoreML part isn't used in the ONNX export. Do you have the onnx_graphsurgeon package installed? It is used in the export: https://github.com/WongKinYiu/yolov7/blob/main/utils/add_nms.py#L3

sunsunnyshine commented 1 year ago

Hi! I checked the pip list in my conda env. onnx-graphsurgeon is already installed:

```
Package Version
------- -------
absl-py 1.4.0
asttokens 2.2.1
backcall 0.2.0
cachetools 5.3.0
certifi 2022.12.7
charset-normalizer 3.1.0
cmake 3.26.0
coloredlogs 15.0.1
contourpy 1.0.7
coremltools 6.2
cycler 0.11.0
decorator 5.1.1
executing 1.2.0
filelock 3.10.0
flatbuffers 23.3.3
fonttools 4.39.2
google-auth 2.16.2
google-auth-oauthlib 0.4.6
grpcio 1.51.3
humanfriendly 10.0
idna 3.4
ipython 8.11.0
jedi 0.18.2
Jinja2 3.1.2
kiwisolver 1.4.4
lit 15.0.7
Markdown 3.4.1
markdown-it-py 2.2.0
MarkupSafe 2.1.2
matplotlib 3.7.1
matplotlib-inline 0.1.6
mdurl 0.1.2
mpmath 1.3.0
networkx 3.0
numpy 1.23.5
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
nvidia-cufft-cu11 10.9.0.58
nvidia-curand-cu11 10.2.10.91
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusparse-cu11 11.7.4.91
nvidia-nccl-cu11 2.14.3
nvidia-nvtx-cu11 11.7.91
nvidia-pyindex 1.0.9
oauthlib 3.2.2
onnx 1.13.1
onnx-graphsurgeon 0.3.26
onnx-simplifier 0.4.17
onnxruntime 1.14.1
opencv-python 4.7.0.72
packaging 23.0
pandas 1.5.3
parso 0.8.3
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.4.0
pip 23.0.1
prompt-toolkit 3.0.38
protobuf 3.20.3
psutil 5.9.4
ptyprocess 0.7.0
pure-eval 0.2.2
pyasn1 0.4.8
pyasn1-modules 0.2.8
Pygments 2.14.0
pyparsing 3.0.9
python-dateutil 2.8.2
pytz 2022.7.1
PyYAML 6.0
requests 2.28.2
requests-oauthlib 1.3.1
rich 13.3.2
rsa 4.9
scipy 1.10.1
seaborn 0.12.2
setuptools 65.6.3
six 1.16.0
stack-data 0.6.2
sympy 1.11.1
tensorboard 2.12.0
tensorboard-data-server 0.7.0
tensorboard-plugin-wit 1.8.1
thop 0.1.1.post2209072238
torch 2.0.0
torchvision 0.15.1
tqdm 4.65.0
traitlets 5.9.0
triton 2.0.0
typing_extensions 4.5.0
urllib3 1.26.15
wcwidth 0.2.6
Werkzeug 2.2.3
wheel 0.38.4
```

sunsunnyshine commented 1 year ago

Thanks for the help! Since I have already solved this problem another way, you don't have to spend too much time thinking about this mysterious problem. Hahaha

sunsunnyshine commented 1 year ago

I've figured it out. In the yolov7 export code, when I use --max-wh, the model takes the ONNX_ORT path (an ONNX module with the ONNX-Runtime NMS operation). I should remove --max-wh from the command line!