hiennguyen9874 / deepstream-face-recognition

Deepstream face detection & recognition

Following the documentation to reproduce the results failed #8

Open · scottzhang opened this issue 7 months ago

scottzhang commented 7 months ago

Hello. I am trying to reproduce this project. At the moment I am stuck at "Export checkpoint to onnx model":

1. Clone yolov7-face-detection and cd into the yolov7-face-detection folder.
2. Download the weight and save it as weights/yolov7-tiny33.pt.
3. Export to ONNX:

   ```bash
   python3 export.py --weights ./weights/yolov7-tiny33.pt --img-size 640 --batch-size 1 \
       --dynamic-batch --grid --end2end --max-wh 640 --topk-all 100 --iou-thres 0.5 \
       --conf-thres 0.2 --device 1 --simplify --cleanup --trt
   ```

   Or download the onnx file from github.com/hiennguyen9874/yolov7-face-detection/releases/tag/v0.1.

4. Export to TensorRT:

   ```bash
   /usr/src/tensorrt/bin/trtexec --onnx=samples/models/Primary_Detector/yolov7-tiny41-nms-trt.onnx \
       --saveEngine=samples/engines/Primary_Detector/yolov7-tiny41-nms-trt.trt \
       --workspace=14336 --fp16 --minShapes=images:1x3x640x640 --optShapes=images:1x3x640x640 \
       --maxShapes=images:4x3x640x640 --shapes=images:1x3x640x640
   ```
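A quick sanity check on the exported/downloaded ONNX file before building the engine, assuming the onnx Python package is available (the file path matches step 4 above):

```bash
# Print the graph's input and output tensor names; the trtexec shape flags
# above assume a single dynamic-batch input named "images".
python3 - <<'EOF'
import onnx

model = onnx.load("samples/models/Primary_Detector/yolov7-tiny41-nms-trt.onnx")
print("inputs: ", [i.name for i in model.graph.input])
print("outputs:", [o.name for o in model.graph.output])
EOF
```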

If I use:

```bash
python3 export.py --weights ./weights/yolov7-tiny33.pt --img-size 640 --batch-size 1 \
    --dynamic-batch --grid --end2end --max-wh 640 --topk-all 100 --iou-thres 0.5 \
    --conf-thres 0.2 --device 0 --simplify --cleanup --trt
```

then I get:

```
Traceback (most recent call last):
  File "/root/yolov7-face-detection/export.py", line 171, in <module>
    torch.onnx.export(
  File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/onnx/utils.py", line 516, in export
    _export(
  File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/onnx/utils.py", line 1613, in _export
    graph, params_dict, torch_out = _model_to_graph(
  File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/onnx/utils.py", line 1135, in _model_to_graph
    graph, params, torch_out, module = _create_jit_graph(model, args)
  File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/onnx/utils.py", line 1011, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
  File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/onnx/utils.py", line 915, in _trace_and_get_graph_from_model
    trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
  File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/jit/_trace.py", line 1296, in _get_trace_graph
    outs = ONNXTracedModule(
  File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/jit/_trace.py", line 138, in forward
    graph, out = torch._C._create_graph_by_tracing(
  File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/jit/_trace.py", line 129, in wrapper
    outs.append(self.inner(*trace_inputs))
  File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/root/yolov7-face-detection/models/experimental.py", line 315, in forward
    x = self.end2end(x)
  File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/root/yolov7-face-detection/models/experimental.py", line 256, in forward
    lmks_mask = x[:, :, [8, 11, 14, 17, 20]]
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
```
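The traceback itself suggests a follow-up step: re-running with CUDA_LAUNCH_BLOCKING=1 makes kernel launches synchronous, so the device-side assert is reported at the kernel that actually fails. A sketch of that re-run (same export arguments as above):

```bash
# Synchronous CUDA launches: the device-side assert then surfaces at the
# failing kernel instead of a later, unrelated API call.
CUDA_LAUNCH_BLOCKING=1 python3 export.py --weights ./weights/yolov7-tiny33.pt \
    --img-size 640 --batch-size 1 --dynamic-batch --grid --end2end --max-wh 640 \
    --topk-all 100 --iou-thres 0.5 --conf-thres 0.2 --device 0 --simplify --cleanup --trt
```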

If I use:

```bash
/usr/src/tensorrt/bin/trtexec --onnx=samples/models/Primary_Detector/yolov7-tiny41-nms-trt.onnx \
    --saveEngine=samples/engines/Primary_Detector/yolov7-tiny41-nms-trt.trt \
    --workspace=14336 --fp16 --minShapes=images:1x3x640x640 --optShapes=images:1x3x640x640 \
    --maxShapes=images:4x3x640x640 --shapes=images:1x3x640x640
```

then I get:

```
[02/06/2024-03:43:34] [W] [TRT] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[02/06/2024-03:43:34] [I] [TRT] No importer registered for op: EfficientNMSLandmark_TRT. Attempting to import as plugin.
[02/06/2024-03:43:34] [I] [TRT] Searching for plugin: EfficientNMSLandmark_TRT, plugin_version: 1, plugin_namespace:
[02/06/2024-03:43:34] [E] [TRT] 3: getPluginCreator could not find plugin: EfficientNMSLandmark_TRT version: 1
[02/06/2024-03:43:34] [E] [TRT] ModelImporter.cpp:771: While parsing node number 195 [EfficientNMSLandmark_TRT -> "num_dets"]:
[02/06/2024-03:43:34] [E] [TRT] ModelImporter.cpp:772: --- Begin node ---
[02/06/2024-03:43:34] [E] [TRT] ModelImporter.cpp:773: input: "441" input: "457" input: "456" output: "num_dets" output: "det_boxes" output: "det_scores" output: "det_classes" output: "det_lmks" name: "EfficientNMSLandmark_TRT_294" op_type: "EfficientNMSLandmark_TRT" attribute { name: "background_class" ints: -1 type: INTS } attribute { name: "box_coding" ints: 1 type: INTS } attribute { name: "iou_threshold" f: 0.5 type: FLOAT } attribute { name: "max_output_boxes" i: 100 type: INT } attribute { name: "plugin_version" s: "1" type: STRING } attribute { name: "score_activation" i: 0 type: INT } attribute { name: "score_threshold" f: 0.2 type: FLOAT }
[02/06/2024-03:43:34] [E] [TRT] ModelImporter.cpp:774: --- End node ---
[02/06/2024-03:43:34] [E] [TRT] ModelImporter.cpp:777: ERROR: builtin_op_importers.cpp:5404 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
[02/06/2024-03:43:34] [E] Failed to parse onnx file
[02/06/2024-03:43:34] [I] Finished parsing network model. Parse time: 0.143645
[02/06/2024-03:43:34] [E] Parsing model failed
[02/06/2024-03:43:34] [E] Failed to create engine from model or file.
[02/06/2024-03:43:34] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8601] # /usr/src/tensorrt/bin/trtexec --onnx=samples/models/Primary_Detector/yolov7-tiny41-nms-trt.onnx --saveEngine=samples/engines/Primary_Detector/yolov7-tiny41-nms-trt.trt --workspace=14336 --fp16 --minShapes=images:1x3x640x640 --optShapes=images:1x3x640x640 --maxShapes=images:4x3x640x640 --shapes=images:1x3x640x640
```

What is going wrong?

hiennguyen9874 commented 4 months ago

Use the Dockerfile to run the project, or install the custom TensorRT plugin: github.com/hiennguyen9874/deepstream-face-recognition/blob/8afad7e13740e8425c69a9b867f1cad06857457e/Dockerfile#L67
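If you build the engine outside the Dockerfile, the ONNX parser can only resolve the EfficientNMSLandmark_TRT node once the custom plugin library is loaded. A minimal sketch, assuming the plugin has already been built into a shared library (the path below is only a placeholder, not the actual install location; if your trtexec build lacks --plugins, preloading the same library with LD_PRELOAD works as well):

```bash
# Placeholder path: wherever the custom plugin build installs the .so that
# registers the EfficientNMSLandmark_TRT plugin creator.
PLUGIN_LIB=/path/to/libnvinfer_plugin_custom.so

# Load the plugin library explicitly so the ONNX parser can find the
# EfficientNMSLandmark_TRT creator while building the engine.
/usr/src/tensorrt/bin/trtexec --plugins=${PLUGIN_LIB} \
    --onnx=samples/models/Primary_Detector/yolov7-tiny41-nms-trt.onnx \
    --saveEngine=samples/engines/Primary_Detector/yolov7-tiny41-nms-trt.trt \
    --workspace=14336 --fp16 --minShapes=images:1x3x640x640 --optShapes=images:1x3x640x640 \
    --maxShapes=images:4x3x640x640 --shapes=images:1x3x640x640
```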

thalapandi commented 4 weeks ago


I am getting the same error (trtexec cannot find the EfficientNMSLandmark_TRT plugin). How can I resolve this on a Jetson AGX Orin (arm64 architecture)? Can you help me resolve the problem?