PaddlePaddle / PaddleYOLO

🚀🚀🚀 A PaddlePaddle implementation of the YOLO series: PP-YOLOE+, RT-DETR, YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOv10, YOLOX, YOLOv5u, YOLOv7u, YOLOv6Lite, RTMDet, and more. 🚀🚀🚀
https://github.com/PaddlePaddle/PaddleYOLO
GNU General Public License v3.0

TensorRT export failed #213

Closed: karthikbalu closed this issue 2 months ago

karthikbalu commented 8 months ago

Bug Component

Deploy

Describe the Bug

I just followed the official instructions: clone the code and run the commands below. It fails at the last step.

```bash
model_name=ppyoloe # yolov7
job_name=ppyoloe_plus_crn_s_80e_coco # yolov7_tiny_300e_coco

config=configs/${model_name}/${job_name}.yml
log_dir=log_dir/${job_name}

weights=https://bj.bcebos.com/v1/paddledet/models/${job_name}.pdparams
# or, to use locally trained weights instead:
weights=output/${job_name}/model_final.pdparams
```

4. export

```bash
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c ${config} -o weights=${weights} # trt=True
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c ${config} -o weights=${weights} exclude_post_process=True # trt=True
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c ${config} -o weights=${weights} exclude_nms=True # trt=True
```
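As a sanity check, the exported directory can be inspected before the later steps consume it (the expected file list below is an assumption based on PaddleDetection's default export layout):

```bash
ls output_inference/${job_name}/
# expected (assumption): infer_cfg.yml  model.pdmodel  model.pdiparams  model.pdiparams.info
```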

5. deploy infer

```bash
CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/${job_name} --image_file=demo/000000014439_640x640.jpg --device=GPU
```

6. deploy speed, add '--run_mode=trt_fp16' to test in TensorRT FP16 mode

```bash
CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/${job_name} --image_file=demo/000000014439_640x640.jpg --device=GPU --run_benchmark=True # --run_mode=trt_fp16
```
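Note that --run_mode=trt_fp16 goes through Paddle Inference's TensorRT backend, which requires a PaddlePaddle build compiled with TensorRT. A quick smoke test of the install (a sketch; run_check verifies the GPU build, not TensorRT support specifically):

```bash
python -c "import paddle; paddle.utils.run_check()"
```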

7. export onnx

```bash
paddle2onnx --model_dir output_inference/${job_name} --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 12 --save_file ${job_name}.onnx
```
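Optionally, the exported graph can be sanity-checked before handing it to TensorRT (a sketch assuming the onnx Python package is installed; this is not part of the official instructions):

```bash
python -c "import onnx; m = onnx.load('${job_name}.onnx'); onnx.checker.check_model(m); print([i.name for i in m.graph.input])"
```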

8. onnx speed

```bash
/usr/local/TensorRT-8.0.3.4/bin/trtexec --onnx=${job_name}.onnx --workspace=4096 --avgRuns=10 --shapes=input:1x3x640x640 --fp16
/usr/local/TensorRT-8.0.3.4/bin/trtexec --onnx=${job_name}.onnx --workspace=4096 --avgRuns=10 --shapes=input:1x3x640x640 --fp32
```

ERROR:

```
[01/26/2024-14:10:42] [W] [TRT] onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[01/26/2024-14:10:42] [E] [TRT] ModelImporter.cpp:720: While parsing node number 871 [TopK -> "p2o.TopK.1"]:
[01/26/2024-14:10:42] [E] [TRT] ModelImporter.cpp:721: --- Begin node ---
[01/26/2024-14:10:42] [E] [TRT] ModelImporter.cpp:722: input: "p2o.Gather.5" input: "p2o.ReduceMin.1" output: "p2o.TopK.1" output: "p2o.TopK.2" name: "p2o.TopK.0" op_type: "TopK"
[01/26/2024-14:10:42] [E] [TRT] ModelImporter.cpp:723: --- End node ---
[01/26/2024-14:10:42] [E] [TRT] ModelImporter.cpp:726: ERROR: builtin_op_importers.cpp:4292 In function importTopK:
[8] Assertion failed: (inputs.at(1).is_weights()) && "This version of TensorRT only supports input K as an initializer."
[01/26/2024-14:10:42] [E] Failed to parse onnx file
[01/26/2024-14:10:42] [I] Finish parsing network model
[01/26/2024-14:10:42] [E] Parsing model failed
[01/26/2024-14:10:42] [E] Engine creation failed
[01/26/2024-14:10:42] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8003] # /TRT/TensorRT-8.0.3.4/bin/trtexec --onnx=ppyoloe_plus_crn_s_80e_coco.onnx --workspace=4096 --avgRuns=10 --shapes=input:1x3x640x640 --fp16
```
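Per the assertion, the parse failure is the TopK node: this TensorRT version only accepts K as a constant initializer, while the exported post-processing computes it at runtime. Two possible workarounds, neither from the official docs: re-export with exclude_post_process=True (step 4) so the NMS/TopK ops are dropped, or try constant-folding the graph with onnx-simplifier before trtexec (whether it can fold this particular K is an assumption):

```bash
# onnx-simplifier is a third-party tool, not part of the PaddleYOLO workflow
pip install onnxsim
python -m onnxsim ${job_name}.onnx ${job_name}_sim.onnx
/usr/local/TensorRT-8.0.3.4/bin/trtexec --onnx=${job_name}_sim.onnx --workspace=4096 --avgRuns=10 --shapes=input:1x3x640x640 --fp16
```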

Regarding FP16: this TensorRT 8.0.3.4 is also very old (at least CUDA 12 should be supported by now), and I tested with the latest TensorRT 8.6 as well and got the same error.

trtexec also reports Unknown option: --fp32.
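FP32 is trtexec's default precision, so there is no --fp32 flag; the FP32 run is just the same command without --fp16:

```bash
/usr/local/TensorRT-8.0.3.4/bin/trtexec --onnx=${job_name}.onnx --workspace=4096 --avgRuns=10 --shapes=input:1x3x640x640
```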

Please provide proper documentation for TensorRT deployment.

Thanks

Environment

- Ubuntu 20.04 (Linux)
- Python 3.8.10
- PaddlePaddle 2.5.2
- CUDA 11.3.1
- cuDNN 8.2.1
- TensorRT 8.4.1.5
- PaddleDetection: develop branch
- GCC 9.4.0

Bug description confirmation

Are you willing to submit a PR?

nemonameless commented 8 months ago

Thanks, I will take a look at this issue.

nemonameless commented 2 months ago

Please install the newest PaddlePaddle.
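For example (a sketch; choose the wheel matching your CUDA version per the official install guide):

```bash
python -m pip install --upgrade paddlepaddle-gpu
```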