PaddlePaddle / PaddleDetection

Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
Apache License 2.0

[PP-Tracking object tracking system] Person tracking cannot use TensorRT on Ubuntu (Segmentation fault (core dumped) error) #8003

Open qiulongquan opened 1 year ago

qiulongquan commented 1 year ago

Search before asking

Please ask your question

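After running mot_jde_infer.py from PP-Tracking on Ubuntu, the error below appears.

Environment:
tensorrt 8.4.0.6
cuDNN Version: 8.2
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Fri_Dec_17_18:16:03_PST_2021
Cuda compilation tools, release 11.6, V11.6.55
Build cuda_11.6.r11.6/compiler.30794723_0

TensorRT itself has been verified to run without problems, but running PP-Tracking now produces the error below. I used the entrance_count_demo.mp4 video for the tracking test. Judging from the log, it looks like a dimension conversion error. What is the problem, and how should it be solved? Thank you.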

$ python mot_jde_infer.py
/home/seikoist-qiu/anaconda3/envs/fastdeploy/lib/python3.8/site-packages/pkg_resources/__init__.py:121: DeprecationWarning: pkg_resources is deprecated as an API
warnings.warn("pkg_resources is deprecated as an API", DeprecationWarning)
/home/seikoist-qiu/anaconda3/envs/fastdeploy/lib/python3.8/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('mpl_toolkits')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)
/home/seikoist-qiu/anaconda3/envs/fastdeploy/lib/python3.8/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)
Warning: Unable to use JDE/FairMOT/ByteTrack, please install lap, for example: `pip install lap`, see https://github.com/gatagat/lap
Warning: Unable to use OC-SORT, please install filterpy, for example: `pip install filterpy`, see https://github.com/rlabbe/filterpy
Warning: Unable to use motmetrics in MTMCT in PP-Tracking, please install motmetrics, for example: `pip install motmetrics`, see https://github.com/longcw/py-motmetrics
Warning: Unable to use MTMCT in PP-Tracking, please install sklearn, for example: `pip install sklearn`
Warning: Unable to use MTMCT in PP-Tracking, please install sklearn, for example: `pip install sklearn`
----------- Running Arguments -----------
batch_size: 1
camera_id: -1
cpu_threads: 1
device: GPU
do_break_in_counting: True
do_entrance_counting: False
draw_center_traj: True
enable_mkldnn: False
image_dir: None
image_file: None
model_dir: /home/seikoist-qiu/PaddleDetection/output_inference/fairmot_hrnetv2_w18_dlafpn_30e_576x320
mtmct_cfg: None
mtmct_dir: None
output_dir: /home/seikoist-qiu/PaddleDetection/output
region_polygon: [780, 330, 1150, 330, 1150, 570, 780, 570]
region_type: custom
reid_batch_size: 50
reid_model_dir: None
run_benchmark: False
run_mode: trt_int8
save_images: False
save_mot_txt_per_img: False
save_mot_txts: False
scaled: False
secs_interval: 10
skip_frame_num: -1
threshold: 0.5
tracker_config: None
trt_calib_mode: True
trt_max_shape: 1280
trt_min_shape: 1
trt_opt_shape: 640
use_dark: True
use_gpu: True
video_file: /home/seikoist-qiu/PaddleDetection/input/entrance_count_demo.mp4

----------- Model Configuration -----------
Model Arch: FairMOT
Transform Order:
--transform op: LetterBoxResize
--transform op: NormalizeImage
--transform op: Permute

W0325 14:53:32.254097 68847 analysis_predictor.cc:1118] The one-time configuration of analysis predictor failed, which may be due to native predictor called first and its configurations taken effect.
I0325 14:53:32.412621 68847 analysis_predictor.cc:881] TensorRT subgraph engine is enabled
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_graph_clean_pass]
--- Running analysis [ir_analysis_pass]
--- Running IR pass [adaptive_pool2d_convert_global_pass]
--- Running IR pass [shuffle_channel_detect_pass]
--- Running IR pass [quant_conv2d_dequant_fuse_pass]
--- Running IR pass [delete_quant_dequant_op_pass]
--- Running IR pass [delete_quant_dequant_filter_op_pass]
--- Running IR pass [delete_weight_dequant_linear_op_pass]
--- Running IR pass [delete_quant_dequant_linear_op_pass]
--- Running IR pass [add_support_int8_pass]
I0325 14:53:46.017089 68847 fuse_pass_base.cc:57] --- detected 1263 subgraphs
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [preln_embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [skip_layernorm_fuse_pass]
--- Running IR pass [preln_skip_layernorm_fuse_pass]
--- Running IR pass [unsqueeze2_eltwise_fuse_pass]
--- Running IR pass [trt_squeeze2_matmul_fuse_pass]
--- Running IR pass [trt_reshape2_matmul_fuse_pass]
--- Running IR pass [trt_flatten2_matmul_fuse_pass]
--- Running IR pass [trt_map_matmul_v2_to_mul_pass]
--- Running IR pass [trt_map_matmul_v2_to_matmul_pass]
--- Running IR pass [trt_map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
--- Running IR pass [conv_elementwise_add_fuse_pass]
I0325 14:53:46.408970 68847 fuse_pass_base.cc:57] --- detected 8 subgraphs
--- Running IR pass [tensorrt_subgraph_pass]
I0325 14:53:47.562645 68847 tensorrt_subgraph_pass.cc:145] --- detect a sub-graph with 1127 nodes
W0325 14:53:47.755862 68847 tensorrt_subgraph_pass.cc:374] The Paddle Inference library is compiled with 8 version TensorRT, but the runtime TensorRT you are using is 8.4 version. This might cause serious compatibility issues. We strongly recommend using the same TRT version at runtime.
I0325 14:53:47.845490 68847 tensorrt_subgraph_pass.cc:145] --- detect a sub-graph with 32 nodes
I0325 14:53:47.855068 68847 tensorrt_subgraph_pass.cc:145] --- detect a sub-graph with 25 nodes
--- Running IR pass [conv_elementwise_add_act_fuse_pass]
--- Running IR pass [conv_elementwise_add2_act_fuse_pass]
--- Running IR pass [transpose_flatten_concat_fuse_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
I0325 14:53:47.921360 68847 ir_params_sync_among_devices_pass.cc:100] Sync params from CPU to GPU
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [memory_optimize_pass]
I0325 14:53:48.026532 68847 memory_optimize_pass.cc:216] Cluster name : fill_constant_3.tmp_0 size: 8
I0325 14:53:48.026553 68847 memory_optimize_pass.cc:216] Cluster name : im_shape size: 8
I0325 14:53:48.026556 68847 memory_optimize_pass.cc:216] Cluster name : tmp_5 size: 4000
I0325 14:53:48.026559 68847 memory_optimize_pass.cc:216] Cluster name : top_k_v2_1.tmp_1 size: 4000
I0325 14:53:48.026563 68847 memory_optimize_pass.cc:216] Cluster name : reshape2_5.tmp_1 size: 0
I0325 14:53:48.026566 68847 memory_optimize_pass.cc:216] Cluster name : scale_factor size: 8
I0325 14:53:48.026569 68847 memory_optimize_pass.cc:216] Cluster name : gather_0.tmp_0 size: 4000
I0325 14:53:48.026572 68847 memory_optimize_pass.cc:216] Cluster name : elementwise_div_1 size: 5898240
I0325 14:53:48.026576 68847 memory_optimize_pass.cc:216] Cluster name : transpose_2.tmp_0 size: 5898240
--- Running analysis [ir_graph_to_program_pass]
I0325 14:53:48.749737 68847 analysis_predictor.cc:1035] ======= optimize end =======
I0325 14:53:48.794737 68847 naive_executor.cc:102] --- skip [feed], feed -> scale_factor
I0325 14:53:48.794764 68847 naive_executor.cc:102] --- skip [feed], feed -> image
I0325 14:53:48.794770 68847 naive_executor.cc:102] --- skip [feed], feed -> im_shape
I0325 14:53:48.802505 68847 naive_executor.cc:102] --- skip [concat_2.tmp_0], fetch -> fetch
I0325 14:53:48.802528 68847 naive_executor.cc:102] --- skip [gather_5.tmp_0], fetch -> fetch
fps: 30, frame_count: 149
Tracking frame: 0
W0325 14:53:48.862004 68847 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 11.6, Runtime API Version: 11.2
W0325 14:53:48.864483 68847 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
I0325 14:53:48.869413 68847 tensorrt_engine_op.h:422] This process is generating calibration table for Paddle TRT int8...
I0325 14:53:48.869959 69085 tensorrt_engine_op.h:294] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
W0325 14:53:52.768520 69085 helper.h:107] TensorRT was linked against cuDNN 8.3.2 but loaded cuDNN 8.2.1
W0325 14:53:57.451269 69085 helper.h:107] TensorRT was linked against cuDNN 8.3.2 but loaded cuDNN 8.2.1
W0325 14:53:57.460517 69085 helper.h:107] TensorRT was linked against cuDNN 8.3.2 but loaded cuDNN 8.2.1
I0325 14:53:57.684039 69107 tensorrt_engine_op.h:294] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
E0325 14:53:57.740598 69107 helper.h:111] 3: batchnorm_add_scale (Output: batch_norm_318.tmp_25158):shift weights has count 18 but 1 was expected
E0325 14:53:57.740620 69107 helper.h:111] 3: batchnorm_add_scale (Output: batch_norm_318.tmp_25158):shift weights has count 18 but 1 was expected
E0325 14:53:57.797207 69107 helper.h:111] 4: [graphShapeAnalyzer.cpp::analyzeShapes::1300] Error Code 4: Miscellaneous (IShuffleLayer (Unnamed Layer 14) [Shuffle]: reshape changes volume. Reshaping [18,18,80,144] to [1,1,1].)
E0325 14:53:57.852206 69107 helper.h:111] 4: [graphShapeAnalyzer.cpp::analyzeShapes::1300] Error Code 4: Miscellaneous (IShuffleLayer (Unnamed Layer 14) [Shuffle]: reshape changes volume. Reshaping [18,18,80,144] to [1,1,1].)
E0325 14:53:57.908993 69107 helper.h:111] 4: [graphShapeAnalyzer.cpp::analyzeShapes::1300] Error Code 4: Miscellaneous (IShuffleLayer (Unnamed Layer 14) [Shuffle]: reshape changes volume. Reshaping [18,18,80,144] to [1,1,1].)
E0325 14:53:57.963779 69107 helper.h:111] 4: [graphShapeAnalyzer.cpp::analyzeShapes::1300] Error Code 4: Miscellaneous (IShuffleLayer (Unnamed Layer 14) [Shuffle]: reshape changes volume. Reshaping [18,18,80,144] to [1,1,1].)
E0325 14:53:58.018690 69107 helper.h:111] 4: [graphShapeAnalyzer.cpp::analyzeShapes::1300] Error Code 4: Miscellaneous (IShuffleLayer (Unnamed Layer 14) [Shuffle]: reshape changes volume. Reshaping [18,18,80,144] to [1,1,1].)
W0325 14:53:58.021021 69107 helper.h:107] Unused Input: relu_279.tmp_0_clone_0
E0325 14:53:58.075899 69107 helper.h:111] 4: [graphShapeAnalyzer.cpp::analyzeShapes::1300] Error Code 4: Miscellaneous (IShuffleLayer (Unnamed Layer 14) [Shuffle]: reshape changes volume. Reshaping [18,18,80,144] to [1,1,1].)
E0325 14:53:58.098047 69107 helper.h:111] 4: [network.cpp::validate::2927] Error Code 4: Internal Error (Could not compute dimensions for conv2d_648.tmp_05163, because the network is not valid.)
E0325 14:53:58.108924 69107 helper.h:111] 2: [builder.cpp::buildSerializedNetwork::619] Error Code 2: Internal Error (Assertion engine != nullptr failed. )

C++ Traceback (most recent call last):

0 std::thread::_State_impl<std::thread::_Invoker<std::tuple<paddle::operators::TensorRTEngineOp::RunCalibration(paddle::framework::Scope const&, phi::Place const&, paddle::inference::tensorrt::TensorRTEngine) const::{lambda()#1}> > >::_M_run()
1 paddle::operators::TensorRTEngineOp::PrepareTRTEngine(paddle::framework::Scope const&, paddle::inference::tensorrt::TensorRTEngine) const
2 paddle::inference::tensorrt::OpConverter::ConvertBlockToTRTEngine(paddle::framework::BlockDesc, paddle::framework::Scope const&, std::vector<std::string, std::allocator > const&, std::unordered_set<std::string, std::hash, std::equal_to, std::allocator > const&, std::vector<std::string, std::allocator > const&, paddle::inference::tensorrt::TensorRTEngine)
3 paddle::inference::tensorrt::TensorRTEngine::FreezeNetwork()

Error Message Summary:

FatalError: Segmentation fault is detected by the operating system.
[TimeInfo: Aborted at 1679756038 (unix time) try "date -d @1679756038" if you are using GNU date ]
[SignalInfo: SIGSEGV (@0x8) received by PID 68847 (TID 0x7fe475cb2700) from PID 8 ]

Segmentation fault (core dumped)
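
The TensorRT message in the log above, "reshape changes volume. Reshaping [18,18,80,144] to [1,1,1]", says that the source and target shapes hold different numbers of elements. A minimal sketch in plain Python makes the arithmetic explicit; the helper names `volume` and `can_reshape` are illustrative only and are not part of Paddle or TensorRT.

```python
from math import prod


def volume(shape):
    """Number of elements in a tensor of the given shape."""
    return prod(shape)


def can_reshape(src, dst):
    """A reshape is only valid when both shapes hold the same element count."""
    return volume(src) == volume(dst)


# Shapes copied from the error message in the log.
src, dst = [18, 18, 80, 144], [1, 1, 1]
print(volume(src), volume(dst), can_reshape(src, dst))
```

This only restates what the message reports. It does not explain why the INT8 calibration path ended up with a `[1,1,1]` target, which the log alone does not show.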

nemonameless commented 1 year ago

Please post the complete run command, thanks.

qiulongquan commented 1 year ago

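Hello, the run command is simply the following: python mot_jde_infer.py. All parameters are defined in the argsparser function in mot_utils.py; I have extracted all the parameter definitions below:

    import argparse
    import ast

    # Argument definitions extracted from argsparser() in mot_utils.py
    # (help strings omitted except where they were posted).
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--model_dir", type=str,
        default='/home/seikoist-qiu/PaddleDetection/output_inference/fairmot_hrnetv2_w18_dlafpn_30e_576x320')
    parser.add_argument("--image_file", type=str, default=None)
    parser.add_argument("--image_dir", type=str, default=None)
    parser.add_argument("--batch_size", type=int, default=1, help="batch_size for inference.")
    parser.add_argument(
        "--video_file", type=str,
        default='/home/seikoist-qiu/PaddleDetection/input/entrance_count_demo.mp4')
    parser.add_argument("--camera_id", type=int, default=-1)
    parser.add_argument("--threshold", type=float, default=0.5)
    parser.add_argument("--output_dir", type=str, default="/home/seikoist-qiu/PaddleDetection/output")
    parser.add_argument("--run_mode", type=str, default='trt_int8')
    parser.add_argument("--device", type=str, default='GPU')
    parser.add_argument("--use_gpu", type=ast.literal_eval, default=True)
    parser.add_argument("--run_benchmark", type=ast.literal_eval, default=False)
    parser.add_argument("--enable_mkldnn", type=ast.literal_eval, default=False)
    parser.add_argument("--cpu_threads", type=int, default=1)
    parser.add_argument("--trt_min_shape", type=int, default=1)
    parser.add_argument("--trt_max_shape", type=int, default=1280)
    parser.add_argument("--trt_opt_shape", type=int, default=640)
    parser.add_argument("--trt_calib_mode", type=bool, default=True)
    parser.add_argument('--save_images', action='store_true', default=False)
    parser.add_argument('--save_mot_txts', action='store_true')
    parser.add_argument('--save_mot_txt_per_img', action='store_true')
    parser.add_argument('--scaled', type=bool, default=False)
    parser.add_argument("--tracker_config", type=str, default=None)
    parser.add_argument("--reid_model_dir", type=str, default=None)
    parser.add_argument("--reid_batch_size", type=int, default=50)
    parser.add_argument('--use_dark', type=ast.literal_eval, default=True)
    parser.add_argument('--skip_frame_num', type=int, default=-1)
    parser.add_argument("--do_entrance_counting", action='store_true', default=False)
    parser.add_argument("--do_break_in_counting", action='store_true', default=True)
    parser.add_argument("--region_type", type=str, default='custom')
    parser.add_argument('--region_polygon', nargs='+', type=int,
                        default=[780, 330, 1150, 330, 1150, 570, 780, 570])
    parser.add_argument("--secs_interval", type=int, default=10)
    parser.add_argument("--draw_center_traj", action='store_true', default=True)
    parser.add_argument("--mtmct_dir", type=str, default=None)
    parser.add_argument("--mtmct_cfg", type=str, default=None)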
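
Two of the definitions above use `type=bool` (`--trt_calib_mode` and `--scaled`), and some `store_true` flags default to `True`. With the standard `argparse` module, `type=bool` turns any non-empty string, including "False", into `True`, and a `store_true` flag whose default is already `True` cannot be switched off from the command line. The sketch below shows this general pitfall with the standard library only; the `--trt_calib_mode_fixed` flag is hypothetical and exists purely for comparison, and nothing here claims this is the cause of the crash.

```python
import argparse
import ast

parser = argparse.ArgumentParser()
# As posted: bool("False") is True, so the string "False" still yields True.
parser.add_argument("--trt_calib_mode", type=bool, default=True)
# Hypothetical alternative, mirroring how --use_gpu is declared in the post.
parser.add_argument("--trt_calib_mode_fixed", type=ast.literal_eval, default=True)
# As posted: store_true with default=True is True whether or not the flag is given.
parser.add_argument("--do_break_in_counting", action="store_true", default=True)

args = parser.parse_args(
    ["--trt_calib_mode", "False", "--trt_calib_mode_fixed", "False"]
)
print(args.trt_calib_mode, args.trt_calib_mode_fixed, args.do_break_in_counting)
```

If a flag declared this way needs to be turned off, editing the default in the source (as done in this thread) or switching the declaration to `ast.literal_eval` are two possible workarounds.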

qiulongquan commented 1 year ago

Could someone please help answer this question? Is there no one who can help us? Please.

nemonameless commented 1 year ago

Use trt_fp16 for run_mode, or the default paddle.

qiulongquan commented 1 year ago

I changed run_mode to trt_fp16 as suggested, but I still get the error below. I also set use_calib_mode=False, but the result is the same. The error message is: ValueError: (InvalidArgument) The input [cast_1.tmp_0] shape of trt subgraph is [500].it's not supported by trt so far. Does this mean TensorRT is not supported here?


$ python mot_jde_infer.py 
/home/seikoist-qiu/anaconda3/envs/fastdeploy/lib/python3.8/site-packages/pkg_resources/__init__.py:121: DeprecationWarning: pkg_resources is deprecated as an API
warnings.warn("pkg_resources is deprecated as an API", DeprecationWarning)
/home/seikoist-qiu/anaconda3/envs/fastdeploy/lib/python3.8/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('mpl_toolkits')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)
/home/seikoist-qiu/anaconda3/envs/fastdeploy/lib/python3.8/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)
Warning: Unable to use JDE/FairMOT/ByteTrack, please install lap, for example: `pip install lap`, see https://github.com/gatagat/lap
Warning: Unable to use OC-SORT, please install filterpy, for example: `pip install filterpy`, see https://github.com/rlabbe/filterpy
Warning: Unable to use motmetrics in MTMCT in PP-Tracking, please install motmetrics, for example: `pip install motmetrics`, see https://github.com/longcw/py-motmetrics
Warning: Unable to use MTMCT in PP-Tracking, please install sklearn, for example: `pip install sklearn`
Warning: Unable to use MTMCT in PP-Tracking, please install sklearn, for example: `pip install sklearn`
-----------  Running Arguments -----------
batch_size: 1
camera_id: -1
cpu_threads: 1
device: GPU
do_break_in_counting: True
do_entrance_counting: False
draw_center_traj: True
enable_mkldnn: False
image_dir: None
image_file: None
model_dir: /home/seikoist-qiu/PaddleDetection/output_inference/fairmot_hrnetv2_w18_dlafpn_30e_576x320
mtmct_cfg: None
mtmct_dir: None
output_dir: /home/seikoist-qiu/PaddleDetection/output
region_polygon: [780, 330, 1150, 330, 1150, 570, 780, 570]
region_type: custom
reid_batch_size: 50
reid_model_dir: None
run_benchmark: False
run_mode: trt_fp16
save_images: False
save_mot_txt_per_img: False
save_mot_txts: False
scaled: False
secs_interval: 10
skip_frame_num: -1
threshold: 0.5
tracker_config: None
trt_calib_mode: True
trt_max_shape: 1280
trt_min_shape: 1
trt_opt_shape: 640
use_dark: True
use_gpu: True
video_file: /home/seikoist-qiu/PaddleDetection/input/entrance_count_demo.mp4
------------------------------------------
-----------  Model Configuration -----------
Model Arch: FairMOT
Transform Order: 
--transform op: LetterBoxResize
--transform op: NormalizeImage
--transform op: Permute
--------------------------------------------
W0402 09:55:20.582756 53089 analysis_predictor.cc:1118] The one-time configuration of analysis predictor failed, which may be due to native predictor called first and its configurations taken effect.
I0402 09:55:20.792492 53089 analysis_predictor.cc:881] TensorRT subgraph engine is enabled
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_graph_clean_pass]
--- Running analysis [ir_analysis_pass]
--- Running IR pass [adaptive_pool2d_convert_global_pass]
--- Running IR pass [shuffle_channel_detect_pass]
--- Running IR pass [quant_conv2d_dequant_fuse_pass]
--- Running IR pass [delete_quant_dequant_op_pass]
--- Running IR pass [delete_quant_dequant_filter_op_pass]
--- Running IR pass [delete_weight_dequant_linear_op_pass]
--- Running IR pass [delete_quant_dequant_linear_op_pass]
--- Running IR pass [add_support_int8_pass]
I0402 09:55:34.062013 53089 fuse_pass_base.cc:57] ---  detected 1263 subgraphs
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [preln_embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [skip_layernorm_fuse_pass]
--- Running IR pass [preln_skip_layernorm_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
I0402 09:55:36.212893 53089 fuse_pass_base.cc:57] ---  detected 321 subgraphs
--- Running IR pass [unsqueeze2_eltwise_fuse_pass]
--- Running IR pass [trt_squeeze2_matmul_fuse_pass]
--- Running IR pass [trt_reshape2_matmul_fuse_pass]
--- Running IR pass [trt_flatten2_matmul_fuse_pass]
--- Running IR pass [trt_map_matmul_v2_to_mul_pass]
--- Running IR pass [trt_map_matmul_v2_to_matmul_pass]
--- Running IR pass [trt_map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
--- Running IR pass [conv_elementwise_add_fuse_pass]
I0402 09:55:37.232223 53089 fuse_pass_base.cc:57] ---  detected 329 subgraphs
--- Running IR pass [tensorrt_subgraph_pass]
I0402 09:55:37.400718 53089 tensorrt_subgraph_pass.cc:145] ---  detect a sub-graph with 28 nodes
W0402 09:55:37.404436 53089 tensorrt_subgraph_pass.cc:374] The Paddle Inference library is compiled with 8 version TensorRT, but the runtime TensorRT you are using is 8.4 version. This might cause serious compatibility issues. We strongly recommend using the same TRT version at runtime.
I0402 09:55:37.409229 53089 tensorrt_subgraph_pass.cc:433] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
I0402 09:55:38.716629 53089 engine.cc:88] Run Paddle-TRT FP16 mode
W0402 09:55:39.998966 53089 helper.h:107] TensorRT was linked against cuDNN 8.3.2 but loaded cuDNN 8.2.1
W0402 09:55:57.698086 53089 helper.h:107] TensorRT was linked against cuDNN 8.3.2 but loaded cuDNN 8.2.1
W0402 09:55:57.736516 53089 helper.h:107] TensorRT was linked against cuDNN 8.3.2 but loaded cuDNN 8.2.1
I0402 09:55:57.737035 53089 engine.cc:471] Inspector needs TensorRT version 8.2 and after.
I0402 09:55:57.742816 53089 tensorrt_subgraph_pass.cc:145] ---  detect a sub-graph with 25 nodes
I0402 09:55:57.747798 53089 tensorrt_subgraph_pass.cc:433] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
Traceback (most recent call last):
File "mot_jde_infer.py", line 509, in <module>
main()
File "mot_jde_infer.py", line 456, in main
detector = JDE_Detector(
File "mot_jde_infer.py", line 102, in __init__
super(JDE_Detector, self).__init__(
File "/home/seikoist-qiu/PaddleDetection/deploy/pptracking/python/det_infer.py", line 102, in __init__
self.predictor, self.config = load_predictor(
File "/home/seikoist-qiu/PaddleDetection/deploy/pptracking/python/det_infer.py", line 485, in load_predictor
predictor = create_predictor(config)
ValueError: (InvalidArgument) The input [cast_1.tmp_0] shape of trt subgraph is [500].it's not supported by trt so far
[Hint: Expected shape.size() != 1UL, but received shape.size():1 == 1UL:1.] (at /paddle/paddle/fluid/inference/tensorrt/engine.h:141)
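
The log above carries two version warnings: Paddle Inference reports being compiled with one TensorRT version while a different one is loaded at runtime, and TensorRT reports being linked against cuDNN 8.3.2 while cuDNN 8.2.1 is loaded. A small sketch that pulls such pairs out of log text can make them easier to spot in long output. The function name `find_version_mismatches` is hypothetical, and the patterns are written against the exact wording of these two messages only.

```python
import re

# Patterns match the wording of the two warnings seen in this thread's logs.
TRT_RE = re.compile(
    r"compiled with (\S+) version TensorRT, "
    r"but the runtime TensorRT you are using is (\S+) version"
)
CUDNN_RE = re.compile(r"linked against cuDNN (\S+) but loaded cuDNN (\S+)")


def find_version_mismatches(log_text):
    """Return a set of (library, built_against, loaded) tuples found in the log."""
    found = set()
    for built, loaded in TRT_RE.findall(log_text):
        found.add(("TensorRT", built, loaded))
    for built, loaded in CUDNN_RE.findall(log_text):
        found.add(("cuDNN", built, loaded))
    return found


# Two lines copied from the log above.
sample = (
    "W0402 09:55:37.404436 53089 tensorrt_subgraph_pass.cc:374] The Paddle "
    "Inference library is compiled with 8 version TensorRT, but the runtime "
    "TensorRT you are using is 8.4 version. This might cause serious "
    "compatibility issues.\n"
    "W0402 09:55:39.998966 53089 helper.h:107] TensorRT was linked against "
    "cuDNN 8.3.2 but loaded cuDNN 8.2.1\n"
)
for item in sorted(find_version_mismatches(sample)):
    print(item)
```

Whether these mismatches contribute to the failure is not established in this thread; the Paddle warning itself only says they might cause compatibility issues.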

(fastdeploy) seikoist-qiu@rnd-gs-1:~/PaddleDetection/deploy/pptracking/python$ python mot_jde_infer.py
/home/seikoist-qiu/anaconda3/envs/fastdeploy/lib/python3.8/site-packages/pkg_resources/__init__.py:121: DeprecationWarning: pkg_resources is deprecated as an API
warnings.warn("pkg_resources is deprecated as an API", DeprecationWarning)
/home/seikoist-qiu/anaconda3/envs/fastdeploy/lib/python3.8/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('mpl_toolkits')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)
/home/seikoist-qiu/anaconda3/envs/fastdeploy/lib/python3.8/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)
Warning: Unable to use JDE/FairMOT/ByteTrack, please install lap, for example: `pip install lap`, see https://github.com/gatagat/lap
Warning: Unable to use OC-SORT, please install filterpy, for example: `pip install filterpy`, see https://github.com/rlabbe/filterpy
Warning: Unable to use motmetrics in MTMCT in PP-Tracking, please install motmetrics, for example: `pip install motmetrics`, see https://github.com/longcw/py-motmetrics
Warning: Unable to use MTMCT in PP-Tracking, please install sklearn, for example: `pip install sklearn`
Warning: Unable to use MTMCT in PP-Tracking, please install sklearn, for example: `pip install sklearn`
----------- Running Arguments -----------
batch_size: 1
camera_id: -1
cpu_threads: 1
device: GPU
do_break_in_counting: True
do_entrance_counting: False
draw_center_traj: True
enable_mkldnn: False
image_dir: None
image_file: None
model_dir: /home/seikoist-qiu/PaddleDetection/output_inference/fairmot_hrnetv2_w18_dlafpn_30e_576x320
mtmct_cfg: None
mtmct_dir: None
output_dir: /home/seikoist-qiu/PaddleDetection/output
region_polygon: [780, 330, 1150, 330, 1150, 570, 780, 570]
region_type: custom
reid_batch_size: 50
reid_model_dir: None
run_benchmark: False
run_mode: trt_fp16
save_images: False
save_mot_txt_per_img: False
save_mot_txts: False
scaled: False
secs_interval: 10
skip_frame_num: -1
threshold: 0.5
tracker_config: None
trt_calib_mode: True
trt_max_shape: 1280
trt_min_shape: 1
trt_opt_shape: 640
use_dark: True
use_gpu: True
video_file: /home/seikoist-qiu/PaddleDetection/input/entrance_count_demo.mp4

----------- Model Configuration -----------
Model Arch: FairMOT
Transform Order:
--transform op: LetterBoxResize
--transform op: NormalizeImage
--transform op: Permute

W0402 09:58:31.406917 53374 analysis_predictor.cc:1118] The one-time configuration of analysis predictor failed, which may be due to native predictor called first and its configurations taken effect.
I0402 09:58:31.570144 53374 analysis_predictor.cc:881] TensorRT subgraph engine is enabled
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_graph_clean_pass]
--- Running analysis [ir_analysis_pass]
--- Running IR pass [adaptive_pool2d_convert_global_pass]
--- Running IR pass [shuffle_channel_detect_pass]
--- Running IR pass [quant_conv2d_dequant_fuse_pass]
--- Running IR pass [delete_quant_dequant_op_pass]
--- Running IR pass [delete_quant_dequant_filter_op_pass]
--- Running IR pass [delete_weight_dequant_linear_op_pass]
--- Running IR pass [delete_quant_dequant_linear_op_pass]
--- Running IR pass [add_support_int8_pass]
I0402 09:58:44.972270 53374 fuse_pass_base.cc:57] --- detected 1263 subgraphs
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [preln_embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [skip_layernorm_fuse_pass]
--- Running IR pass [preln_skip_layernorm_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
I0402 09:58:47.123872 53374 fuse_pass_base.cc:57] --- detected 321 subgraphs
--- Running IR pass [unsqueeze2_eltwise_fuse_pass]
--- Running IR pass [trt_squeeze2_matmul_fuse_pass]
--- Running IR pass [trt_reshape2_matmul_fuse_pass]
--- Running IR pass [trt_flatten2_matmul_fuse_pass]
--- Running IR pass [trt_map_matmul_v2_to_mul_pass]
--- Running IR pass [trt_map_matmul_v2_to_matmul_pass]
--- Running IR pass [trt_map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
--- Running IR pass [conv_elementwise_add_fuse_pass]
I0402 09:58:48.156301 53374 fuse_pass_base.cc:57] --- detected 329 subgraphs
--- Running IR pass [tensorrt_subgraph_pass]
I0402 09:58:48.325668 53374 tensorrt_subgraph_pass.cc:145] --- detect a sub-graph with 28 nodes
W0402 09:58:48.329381 53374 tensorrt_subgraph_pass.cc:374] The Paddle Inference library is compiled with 8 version TensorRT, but the runtime TensorRT you are using is 8.4 version. This might cause serious compatibility issues. We strongly recommend using the same TRT version at runtime.
I0402 09:58:48.334941 53374 tensorrt_subgraph_pass.cc:433] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
I0402 09:58:49.651007 53374 engine.cc:88] Run Paddle-TRT FP16 mode
W0402 09:58:50.951575 53374 helper.h:107] TensorRT was linked against cuDNN 8.3.2 but loaded cuDNN 8.2.1
W0402 09:59:09.319727 53374 helper.h:107] TensorRT was linked against cuDNN 8.3.2 but loaded cuDNN 8.2.1
W0402 09:59:09.357450 53374 helper.h:107] TensorRT was linked against cuDNN 8.3.2 but loaded cuDNN 8.2.1
I0402 09:59:09.357983 53374 engine.cc:471] Inspector needs TensorRT version 8.2 and after.
I0402 09:59:09.363739 53374 tensorrt_subgraph_pass.cc:145] --- detect a sub-graph with 810 nodes
I0402 09:59:09.487529 53374 tensorrt_subgraph_pass.cc:433] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
I0402 09:59:09.716889 53374 engine.cc:88] Run Paddle-TRT FP16 mode
W0402 09:59:15.313987 53374 helper.h:107] TensorRT was linked against cuDNN 8.3.2 but loaded cuDNN 8.2.1
W0402 10:00:45.822618 53374 helper.h:107] TensorRT was linked against cuDNN 8.3.2 but loaded cuDNN 8.2.1
W0402 10:00:46.870285 53374 helper.h:107] TensorRT was linked against cuDNN 8.3.2 but loaded cuDNN 8.2.1
I0402 10:00:46.872254 53374 engine.cc:471] Inspector needs TensorRT version 8.2 and after.
I0402 10:00:46.970198 53374 tensorrt_subgraph_pass.cc:145] --- detect a sub-graph with 25 nodes
I0402 10:00:46.999508 53374 tensorrt_subgraph_pass.cc:433] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time.
Traceback (most recent call last):
File "mot_jde_infer.py", line 509, in <module>
main()
File "mot_jde_infer.py", line 456, in main
detector = JDE_Detector(
File "mot_jde_infer.py", line 102, in __init__
super(JDE_Detector, self).__init__(
File "/home/seikoist-qiu/PaddleDetection/deploy/pptracking/python/det_infer.py", line 112, in __init__
self.predictor, self.config = load_predictor(
File "/home/seikoist-qiu/PaddleDetection/deploy/pptracking/python/det_infer.py", line 492, in load_predictor
predictor = create_predictor(config)
ValueError: (InvalidArgument) The input [cast_1.tmp_0] shape of trt subgraph is [500].it's not supported by trt so far
[Hint: Expected shape.size() != 1UL, but received shape.size():1 == 1UL:1.] (at /paddle/paddle/fluid/inference/tensorrt/engine.h:141)
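
Both FP16 runs stop on the same message: the input `cast_1.tmp_0` of a TensorRT subgraph has shape `[500]`, and the accompanying hint (`Expected shape.size() != 1UL`) indicates that a one-dimensional input is what gets rejected. As a triage aid, the sketch below extracts the tensor name and shape from such a message and reports the rank. The helper `parse_trt_input_error` is hypothetical and only matches the wording seen in this thread.

```python
import re

# Matches the error wording from the traceback in this thread.
ERR_RE = re.compile(
    r"The input \[(?P<name>[^\]]+)\] shape of trt subgraph is \[(?P<shape>[^\]]*)\]"
)


def parse_trt_input_error(message):
    """Return (tensor_name, shape_list) from the error text, or None if absent."""
    match = ERR_RE.search(message)
    if match is None:
        return None
    shape = [int(dim) for dim in match.group("shape").split(",") if dim.strip()]
    return match.group("name"), shape


msg = (
    "ValueError: (InvalidArgument) The input [cast_1.tmp_0] shape of trt "
    "subgraph is [500].it's not supported by trt so far"
)
name, shape = parse_trt_input_error(msg)
print(name, shape, "rank:", len(shape))
```

Knowing which tensor is one-dimensional narrows down which part of the exported model feeds the offending subgraph, but it does not by itself provide a fix.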

qiulongquan commented 1 year ago

Could anyone offer further guidance on this problem?

nemonameless commented 1 year ago

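I suggest switching to a different model and approach. First, FairMOT's accuracy is no longer considered high. The released weights were all trained on small, somewhat dated datasets such as MOT17, so they do not generalize well, and training on your own business dataset would also require ReID ground-truth annotations. I recommend using a tracking approach such as ByteTrack, OC-SORT, or BoT-SORT directly. These need no ReID, so the model is used as a plain detector. PP-Human and PP-Vehicle also already provide weights trained on larger datasets, which generalize better. https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/deploy/pipeline

See this tutorial: https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/deploy/pipeline/docs/tutorials/pphuman_mot.md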

python deploy/pipeline/pipeline.py --config deploy/pipeline/config/infer_cfg_pphuman.yml \
    -o MOT.model_dir=ppyoloe/ \
    --video_file=test_video.mp4 \
    --device=gpu \
    --region_type=horizontal \
    --do_entrance_counting \
    --draw_center_traj

Or:

python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/mot_ppyoloe_l_36e_pipeline --tracker_config=deploy/pptracking/python/tracker_config.yml --video_file=entrance_count_demo.mp4 --device=GPU --save_mot_txts --do_entrance_counting --draw_center_traj --run_mode=trt_fp16

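do_entrance_counting is generic, so you can switch to mot_sde_infer.py and adjust the run command accordingly. Search the codebase for do_entrance_counting to see how it is used in other places and approaches.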
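
The mot_sde_infer.py command above can also be assembled programmatically, which makes it easy to switch `--run_mode` between `trt_fp16` and `paddle`, the two values mentioned earlier in this thread. The sketch below only builds the argument list, using the flags and paths from that command; the function name `build_sde_command` is hypothetical, and nothing is executed.

```python
import shlex


def build_sde_command(run_mode="trt_fp16"):
    """Assemble the mot_sde_infer.py command from this thread as an argv list."""
    return [
        "python",
        "deploy/pptracking/python/mot_sde_infer.py",
        "--model_dir=output_inference/mot_ppyoloe_l_36e_pipeline",
        "--tracker_config=deploy/pptracking/python/tracker_config.yml",
        "--video_file=entrance_count_demo.mp4",
        "--device=GPU",
        "--save_mot_txts",
        "--do_entrance_counting",
        "--draw_center_traj",
        f"--run_mode={run_mode}",
    ]


# Print the command with the plain Paddle backend instead of TensorRT.
print(shlex.join(build_sde_command("paddle")))
```

To actually launch it, the list could be passed to `subprocess.run(cmd, check=True)` from the repository root, assuming the model and video paths exist there.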

qiulongquan commented 1 year ago

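OK, thank you for the explanation. I will try a tracking method such as ByteTrack and see how it performs. FairMOT is a single model that produces both bounding boxes and extracted features, whereas tracking approaches such as ByteTrack, OC-SORT, and BoT-SORT take only the bounding-box output and update track IDs directly with the tracking method, without feature extraction, so they are faster. Is my understanding correct?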
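
The distinction described above can be made concrete: a tracker that works on boxes alone associates new detections with existing tracks by geometric overlap rather than by appearance features. The sketch below is a greedy IoU matcher in plain Python. It is a deliberately simplified stand-in and not the actual ByteTrack, OC-SORT, or BoT-SORT algorithm, which add motion prediction and more careful matching; all names in it are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, iou_threshold=0.3):
    """Greedily assign detections to tracks by descending IoU.

    tracks: dict mapping track id -> last known box.
    Returns (matches, unmatched), where matches maps track id -> detection
    index and unmatched lists detection indices that would start new tracks.
    """
    pairs = sorted(
        ((iou(box, det), tid, di)
         for tid, box in tracks.items()
         for di, det in enumerate(detections)),
        reverse=True,
    )
    matches, used = {}, set()
    for score, tid, di in pairs:
        if score < iou_threshold:
            break  # remaining pairs overlap even less
        if tid in matches or di in used:
            continue
        matches[tid] = di
        used.add(di)
    unmatched = [i for i in range(len(detections)) if i not in used]
    return matches, unmatched


tracks = {1: (0, 0, 10, 10), 2: (50, 50, 60, 60)}
detections = [(51, 50, 61, 60), (1, 0, 11, 10), (100, 100, 110, 110)]
matches, unmatched = associate(tracks, detections)
print(matches, unmatched)
```

No embedding network runs in this loop, which is the source of the speed difference the comment describes; the trade-off is that identity is recovered from position alone.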

qiulongquan commented 1 year ago

Could you confirm whether my understanding above is correct? @nemonameless