FastDeploy-release-1.0.7: a single model takes about 70 ms, but the pipeline takes 300-500 ms

13fengfeng commented 5 months ago

Version: FastDeploy-release-1.0.7 (FastDeploy-release-1.0.7/examples/)
Example run: examples/vision/keypointdetection
Log excerpt: [INFO] fastdeploy/vision/common/processors/transform.cc(45)::FuseNormalizeCast Normalize and Cast are fused to Normalize in [FastDeploy] PPDet in RKNPU2 duration = 0.443722s.
Hardware: rk3588
Models (rknn): tinypose_256x192, picodet_s_416_coco
A single model takes about 70 ms, but the pipeline takes 300-500 ms.

13fengfeng commented 5 months ago

The code is as follows:

void np2Infer(const std::string& det_model_dir,
              const std::string& tinypose_model_dir,
              const std::string& image_file) {
  auto option = fastdeploy::RuntimeOption();
  option.UseRKNPU2();
  auto format = fastdeploy::ModelFormat::RKNN;
  auto det_model_file = det_model_dir + sep + "picodet_s_416_coco_lcnet_rk3588_unquantized.rknn";
  auto det_params_file = "";
  auto det_config_file = det_model_dir + sep + "infer_cfg_det.yml";

  auto det_model = fastdeploy::vision::detection::PicoDet(
      det_model_file, det_params_file, det_config_file, option, format);
  if (!det_model.Initialized()) {
    std::cerr << "Detection Model Failed to initialize." << std::endl;
    return;
  }
  det_model.GetPreprocessor().DisablePermute();
  det_model.GetPreprocessor().DisableNormalize();
  det_model.GetPostprocessor().ApplyNMS();

  auto tinypose_model_file = tinypose_model_dir + sep + "tinypose_256x192_shape_rk3588_unquantized.rknn";
  auto tinypose_params_file = "";
  auto tinypose_config_file = tinypose_model_dir + sep + "infer_cfg_tiny.yml";
  auto tinypose_model = fastdeploy::vision::keypointdetection::PPTinyPose(
      tinypose_model_file, tinypose_params_file, tinypose_config_file, option, format);
  if (!tinypose_model.Initialized()) {
    std::cerr << "TinyPose Model Failed to initialize." << std::endl;
    return;
  }

  tinypose_model.DisablePermute();
  tinypose_model.DisableNormalize();

  auto im = cv::imread(image_file);
  fastdeploy::vision::KeyPointDetectionResult res;

  fastdeploy::TimeCounter tc;
  auto pipeline =
      fastdeploy::pipeline::PPTinyPose(
          &det_model, &tinypose_model);
  pipeline.detection_model_score_threshold = 0.5;
  // Time one Predict() call on the whole pipeline (detector + pose model).
  tc.Start();
  if (!pipeline.Predict(&im, &res)) {
    std::cerr << "TinyPose Prediction Failed." << std::endl;
    return;
  } else {
    std::cout << "TinyPose Prediction Done!" << std::endl;
  }
  tc.End();
  tc.PrintInfo("PPDet in RKNPU2");
  std::cout << res.Str() << std::endl;

  auto vis_im =
      fastdeploy::vision::VisKeypointDetection(im, res, 0.2);
  cv::imwrite("vis_result.jpg", vis_im);
  std::cout << "TinyPose visualized result saved in ./vis_result.jpg"
            << std::endl;
}
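
One thing worth ruling out: the timer above wraps a single pipeline.Predict call, which is also the first call, so any one-time setup cost ends up in the reported duration. Below is a minimal sketch of a warm-up-then-average measurement. MeanLatencyMs is a hypothetical helper (not part of FastDeploy) and uses only the C++ standard library. Whether first-call overhead actually explains the gap is an assumption that this kind of measurement would test, not something the thread establishes.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper, not a FastDeploy API.
// Calls `fn` `warmup` times without timing, then `repeats` times with timing,
// and returns the mean wall-clock time per timed call in milliseconds.
double MeanLatencyMs(const std::function<void()>& fn, int warmup, int repeats) {
  for (int i = 0; i < warmup; ++i) {
    fn();  // untimed: absorbs one-time setup cost, if there is any
  }
  if (repeats <= 0) {
    return 0.0;  // nothing was timed
  }
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < repeats; ++i) {
    fn();
  }
  const auto stop = std::chrono::steady_clock::now();
  const std::chrono::duration<double, std::milli> total = stop - start;
  return total.count() / repeats;
}
```

In the function above this could be called as MeanLatencyMs([&] { pipeline.Predict(&im, &res); }, 3, 20), and the same wrapper could be put around each model's own Predict call for a like-for-like comparison. The counts 3 and 20 are arbitrary.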

13fengfeng commented 5 months ago

Full output log:

[INFO] fastdeploy/vision/common/processors/transform.cc(45)::FuseNormalizeCast  Normalize and Cast are fused to Normalize in preprocessing pipeline.
[INFO] fastdeploy/vision/common/processors/transform.cc(93)::FuseNormalizeHWC2CHW       Normalize and HWC2CHW are fused to NormalizeAndPermute  in preprocessing pipeline.
[INFO] fastdeploy/vision/common/processors/transform.cc(159)::FuseNormalizeColorConvert BGR2RGB and NormalizeAndPermute are fused to NormalizeAndPermute with swap_rb=1
[INFO] fastdeploy/runtime/backends/rknpu2/rknpu2_backend.cc(81)::GetSDKAndDeviceVersion rknpu2 runtime version: 2.0.0b0 (35a6907d79@2024-03-24T10:31:14)
[INFO] fastdeploy/runtime/backends/rknpu2/rknpu2_backend.cc(82)::GetSDKAndDeviceVersion rknpu2 driver version: 0.7.2
index=0, name=image, n_dims=4, dims=[1, 416, 416, 3], n_elems=519168, size=1038336, fmt=NHWC, type=FP16, qnt_type=AFFINE, zp=0, scale=1.000000, pass_through=0
index=0, name=p2o.Mul.23, n_dims=3, dims=[1, 3598, 4, 0], n_elems=14392, size=28784, fmt=UNDEFINED, type=FP32, qnt_type=AFFINE, zp=0, scale=1.000000, pass_through=0
index=1, name=p2o.Concat.17, n_dims=3, dims=[1, 80, 3598, 0], n_elems=287840, size=575680, fmt=UNDEFINED, type=FP32, qnt_type=AFFINE, zp=0, scale=1.000000, pass_through=0
[INFO] fastdeploy/runtime/runtime.cc(341)::CreateRKNPU2Backend  Runtime initialized with Backend::RKNPU2 in Device::RKNPU.
[INFO] fastdeploy/vision/common/processors/transform.cc(159)::FuseNormalizeColorConvert BGR2RGB and Normalize are fused to Normalize with swap_rb=1
[INFO] fastdeploy/runtime/backends/rknpu2/rknpu2_backend.cc(81)::GetSDKAndDeviceVersion rknpu2 runtime version: 2.0.0b0 (35a6907d79@2024-03-24T10:31:14)
[INFO] fastdeploy/runtime/backends/rknpu2/rknpu2_backend.cc(82)::GetSDKAndDeviceVersion rknpu2 driver version: 0.7.2
index=0, name=image, n_dims=4, dims=[1, 256, 192, 3], n_elems=147456, size=294912, fmt=NHWC, type=FP16, qnt_type=AFFINE, zp=0, scale=1.000000, pass_through=0
index=0, name=conv2d_441.tmp_1, n_dims=4, dims=[1, 17, 64, 48], n_elems=52224, size=104448, fmt=NCHW, type=FP32, qnt_type=AFFINE, zp=0, scale=1.000000, pass_through=0
[INFO] fastdeploy/runtime/runtime.cc(341)::CreateRKNPU2Backend  Runtime initialized with Backend::RKNPU2 in Device::RKNPU.
[WARNING] fastdeploy/runtime/backends/rknpu2/rknpu2_backend.cc(420)::InitRKNNTensorMemory       The input tensor type != model's inputs type.The input_type need FP16,but inputs[0].type is UINT8
[WARNING] fastdeploy/runtime/backends/rknpu2/rknpu2_backend.cc(420)::InitRKNNTensorMemory       The input tensor type != model's inputs type.The input_type need FP16,but inputs[0].type is UINT8
TinyPose Prediction Done!
[FastDeploy] PPDet in RKNPU2 duration = 0.356904s.
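
The tensor lines in this log can be sanity-checked with a little arithmetic: n_elems is the product of the first n_dims entries of dims, and every logged size works out to n_elems x 2, i.e. two bytes per element, which matches the FP16 input type. The sketch below just reproduces that arithmetic; NumElements and BytesPerElement are hypothetical helpers. What the two-byte size implies for the outputs that are printed as FP32, and whether the UINT8-to-FP16 mismatch in the warnings costs measurable time, is not established in this thread.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: product of the dimensions of a tensor shape.
// Pass only the first n_dims entries from the log line.
std::size_t NumElements(const std::vector<std::size_t>& dims) {
  std::size_t n = 1;
  for (std::size_t d : dims) {
    n *= d;
  }
  return n;
}

// Hypothetical helper: bytes per element implied by a logged
// (n_elems, size) pair. Returns 0 for an empty tensor.
std::size_t BytesPerElement(std::size_t n_elems, std::size_t size_bytes) {
  return n_elems == 0 ? 0 : size_bytes / n_elems;
}
```

For the detector input, NumElements({1, 416, 416, 3}) gives 519168 and BytesPerElement(519168, 1038336) gives 2, matching the n_elems and size fields in the log.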

13fengfeng commented 5 months ago

Log for a single model:

[INFO] fastdeploy/runtime/backends/rknpu2/rknpu2_backend.cc(81)::GetSDKAndDeviceVersion rknpu2 runtime version: 2.0.0b0 (35a6907d79@2024-03-24T10:31:14)
[INFO] fastdeploy/runtime/backends/rknpu2/rknpu2_backend.cc(82)::GetSDKAndDeviceVersion rknpu2 driver version: 0.7.2
index=0, name=image, n_dims=4, dims=[1, 256, 192, 3], n_elems=147456, size=294912, fmt=NHWC, type=FP16, qnt_type=AFFINE, zp=0, scale=1.000000, pass_through=0
index=0, name=conv2d_441.tmp_1, n_dims=4, dims=[1, 17, 64, 48], n_elems=52224, size=104448, fmt=NCHW, type=FP32, qnt_type=AFFINE, zp=0, scale=1.000000, pass_through=0
[INFO] fastdeploy/runtime/runtime.cc(341)::CreateRKNPU2Backend  Runtime initialized with Backend::RKNPU2 in Device::RKNPU.
[WARNING] fastdeploy/runtime/backends/rknpu2/rknpu2_backend.cc(420)::InitRKNNTensorMemory       The input tensor type != model's inputs type.The input_type need FP16,but inp
TinyPose Prediction Done!
[FastDeploy] PPDet in RKNPU2 duration = 0.083716s.
KeyPointDetectionResult: [x, y, conf]
187.929306,55.204079, 0.987305
191.976379,47.640800, 0.938965
179.511658,49.086082, 0.925781
197.493729,51.923340, 0.835938
165.700867,53.700344, 0.991699
217.290009,94.789520, 0.940918
150.755127,96.943993, 0.966309
230.973770,151.473999, 0.990234
137.443604,149.259995, 0.899414
244.429489,202.427856, 0.934082
119.086006,189.781876, 0.922852
201.487320,202.904984, 0.887207
171.338181,204.742966, 0.943359
182.912888,277.823486, 0.870605
187.157852,279.173798, 0.825195
172.461395,361.066742, 0.933594
201.424606,356.628784, 0.880859
num_joints:17

PaddlePaddle / FastDeploy

FastDeploy-release-1.0.7: a single model takes about 70 ms, but the pipeline takes 300-500 ms #2469