PaddlePaddle / PaddleDetection

Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
Apache License 2.0

PPYOLO-tiny inference speed on Jetson AGX #4352

Closed lleesg closed 2 years ago

lleesg commented 2 years ago

Environment: AGX, Ubuntu 18.04, CUDA 10.2, cuDNN 8.0.0, JetPack 4.4, Python 3.6, Paddle 2.1.2, TensorRT 7.1

Code: after exporting the PPYOLO-tiny model, I ran inference with https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/deploy/python/infer.py

Test method: the AGX runs in MAX_N mode, the input video is 720p, and target size is set to 608. I measured the running time of one line of code, https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/deploy/python/infer.py#L537: results = detector.predict([frame], FLAGS.threshold)
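
A minimal sketch of this kind of measurement, assuming a generic callable in place of detector.predict. The helper names time_call and latency_to_fps and the stand-in function fake_predict are made up for illustration and are not part of PaddleDetection. Note that timing a single call this way also captures any one-time startup cost of that call.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def latency_to_fps(seconds_per_frame):
    """Convert a per-frame latency in seconds to frames per second."""
    return 1.0 / seconds_per_frame


def fake_predict(frames, threshold):
    """Hypothetical stand-in for detector.predict; it only sleeps briefly."""
    time.sleep(0.005)
    return [{"boxes": [], "threshold": threshold} for _ in frames]


if __name__ == "__main__":
    results, elapsed = time_call(fake_predict, ["frame"], 0.5)
    print(f"latency: {elapsed * 1000:.2f} ms, fps: {latency_to_fps(elapsed):.1f}")
```

With a real predictor, the number is only meaningful if the timed call returns after the outputs are available on the host.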

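Test results:

Questions:

  1. I could not find any reference data. Are the results above in line with expectations?
  2. https://cloud.tencent.com/developer/article/1652975 reports 120.5 fps for YOLOv4-Tiny under the same conditions. Why is the gap so large?

wangxinxin08 commented 2 years ago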

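We have not yet benchmarked PP-YOLO tiny on the AGX, but we have measured other models on this hardware; see https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.2/deploy/BENCHMARK_INFER.md#2jetson-agx-xavier for the numbers. There, SSD MobileNet v1 already reaches about 125 FPS, and PP-YOLO tiny should in theory be faster, so the FPS you measured is probably not representative. The likely cause is a very large launch overhead on the model's first run. We recommend using the test method in the PaddleDetection repository with --run_benchmark=True added.

Also, some models on the Jetson AGX indeed show no clear speedup with TRT FP32 or TRT FP16. The main reason is probably that some ops in these models have no corresponding TRT operator and therefore do not run on the TRT engine.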
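
The first-run launch overhead mentioned above can be illustrated in plain Python. This is not the PaddleDetection benchmark code, only a sketch under stated assumptions: predict_fn stands for any zero-argument callable that wraps the predictor call, and benchmark, warmup, repeats, and FakePredictor are hypothetical names introduced here. The idea is to discard a few warm-up calls and average only the calls that follow.

```python
import statistics
import time


def benchmark(predict_fn, warmup=10, repeats=50):
    """Time predict_fn after discarding warm-up calls.

    predict_fn is assumed to be a zero-argument callable that returns
    only after the outputs are ready.
    """
    for _ in range(warmup):
        predict_fn()  # not timed: absorbs one-time startup cost

    latencies = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict_fn()
        latencies.append(time.perf_counter() - start)

    mean_s = statistics.mean(latencies)
    fps = 1.0 / mean_s if mean_s > 0 else float("inf")
    return {"mean_ms": mean_s * 1000.0, "fps": fps, "latencies": latencies}


class FakePredictor:
    """Hypothetical predictor whose first call is much slower than the rest."""

    def __init__(self, first_call_s=0.2, steady_s=0.005):
        self.first_call_s = first_call_s
        self.steady_s = steady_s
        self.calls = 0

    def __call__(self):
        delay = self.first_call_s if self.calls == 0 else self.steady_s
        self.calls += 1
        time.sleep(delay)


if __name__ == "__main__":
    cold = benchmark(FakePredictor(), warmup=0, repeats=1)
    warm = benchmark(FakePredictor(), warmup=5, repeats=20)
    print(f"first call only: {cold['mean_ms']:.1f} ms")
    print(f"after warm-up:   {warm['mean_ms']:.1f} ms ({warm['fps']:.1f} fps)")
```

Timing only the first call, as with warmup=0 and repeats=1 here, reports the startup cost rather than the steady-state speed.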

paddle-bot-old[bot] commented 2 years ago

Since this issue has not been updated for more than three months, it will be closed. If it is not solved or there is a follow-up question, please reopen it at any time and we will continue to follow up. It is recommended to pull and try the latest code first.