Closed: leoxxxxxD closed this issue 8 months ago
One more detail: paddle=2.4.2, tensorrt=8.5.2.
If you measure with trtexec, you can estimate the speed from GPU Compute Time.
(To reproduce, you can use our benchmarking code: https://github.com/lyuwenyu/RT-DETR/tree/main/benchmark)
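For reference, the estimate mentioned here is just the reciprocal of the per-inference latency. A minimal sketch, assuming batch size 1 so that one inference equals one frame; the function name `fps_from_latency_ms` is made up for illustration and is not part of the benchmark code:

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Convert a per-inference latency in milliseconds to frames per second.

    Assumes batch size 1, i.e. one inference processes exactly one image.
    """
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# Example: a mean GPU Compute Time of 5.0 ms per inference.
print(f"{fps_from_latency_ms(5.0):.1f} FPS")  # 200.0 FPS
```

Note that GPU Compute Time excludes host-to-device and device-to-host copies, so an FPS figure derived from it will be higher than one derived from end-to-end latency.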
@lyuwenyu Is pycuda missing from trtinfer.py? Also, how do I call it?
Hi, have you measured the inference latency of yolov8s? On a T4, v8s comes out faster no matter how I test it... I can't tell what went wrong.
@Sssssd I haven't managed to reproduce the RT-DETR speed from the paper yet, and I haven't tested YOLO. Could it be the server?
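I benchmarked the model trained on the COCO dataset. The GPU Compute Time measured with trtexec matches the paper fairly well, but with trtinfer the gap still doesn't open up: v8s is roughly in line with the paper's numbers, yet RT-DETR stays very close to v8s and doesn't show the large advantage reported in the paper. On my own 10-class dataset, however, RT-DETR has basically no speed advantage in trtexec either and is about the same as v8s, while the trtinfer numbers barely change. I don't know what the problem is; it's giving me a headache.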
@Sssssd So you mean you reproduced the paper's FPS using their open-source COCO weights? My own reproduction with r18 was quite far off.
I can only reproduce the paper's FPS with trtexec; I haven't managed it with trtinfer yet. I'm not sure whether I'm using it incorrectly.
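When a Python script and trtexec disagree, one thing that may be worth ruling out is what the timer actually covers (warm-up iterations, host/device copies, Python overhead). The following is a generic timing sketch and is not taken from trtinfer.py: `run_once` and `synchronize` are hypothetical callables standing in for whatever runs one inference and waits for the GPU to finish, and the stand-in workload below is only there so the sketch runs on its own.

```python
import time
from statistics import mean, median
from typing import Callable, Dict, Optional


def benchmark(
    run_once: Callable[[], object],
    synchronize: Optional[Callable[[], None]] = None,
    warmup: int = 20,
    iters: int = 100,
) -> Dict[str, float]:
    """Time `run_once` over `iters` iterations after `warmup` untimed runs.

    `synchronize`, if given, is called before stopping each timer so that
    asynchronous GPU work is included in the measurement.
    """
    for _ in range(warmup):  # untimed runs to let clocks and caches settle
        run_once()
    if synchronize is not None:
        synchronize()

    samples_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        if synchronize is not None:
            synchronize()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    mean_ms = mean(samples_ms)
    return {
        "mean_ms": mean_ms,
        "median_ms": median(samples_ms),
        "fps": 1000.0 / mean_ms,  # assumes one image per call
    }


# Stand-in CPU workload; replace with a real inference call.
stats = benchmark(lambda: sum(range(10_000)), warmup=5, iters=50)
print(stats)
```

Whether the timed region includes the input copy to the GPU changes which trtexec figure the result should be compared against: Latency if it does, GPU Compute Time if it does not.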
The model is rtdetr_r18vd_6x_coco. I see that the FPS you measured is 217; I'm not sure which metric you computed that from. My inference results on a T4 are as follows:
[02/23/2024-03:07:42] [I] === Performance summary ===
[02/23/2024-03:07:42] [I] Throughput: 164.577 qps
[02/23/2024-03:07:42] [I] Latency: min = 5.90283 ms, max = 7.3186 ms, mean = 6.05075 ms, median = 6.0332 ms, percentile(90%) = 6.14221 ms, percentile(95%) = 6.26196 ms, percentile(99%) = 6.35991 ms
[02/23/2024-03:07:42] [I] Enqueue Time: min = 5.88025 ms, max = 7.29272 ms, mean = 6.0273 ms, median = 6.00977 ms, percentile(90%) = 6.11768 ms, percentile(95%) = 6.23471 ms, percentile(99%) = 6.33209 ms
[02/23/2024-03:07:42] [I] H2D Latency: min = 0.800537 ms, max = 0.845398 ms, mean = 0.818772 ms, median = 0.818604 ms, percentile(90%) = 0.821167 ms, percentile(95%) = 0.823975 ms, percentile(99%) = 0.827393 ms
[02/23/2024-03:07:42] [I] GPU Compute Time: min = 5.07678 ms, max = 6.48608 ms, mean = 5.22236 ms, median = 5.20569 ms, percentile(90%) = 5.31421 ms, percentile(95%) = 5.43716 ms, percentile(99%) = 5.52832 ms
[02/23/2024-03:07:42] [I] D2H Latency: min = 0.00805664 ms, max = 0.0310059 ms, mean = 0.00961827 ms, median = 0.00927734 ms, percentile(90%) = 0.0107727 ms, percentile(95%) = 0.0113831 ms, percentile(99%) = 0.0145264 ms
[02/23/2024-03:07:42] [I] Total Host Walltime: 3.01379 s
[02/23/2024-03:07:42] [I] Total GPU Compute Time: 2.59029 s
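To make the comparison concrete, the figures in this summary can be converted to FPS depending on which metric is used. A rough sketch, assuming batch size 1; the helper `mean_ms` is made up for illustration, and the embedded text is a shortened copy of the log lines from this comment:

```python
import re

# Shortened copy of the trtexec performance summary quoted in this comment.
SUMMARY = """\
[I] Throughput: 164.577 qps
[I] Latency: min = 5.90283 ms, max = 7.3186 ms, mean = 6.05075 ms
[I] H2D Latency: min = 0.800537 ms, max = 0.845398 ms, mean = 0.818772 ms
[I] GPU Compute Time: min = 5.07678 ms, max = 6.48608 ms, mean = 5.22236 ms
"""


def mean_ms(summary: str, metric: str) -> float:
    """Return the mean value (ms) reported for `metric` in a trtexec summary."""
    # Anchor on the closing bracket of the log tag so that "Latency"
    # does not accidentally match the "H2D Latency" line.
    pattern = rf"\]\s*{re.escape(metric)}:.*?mean = ([\d.]+) ms"
    match = re.search(pattern, summary)
    if match is None:
        raise ValueError(f"metric {metric!r} not found")
    return float(match.group(1))


gpu_fps = 1000.0 / mean_ms(SUMMARY, "GPU Compute Time")  # compute only
e2e_fps = 1000.0 / mean_ms(SUMMARY, "Latency")           # includes H2D/D2H copies
throughput = float(re.search(r"Throughput: ([\d.]+) qps", SUMMARY).group(1))

print(f"FPS from GPU Compute Time: {gpu_fps:.1f}")
print(f"FPS from end-to-end Latency: {e2e_fps:.1f}")
print(f"Reported throughput: {throughput:.1f} qps")
```

By this arithmetic, the mean GPU Compute Time of 5.22236 ms corresponds to roughly 191 FPS, whereas 217 FPS would require about 4.6 ms per inference.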