I also tested the FPS on an NVIDIA RTX 3090 GPU with a batch size of 1.
Remove the YOLOXHeadCustom head. Adjust workers_per_gpu.
Actually, in test mode, forward_roi_head doesn't consume much time:
def forward_roi_head(self, **data):
    if (self.aux_2d_only and not self.training) or not self.with_img_roi_head:
        return {'topk_indexes': None}
    else:
        outs_roi = self.img_roi_head(**data)
        return outs_roi
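To make the guard above easier to read, here is a small standalone sketch of the same boolean condition. The helper name roi_head_is_skipped is hypothetical and not part of the repository, and the actual values of aux_2d_only and with_img_roi_head depend on the config, which is not shown here.

```python
def roi_head_is_skipped(aux_2d_only: bool, training: bool, with_img_roi_head: bool) -> bool:
    """Hypothetical mirror of the guard in forward_roi_head.

    Returns True when the ROI head would NOT be run.
    """
    return (aux_2d_only and not training) or not with_img_roi_head


# Enumerate the test-mode cases (training=False).
for aux_2d_only in (True, False):
    for with_img_roi_head in (True, False):
        skipped = roi_head_is_skipped(aux_2d_only, False, with_img_roi_head)
        print(f"aux_2d_only={aux_2d_only}, "
              f"with_img_roi_head={with_img_roi_head} -> skipped={skipped}")
```

In test mode the ROI head only runs when aux_2d_only is False and with_img_roi_head is True; in every other case the function returns immediately.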
Even with the forward_roi_head call commented out, the running speed did not improve:
def simple_test_pts(self, img_metas, **data):
    """Test function of point cloud branch."""
    # outs_roi = self.forward_roi_head(**data)
    if img_metas[0]['scene_token'] != self.prev_scene_token:
        self.prev_scene_token = img_metas[0]['scene_token']
        data['prev_exists'] = data['img'].new_zeros(1)
        self.pts_bbox_head.reset_memory()
    else:
        data['prev_exists'] = data['img'].new_ones(1)
    outs = self.pts_bbox_head(img_metas, **data)
    bbox_list = self.pts_bbox_head.get_bboxes(
        outs, img_metas)
    bbox_results = [
        bbox3d2result(bboxes, scores, labels)
        for bboxes, scores, labels in bbox_list
    ]
    return bbox_results
I still get the following result:
Done image [50 / 300], fps: 10.3 img / s
Done image [100/ 300], fps: 10.3 img / s
Done image [150/ 300], fps: 10.3 img / s
Done image [200/ 300], fps: 10.3 img / s
Done image [250/ 300], fps: 10.3 img / s
Done image [300/ 300], fps: 10.3 img / s
Overall fps: 10.3 img / s
Set return_intermediate=False: only the final layer output is needed when testing. In addition, the script I wrote includes data processing time, so the measured FPS may depend on your CPU performance.
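Since the measured FPS includes data processing time, it can help to time data loading and the model forward pass separately. The following is a minimal sketch of that idea, not the repository's tools/benchmark.py: the function benchmark, its parameters, and the returned keys are all assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict, Iterable


def benchmark(batches: Iterable[Any],
              model_fn: Callable[[Any], Any],
              num_warmup: int = 5,
              sync: Callable[[], None] = lambda: None) -> Dict[str, float]:
    """Time data fetching and model execution separately (hypothetical helper).

    `sync` is called after each forward pass so that asynchronous work
    (for example CUDA kernels) has finished before the clock is read.
    """
    data_time = 0.0
    model_time = 0.0
    measured = 0
    iterator = iter(batches)
    step = 0
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(iterator)  # data loading / preprocessing cost
        except StopIteration:
            break
        t1 = time.perf_counter()
        model_fn(batch)             # forward pass
        sync()
        t2 = time.perf_counter()
        if step >= num_warmup:      # skip warm-up iterations
            data_time += t1 - t0
            model_time += t2 - t1
            measured += 1
        step += 1
    if measured == 0:
        raise ValueError("not enough batches to measure after warm-up")
    total = data_time + model_time
    return {
        "num_measured": measured,
        "data_time": data_time,
        "model_time": model_time,
        "end_to_end_fps": measured / total if total > 0 else float("inf"),
        "model_only_fps": measured / model_time if model_time > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Stand-in workload: 20 dummy batches and a "model" that sleeps for 1 ms.
    stats = benchmark(range(20), lambda b: time.sleep(0.001))
    print(stats)
```

For a PyTorch model on a GPU, passing sync=torch.cuda.synchronize would be the usual choice, since CUDA kernels run asynchronously. A large gap between end_to_end_fps and model_only_fps suggests the data pipeline (and therefore the CPU and workers_per_gpu) is the bottleneck.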
After running
python tools/benchmark.py projects/configs/test_speed/repdetr3d_vov_800_bs2_seq_24e.py
I got a result that is a little slower than the FPS-pytorch of 13.1 given in Results on NuScenes Val Set. Is there any difference between the config file used for speed measurement and the one used for training and inference? Below is my config file (workers_per_gpu is set to the fastest configuration).