TrojanXu / yolov5-tensorrt

A tensorrt implementation of yolov5: https://github.com/ultralytics/yolov5
Apache License 2.0
191 stars, 46 forks

BatchedNms plugin issue #45

Closed by niaoyu 3 years ago

niaoyu commented 3 years ago

Hi, according to the TensorRT documentation, the batchedNMSPlugin takes two inputs, a boxes input and a scores input:

  1. Boxes, shape [batch_size, number_boxes, number_classes, number_box_parameters]
  2. Scores, shape [batch_size, number_boxes, number_classes]

In this repo, the scores input matches that layout: https://github.com/TrojanXu/yolov5-tensorrt/blob/bab1ff1b5fcf7b8499eb8df1560d50ee205ee891/main.py#L199

But for the boxes input, why is number_classes=1? https://github.com/TrojanXu/yolov5-tensorrt/blob/bab1ff1b5fcf7b8499eb8df1560d50ee205ee891/main.py#L206

Furthermore, if I have more than one class, for example the 80 classes of the COCO dataset, do I need to broadcast the number_classes dimension of the boxes from 1 to 80?

niaoyu commented 3 years ago

Oh, I see that you use the parameter shareLocation=1, so the number_classes dimension of the boxes input should be 1. Am I right?
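To make the shape question concrete, here is a minimal sketch of the two layouts, assuming the input shapes quoted from the TensorRT documentation earlier in this thread. The helper name `batched_nms_input_shapes` and the box count 15120 are hypothetical and chosen only for illustration; nothing here is taken from the repo's main.py.

```python
def batched_nms_input_shapes(batch_size, num_boxes, num_classes, share_location):
    """Return (boxes_shape, scores_shape) for the layout quoted in this thread.

    Hypothetical helper for illustration only; it is not part of TensorRT
    or of the yolov5-tensorrt repo.
    """
    # With shared locations, each candidate box is stored once and reused by
    # every class, so the class dimension of the boxes input collapses to 1.
    box_classes = 1 if share_location else num_classes
    boxes_shape = (batch_size, num_boxes, box_classes, 4)  # 4 = x1, y1, x2, y2
    # Scores always keep one entry per class.
    scores_shape = (batch_size, num_boxes, num_classes)
    return boxes_shape, scores_shape


# COCO-style example with 80 classes and an illustrative 15120 candidate boxes.
shared = batched_nms_input_shapes(1, 15120, 80, share_location=True)
per_class = batched_nms_input_shapes(1, 15120, 80, share_location=False)
print("shareLocation=1:", shared)
print("shareLocation=0:", per_class)
```

Under this reading, broadcasting the boxes from 1 to 80 would only be needed if shareLocation were turned off.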

What's more, would it be OK if I moved most of the TensorRT post-processing to torch and kept only the TensorRT batchedNMSPlugin, to make the code simpler?

TrojanXu commented 3 years ago

Yes, the bounding boxes can be shared across classes. And sure, that works. What's more, if you choose a torch implementation, I think you can even drop the TRT batchedNMS and use fast NMS or another NMS variant that meets your requirements.
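For readers unfamiliar with what such a replacement has to do, below is a minimal sketch of plain greedy NMS in pure Python. It is an illustration of the general technique only: it is not the TensorRT plugin's implementation, not fast NMS, and not code from this repo. The function names and the 0.5 threshold are assumptions. In a torch pipeline one would more likely call an existing routine such as `torchvision.ops.nms`.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))
```

Because the boxes are shared across classes, a sketch like this would be run once per class on that class's scores, or once on the best score per box if class-agnostic suppression is acceptable.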

niaoyu commented 3 years ago

Thanks for your reply. We need to process the yolov5 model in both Python and C++, so it would be best if we could handle both the same way.

I figured out the transformation yesterday. Here is the performance (V100, TensorRT 7, YOLOv5x, batch size 1, resolution 640x384, without resizing):

  1. FPS torch: 38.62693323274236
  2. FPS tensorrt: 130.47280364924086