Running inference over the validation dataset, the similar implementation provided in triple-Mu/YOLOv8-TensorRT, which uses the TRT network builder API and includes the NMS plugin as part of the TRT network, produces significantly better results than the implementation in this repo. In the comparison table below I computed the mAP of 3 object categories from the COCO dataset. There is a huge gap in mAP between the two implementations, even though all parameters were kept the same (iou_threshold=0.45, conf_threshold=0.30, precision=FP16, input shape=608x608) while producing these results.
As can be seen from the table below, the image on the left misses two persons that are detected by the implementation shown in the right image. There are many such images where objects are clearly visible but are not detected by this repo's implementation.
Observations: The images cover many different scenarios. Some objects that are very easy to detect are not detected by this repo's implementation, whereas the other implementation detects even some of the most difficult-looking objects in the image.
Possible Cause: I tried changing the inference shape while serializing the engine, and an input resolution of 608x608 produces better results than 640x640. From my observations, the pre-processing function might be the main culprit for this issue. However, I tried implementing the pre-processing function as in here during inference, and the results are still identical. I am not sure if I am missing anything else. Any help on this would be greatly appreciated.
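To make the pre-processing suspicion easier to check, here is a minimal sketch of the letterbox geometry that YOLO-style pipelines commonly use: a uniform resize followed by centered padding, plus the inverse mapping for output boxes. It is not taken from either repo. The names `LetterboxParams`, `letterbox_params` and `to_original_coords` are hypothetical, and centered padding is an assumption (some implementations pad only on the right and bottom). If the two implementations disagree on any of these values for the same image, the boxes will be shifted or scaled and mAP will drop.

```python
from typing import NamedTuple, Tuple


class LetterboxParams(NamedTuple):
    scale: float   # uniform resize factor applied to the source image
    new_w: int     # resized width before padding
    new_h: int     # resized height before padding
    pad_left: int  # padding columns added on the left
    pad_top: int   # padding rows added on the top


def letterbox_params(src_w: int, src_h: int, dst_w: int, dst_h: int) -> LetterboxParams:
    """Compute resize scale and centered padding (assumed convention)."""
    # Use the smaller ratio so the whole image fits inside the network input.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = int(round(src_w * scale))
    new_h = int(round(src_h * scale))
    # Split the leftover space evenly; integer division drops an odd pixel.
    pad_left = (dst_w - new_w) // 2
    pad_top = (dst_h - new_h) // 2
    return LetterboxParams(scale, new_w, new_h, pad_left, pad_top)


def to_original_coords(
    box: Tuple[float, float, float, float],
    p: LetterboxParams,
    src_w: int,
    src_h: int,
) -> Tuple[float, float, float, float]:
    """Map an (x1, y1, x2, y2) box from network input space to the source image."""
    x1, y1, x2, y2 = box
    # Undo the padding first, then undo the resize.
    x1 = (x1 - p.pad_left) / p.scale
    x2 = (x2 - p.pad_left) / p.scale
    y1 = (y1 - p.pad_top) / p.scale
    y2 = (y2 - p.pad_top) / p.scale
    # Clamp to the image bounds so padded regions cannot produce outside boxes.
    x1 = min(max(x1, 0.0), float(src_w))
    x2 = min(max(x2, 0.0), float(src_w))
    y1 = min(max(y1, 0.0), float(src_h))
    y2 = min(max(y2, 0.0), float(src_h))
    return (x1, y1, x2, y2)


if __name__ == "__main__":
    # Example: a 1280x720 frame fed to a 608x608 engine.
    params = letterbox_params(1280, 720, 608, 608)
    print(params)
    print(to_original_coords((0.0, 133.0, 608.0, 475.0), params, 1280, 720))
```

Printing these parameters for the same image in both pipelines, at both 608x608 and 640x640, is a quick way to see whether the scale or padding differs between them.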