I converted my darknet weights into two TensorRT detector engines: one for single-batch inference and one for 4-batch inference. Both networks use the same input resolution for my yolov4, 1920 x 1024. With the single-batch engine I get an inference time of 0.14 seconds per image on my NVIDIA GTX 1660 Ti GPU, whereas with the 4-batch engine I get 0.53 seconds per batch, which is almost equal to the single-batch time multiplied by 4. Is it normal for batch inference to yield so little performance gain, and if so, what use cases would batch inference actually serve well?
@jkjung-avt
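For reference, here is the per-image arithmetic behind the "almost equal to single inference * 4" observation, using the timings quoted above (0.14 s and 0.53 s are the measurements from this post; the script itself is just illustrative arithmetic, not a benchmark):

```python
# Timings reported above for a GTX 1660 Ti running yolov4 TensorRT engines.
single_batch_time = 0.14   # seconds per image, batch size 1
batch4_total_time = 0.53   # seconds for one batch of 4 images

# Amortized per-image cost of the batched engine.
per_image_batched = batch4_total_time / 4

# Relative speedup of batching over running 4 single-batch inferences.
speedup = single_batch_time / per_image_batched

print(f"per-image time (batch 4): {per_image_batched:.4f} s")
print(f"speedup over batch 1:     {speedup:.3f}x")
```

This works out to roughly a 1.06x speedup, i.e. only about 5% faster per image, which matches the observation that the batched engine gives very little gain here.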