Why is PyTorch batch processing faster than TensorRT?
PyTorch results (ViT average inference time)
- Batch size=1: `mean = 4.11679 ms`
- Batch size=2: `mean = 4.06116 ms`
- Batch size=4: `mean = 4.14365 ms`
- Batch size=8: `mean = 4.67113 ms`
- Batch size=16: `mean = 4.78237 ms`
- Batch size=32: `mean = 5.77896 ms`
- Batch size=64: `mean = 8.48717 ms`
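Note that the PyTorch times barely grow with batch size, which is a common symptom of timing asynchronous CUDA launches instead of actual execution. For reference, here is a minimal sketch of a synchronized PyTorch timing loop; the model variant (`vit_b_16`), input shape, and iteration counts are assumptions, since the original post does not show the benchmarking code:

```python
import time

import torch
import torchvision

# Assumed model and input shape; the post does not specify which ViT was benchmarked.
device = torch.device("cuda")
model = torchvision.models.vit_b_16().eval().to(device)

for batch_size in (1, 2, 4, 8, 16, 32, 64):
    x = torch.randn(batch_size, 3, 224, 224, device=device)

    # Warm-up so one-time CUDA/cuDNN setup does not skew the measurement.
    with torch.no_grad():
        for _ in range(10):
            model(x)

    torch.cuda.synchronize()  # drain queued kernels before starting the clock
    start = time.time()
    with torch.no_grad():
        for _ in range(100):
            model(x)
    torch.cuda.synchronize()  # wait for completion before stopping the clock

    elapsed_ms = (time.time() - start) / 100 * 1000
    print(f"batch size: {batch_size}, ViT average inference time: {elapsed_ms:.5f} ms")
```

Without the `torch.cuda.synchronize()` calls, this loop would only measure kernel-launch overhead, which stays nearly flat as the batch grows.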
TensorRT results
- Batch size=1: `mean = 1.69581 ms (10 iterations)`
- Batch size=2: `mean = 2.72586 ms (10 iterations)`
- Batch size=4: `mean = 4.42871 ms (10 iterations)`
- Batch size=8: `mean = 7.35761 ms (10 iterations)`
- Batch size=16: `mean = 13.8995 ms (10 iterations)`
- Batch size=32: `mean = 26.7512 ms (10 iterations)`
- Batch size=64: `mean = 52.3231 ms (10 iterations)`
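The TensorRT numbers read like `trtexec` output, and they scale roughly linearly with batch size, as real GPU execution time should. A hedged example of commands that produce this kind of `mean = ... ms` report, assuming an ONNX-exported ViT with a dynamic batch axis (the file names and the input tensor name `input` are placeholders, not taken from the original post):

```shell
# Build an engine from an ONNX export with a dynamic batch dimension.
trtexec --onnx=vit.onnx \
        --minShapes=input:1x3x224x224 \
        --optShapes=input:8x3x224x224 \
        --maxShapes=input:64x3x224x224 \
        --saveEngine=vit.engine

# Time one batch size; trtexec reports the mean latency over the measured iterations.
trtexec --loadEngine=vit.engine \
        --shapes=input:32x3x224x224 \
        --iterations=10
```

Since trtexec synchronizes around each enqueued batch, its per-batch mean is directly comparable only to a PyTorch loop that also synchronizes, as in the sketch above.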