ceccocats / tkDNN

Deep neural network library and toolkit to do high performance inference on NVIDIA Jetson platforms
GNU General Public License v2.0

Here it is. #170

Closed by KangGrandesty 3 years ago

KangGrandesty commented 3 years ago

Here it is. Results on Xavier AGX, JetPack 4.3 (CUDA 10.0, cuDNN 7.6.3, TensorRT 6.0.1), for yolo4tiny 416x416, on 1200 images of size 416x416.

| model | precision | batch | avg (ms) | min (ms) | max (ms) | avg FPS |
|---|---|---|---|---|---|---|
| yolo4tiny | fp32 | 1 | 6.36684 | 6.31811 | 6.48507 | 157.064 |
| yolo4tiny | fp32 | 4 | 5.61027 | 5.58927 | 5.63641 | 178.244 |
| yolo4tiny | fp16 | 1 | 3.48334 | 3.44269 | 3.56074 | 287.081 |
| yolo4tiny | fp16 | 4 | 2.63374 | 2.61526 | 2.65826 | 379.688 |
| yolo4tiny | int8 | 1 | 3.13312 | 3.08334 | 3.24114 | 319.17 |
| yolo4tiny | int8 | 4 | 2.33578 | 2.32111 | 2.359 | 428.122 |
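
For anyone comparing these rows, here is a minimal Python sketch of how the columns appear to relate. It assumes that avg FPS is simply 1000 divided by the average latency in milliseconds. That relation is inferred from the numbers themselves, not stated in the post, and the post does not say whether the latency is per image or per batch. The helper names `fps_from_latency_ms` and `speedup` are hypothetical and not part of tkDNN.

```python
# (precision, batch) -> (avg latency in ms, reported avg FPS), copied from the results above.
RESULTS = {
    ("fp32", 1): (6.36684, 157.064),
    ("fp32", 4): (5.61027, 178.244),
    ("fp16", 1): (3.48334, 287.081),
    ("fp16", 4): (2.63374, 379.688),
    ("int8", 1): (3.13312, 319.17),
    ("int8", 4): (2.33578, 428.122),
}


def fps_from_latency_ms(avg_ms: float) -> float:
    """Assumed relation: frames per second = 1000 ms / average latency in ms."""
    return 1000.0 / avg_ms


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times faster the candidate is than the baseline, by avg latency."""
    return baseline_ms / candidate_ms


if __name__ == "__main__":
    baseline_ms = RESULTS[("fp32", 1)][0]
    for (precision, batch), (avg_ms, reported_fps) in RESULTS.items():
        derived_fps = fps_from_latency_ms(avg_ms)
        print(
            f"{precision} batch={batch}: "
            f"derived {derived_fps:.3f} FPS, reported {reported_fps} FPS, "
            f"{speedup(baseline_ms, avg_ms):.2f}x vs fp32 batch=1"
        )
```

Running the script prints the derived FPS next to the reported value for each configuration, along with the latency ratio against the fp32, batch 1 row.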

Originally posted by @mive93 in https://github.com/ceccocats/tkDNN/issues/59#issuecomment-652420971