ceccocats / tkDNN

Deep neural network library and toolkit to do high performance inference on NVIDIA Jetson platforms
GNU General Public License v2.0

LOW FPS on Xavier #185

Closed sharoseali closed 2 years ago

sharoseali commented 3 years ago

Hi, I tested my own custom data with 12 classes on YOLOv4 and exported it to a yolo4-FP32.rt file. During the conversion I observed this error in the debug output:

====== CUDNN inference ======
Data dim: 1 3 416 416 1
Data dim: 1 51 13 13 1

===== TENSORRT inference ====
Data dim: 1 3 416 416 1
Data dim: 1 51 13 13 1

=== OUTPUT 0 CHECK RESULTS ==
Error reading file yolo4/debug/layer139_out.bin with n of float: 551616 seek: 0 size: 2206464

/home/xavier/Downloads/tkDNN/src/utils.cpp:58
Aborting...

However, the .rt file is still generated. When I test it on a video, I get these results with batch size 1:

Time stats:
Min: 59.4498 ms
Max: 215.051 ms
Avg: 75.7361 ms 13.2037 FPS

For FP16, the results with batch size 1 are:

Time stats:
Min: 27.9283 ms
Max: 88.9858 ms
Avg: 34.4137 ms 29.0582 FPS

I would like to confirm two things. First, the FPS is lower than what is reported in the repo. Second, how is FPS calculated here, and what are your criteria? Normally in Python we calculate it as 1 / (end_time - start_time), measured around the inference statements. Could you please guide me? @ceccocats @mive93 Thanks
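For reference, the figures in the time stats above are consistent with FPS = 1000 / average latency in ms (1000 / 75.7361 ≈ 13.2037 and 1000 / 34.4137 ≈ 29.0582). The Python sketch below illustrates that conversion together with the per-frame timing approach described in the question. It is only an illustration: `fps_from_avg_ms`, `measure_fps`, `infer`, and `warmup` are hypothetical names, not part of tkDNN, and nothing in this thread states which stages (pre-processing, inference, post-processing) tkDNN includes in its timing.

```python
import time


def fps_from_avg_ms(avg_ms):
    """Convert an average per-frame latency in milliseconds to frames per second."""
    return 1000.0 / avg_ms


def measure_fps(infer, frames, warmup=0):
    """Time infer(frame) for each frame and return (min_ms, max_ms, avg_ms, fps).

    `infer` is a hypothetical callable standing in for the inference call.
    The first `warmup` frames are run but excluded from the statistics.
    """
    times_ms = []
    for i, frame in enumerate(frames):
        start = time.perf_counter()
        infer(frame)  # if this call is asynchronous, synchronize before stopping the clock
        end = time.perf_counter()
        if i >= warmup:
            times_ms.append((end - start) * 1000.0)
    if not times_ms:
        raise ValueError("no frames were timed")
    avg_ms = sum(times_ms) / len(times_ms)
    return min(times_ms), max(times_ms), avg_ms, fps_from_avg_ms(avg_ms)


# Averages taken from the time stats posted above.
print(round(fps_from_avg_ms(75.7361), 4))  # FP32 average
print(round(fps_from_avg_ms(34.4137), 4))  # FP16 average
```

Averaging the per-frame times first and then converting gives a throughput figure that a single slow frame (such as the 215 ms maximum above) does not dominate. Computing 1 / (end_time - start_time) for one frame instead reflects only that frame.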

mive93 commented 3 years ago

Hi @sharoseali,

Check out this comment: https://github.com/ceccocats/tkDNN/issues/186#issuecomment-765567678. If you still have doubts, feel free to ask.

mive93 commented 2 years ago

Closing for inactivity. Feel free to reopen.