ceccocats / tkDNN

Deep neural network library and toolkit to do high-performance inference on NVIDIA Jetson platforms
GNU General Public License v2.0
717 stars · 209 forks

My tests show that tkDNN is not faster than Darknet, why? #186

Closed yacad closed 2 years ago

yacad commented 3 years ago

Hi.

I ran a test with a roughly 50-second video to compare the speed of Darknet and tkDNN, measuring the time taken from the start to the end of the video.

My test environment:

  1. Ubuntu 18.04, RTX 3090 (CUDA 11.1, cuDNN 8.0.5, TensorRT v7.22)
  2. Network: yolov4, input size 320

Results:

  1. Darknet: 9.5 seconds
  2. tkDNN fp32: 15.5 seconds
  3. tkDNN fp16: 12.4 seconds

Why do I get these results? Why does tkDNN take longer?

Each program's own printed output clearly reported that tkDNN was processing frames much faster:

  1. Darknet image
  2. tkDNN fp32 image
  3. tkDNN fp16 image

So why is tkDNN slower overall?

mive93 commented 3 years ago

Hi @yacad

The times and FPS printed by the demo measure only the inference step, not all the processing for a single frame. This was also explained here: https://github.com/AlexeyAB/darknet/issues/5354#issuecomment-621722304.

This repo aims to optimize network inference, and that is where we are faster; we never claimed to be better in the other parts of the processing. Inference is the only part that is optimized with TensorRT.

Why can the rest be slower?

The demo in this repo is just a use case showing how to use the library, not the way to obtain the best end-to-end performance.

yacad commented 3 years ago

Thank you for the answer. As you suggested, I will test again after optimizing the pre- and post-processing.

mive93 commented 2 years ago

Closing for inactivity. Feel free to reopen.