ceccocats / tkDNN

Deep neural network library and toolkit to do high-performance inference on NVIDIA Jetson platforms
GNU General Public License v2.0
717 stars · 209 forks

My tests show that tkDNN is not faster than Darknet, why? #186

Closed yacad closed 2 years ago

yacad commented 3 years ago

Hi.

I ran a test with a roughly 50-second video to compare the speed of Darknet and tkDNN, measuring the time taken from the start to the end of the video.

My test environment:

  1. Ubuntu 18.04, RTX 3090 (CUDA 11.1, cuDNN 8.0.5, TensorRT v7.22)
  2. Network: yolov4, input size 320

Results:

  1. Darknet: 9.5 seconds
  2. tkDNN fp32: 15.5 seconds
  3. tkDNN fp16: 12.4 seconds

Why do I get these results? Why does tkDNN take longer?

Each program's own printed output clearly reported that tkDNN was processing frames much faster:

  1. Darknet image
  2. tkDNN fp32 image
  3. tkDNN fp16 image

So why is tkDNN slower overall?

mive93 commented 3 years ago

Hi @yacad

The times and FPS printed by the demo measure only the inference step, not all the processing for a single frame. This was also explained here: https://github.com/AlexeyAB/darknet/issues/5354#issuecomment-621722304.

This repo aims to optimize network inference, and that is where we are faster; we never claimed to be better in the other parts of the processing. Inference is the only part that is optimized with TensorRT.

Why can the rest be slower?

The demo in this repo is just a use case showing how to use the library, not the way to obtain the best end-to-end performance.

yacad commented 3 years ago

Thank you for the answer. As you suggested, I will test again after optimizing the pre- and post-processing.

mive93 commented 2 years ago

Closing for inactivity. Feel free to reopen.