ceccocats / tkDNN

Deep neural network library and toolkit to do high-performance inference on NVIDIA Jetson platforms
GNU General Public License v2.0

Added TensorRT8 support #270

Closed TheExDeus closed 2 years ago

TheExDeus commented 2 years ago

I tested on a Xavier NX running JetPack 4.6, and at least the YOLOv4 network runs fine. I haven't tested all of them.

In the future I might do some more refactoring and cleanup, but I'm not sure whether that would diverge too much from what the maintainers are doing here. I think some performance could be gained from a cleanup and from dropping some older compatibility (TensorRT 6, or maybe even 7, should probably be the minimum supported version now).

Also this project really needs a style guide :D

LangArthur commented 2 years ago

Hi!

I'm currently using your fork and got a tiny problem with it.

I'm using the static version of the library. When the static lib is linked to an executable (for example, the demo or test executables included with tkDNN), I get the following error:

```
/usr/bin/ld: libkernels.a(kernels_generated_static_init.cu.o): in function `tk::dnn::YoloRTCreator::YoloRTCreator()':
/path/to/file/tkDNN/include/tkDNN/pluginsRT/YoloRT.h:188: undefined reference to `vtable for tk::dnn::YoloRTCreator'
```

In case it's useful: I'm using GCC 9.3.0.

I found a workaround: remove the `= default` specification from the YoloRTCreator constructor and define it as an empty constructor in yoloContainer.cpp instead. I also added src/yoloContainer.cpp to the compilation of the kernels target. Maybe it should be like this in the library? Or does a better alternative exist?

TheExDeus commented 2 years ago

Thanks for the info! Sadly, I didn't test the static lib. I think your change will do fine. In general I don't think having this global object is a good idea, and there usually isn't a need for one. So the fact that specifically the yolo layer is implemented like this is already questionable, but I don't plan to change that, as then I might as well rewrite most of this from scratch.

mive93 commented 2 years ago

TensorRT8 is now supported on the tensorrt8 branch. Every model and data type works properly.

Thank you @TheExDeus for your work; I have tested and considered it. However, I was already working with @perseusdg to support TRT8, and his implementation was more complete. Still, some of your choices helped us complete the porting properly. So, thank you very much :)