Open zamazan4ik opened 9 months ago
Do you think this is more geared towards TensorRT itself or the PyTorch extension? This might be more relevant to open in https://github.com/nvidia/pytorch
For this page, I get HTTP 404. Does it have some special access requirements, or is the link just wrong?
Sorry, wrong URL: https://github.com/nvidia/tensorrt
Is your feature request related to a problem? Please describe.
Not a problem, but an idea about how TensorRT performance can be improved.
I have evaluated Profile-Guided Optimization (PGO) and Post-Link Optimization (PLO) on multiple projects. The results are available here. According to the tests, these optimizations help achieve better performance in many cases across many kinds of applications: compilers and interpreters, static analysis tools, databases, networking software, etc. Because of this, I think optimizing TensorRT (its C++ part) with PGO and PLO would be a good idea.
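For readers unfamiliar with the workflow, here is a minimal sketch of the generic Clang instrumentation-PGO stages (instrumented build, training run, profile merge, optimized rebuild). It only assembles the command lines and does not run them. The file names `bench.cpp`, `bench`, and `profiles` are hypothetical placeholders, and nothing here reflects how TensorRT is actually built.

```python
from __future__ import annotations

import shlex


def pgo_commands(source: str, binary: str, profile_dir: str) -> list[list[str]]:
    """Return the commands for a generic Clang PGO build, in execution order.

    All arguments are hypothetical placeholders chosen for illustration.
    """
    merged = f"{profile_dir}/merged.profdata"
    instrumented = f"{binary}.instrumented"
    return [
        # 1. Build an instrumented binary that writes raw profiles to profile_dir.
        ["clang++", "-O2", f"-fprofile-generate={profile_dir}", source, "-o", instrumented],
        # 2. Run the instrumented binary on a representative workload.
        [f"./{instrumented}"],
        # 3. Merge the raw profiles into a single indexed profile.
        ["llvm-profdata", "merge", "-o", merged, profile_dir],
        # 4. Rebuild with the merged profile guiding the optimizer.
        ["clang++", "-O2", f"-fprofile-use={merged}", source, "-o", binary],
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run by hand.
    for cmd in pgo_commands("bench.cpp", "bench", "profiles"):
        print(shlex.join(cmd))
```

The quality of the result depends on how representative the training workload in step 2 is, so for a real project that step would be replaced by a realistic benchmark or inference run.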
Describe the solution you'd like
I can suggest the following things:
Additional context
As an additional optimization step after PGO, I can suggest Post-Link Optimization (PLO) with a tool like LLVM BOLT. However, I think it is worth evaluating only after PGO has been integrated into TensorRT.
Here I collected several PGO-related links (more PGO-related materials available at https://github.com/zamazan4ik/awesome-pgo/).
Examples of how PGO optimization is integrated into other projects:
configure script
I have some examples of how PGO information looks in the documentation:
Regarding LLVM BOLT integration, I have the following examples: