TL;DR
Runtime optimization in torch-tensorrt is crucial for maximizing model performance in real-world applications. This story tracks the effort to improve runtime performance.
Goal(s)
Understand the overhead in the C++ and Python runtime modules and improve inference performance (a rough measurement sketch follows below).
Ensure the optimizations have no or minimal impact on accuracy and resource usage.
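As a rough baseline for quantifying runtime overhead, a minimal benchmarking sketch like the one below can compare eager PyTorch against a torch-tensorrt compiled module. The model (torchvision resnet18), input shape, and iteration counts are illustrative assumptions, not part of this story; it assumes a CUDA GPU and a working torch-tensorrt install.

```python
import time
import torch
import torch_tensorrt
import torchvision.models as models

# Placeholder model and input purely for illustration.
model = models.resnet18(weights=None).eval().cuda()
example_input = torch.randn(1, 3, 224, 224, device="cuda")

# Compile with torch-tensorrt for a fixed input shape.
trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224))],
    enabled_precisions={torch.float},
)

def benchmark(fn, x, iters=100, warmup=10):
    # Warm up so one-time costs (engine setup, CUDA init) are excluded,
    # then time the steady-state per-iteration latency in milliseconds.
    with torch.no_grad():
        for _ in range(warmup):
            fn(x)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            fn(x)
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters * 1e3

print(f"eager:          {benchmark(model, example_input):.3f} ms/iter")
print(f"torch-tensorrt: {benchmark(trt_model, example_input):.3f} ms/iter")
```

The gap between these two numbers, minus the pure TensorRT engine execution time, is the runtime-module overhead this story targets.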
Tasks
Additional context