TL;DR
Runtime optimization in torch-tensorrt is crucial for maximizing model performance in real-world applications. This story tracks the effort to improve runtime performance.
Goal(s)
Understand the overhead in the C++ and Python runtime modules and improve inference performance (a rough measurement sketch follows below).
Ensure the optimizations have no or minimal impact on accuracy and resource usage.
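As a rough baseline for quantifying runtime overhead, a minimal benchmarking sketch like the one below can compare eager PyTorch against a torch-tensorrt compiled module. The model (torchvision resnet18), input shape, and iteration counts are illustrative assumptions, not part of this story; it assumes a CUDA GPU and a working torch-tensorrt install.

```python
import time
import torch
import torch_tensorrt
import torchvision.models as models

# Placeholder model and input purely for illustration.
model = models.resnet18(weights=None).eval().cuda()
example_input = torch.randn(1, 3, 224, 224, device="cuda")

# Compile with torch-tensorrt for a fixed input shape.
trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224))],
    enabled_precisions={torch.float},
)

def benchmark(fn, x, iters=100, warmup=10):
    # Warm up so one-time costs (engine setup, CUDA init) are excluded,
    # then time the steady-state per-iteration latency in milliseconds.
    with torch.no_grad():
        for _ in range(warmup):
            fn(x)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            fn(x)
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters * 1e3

print(f"eager:          {benchmark(model, example_input):.3f} ms/iter")
print(f"torch-tensorrt: {benchmark(trt_model, example_input):.3f} ms/iter")
```

The gap between these two numbers, minus the pure TensorRT engine execution time, is the runtime-module overhead this story targets.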
Tasks
Additional context