NVIDIA / TensorRT-Incubator

Experimental projects related to TensorRT

Pipe NVTX markers from Tripy to TRT engine #176

Open parthchadha opened 2 months ago

parthchadha commented 2 months ago

Goal: get kernel information for each frontend layer, to make performance debugging and directed unit tests easier (e.g., checking that an MHA layer maps to a single fused kernel).

Tripy already passes the loc attribute to the StableHLO program it generates. We need to pipe that attribute information into the TRT layer metadata and then correlate the kernel information with the frontend APIs.
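
A rough sketch of the correlation step, to make the idea concrete. It assumes the location string ends up in a per-layer "Metadata" field of a JSON dump with a top-level "Layers" list, and that the string looks like loc("name"). Both the JSON shape and the loc format here are assumptions for illustration, as are the helper names and the sample data; none of this is Tripy's or TensorRT's actual API.

```python
import json
import re
from collections import defaultdict

# Assumed format: an MLIR-style name location such as loc("attention").
_LOC_PATTERN = re.compile(r'loc\("([^"]+)"')


def frontend_op_from_metadata(metadata):
    """Extract the frontend op name from a metadata string (hypothetical format)."""
    match = _LOC_PATTERN.search(metadata)
    return match.group(1) if match else "<unknown>"


def group_layers_by_frontend_op(engine_info_json):
    """Group engine layer names by the frontend op recorded in their metadata.

    Assumes a JSON document shaped like {"Layers": [{"Name": ..., "Metadata": ...}]}.
    """
    info = json.loads(engine_info_json)
    groups = defaultdict(list)
    for layer in info.get("Layers", []):
        op = frontend_op_from_metadata(layer.get("Metadata", ""))
        groups[op].append(layer.get("Name", "<unnamed>"))
    return dict(groups)


# Made-up sample data, only to exercise the helpers above.
SAMPLE_ENGINE_INFO = json.dumps({
    "Layers": [
        {"Name": "fused_mha_kernel", "Metadata": 'loc("attention")'},
        {"Name": "gemm_0", "Metadata": 'loc("mlp.linear1")'},
        {"Name": "gelu_0", "Metadata": 'loc("mlp.linear1")'},
        {"Name": "reformat_0", "Metadata": ""},
    ]
})

if __name__ == "__main__":
    for op, layers in group_layers_by_frontend_op(SAMPLE_ENGINE_INFO).items():
        print(op, "->", layers)
```

With a mapping like this, a directed unit test could assert that the entry for the MHA layer contains exactly one engine layer.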

pranavm-nvidia commented 1 month ago

https://github.com/NVIDIA/TensorRT-Incubator/pull/191