lix19937/tensorrt-insight

TensorRT 是Nvidia 推出的跨 nv-gpu架构的半开源高性能AI 推理引擎框架/库，提供了cpp/python接口，以及用户自定义插件方法，涵盖了AI 推理引擎技术的主要方面。

TensorRT is a semi-open source high-performance AI inference engine framework/library developed by Nvidia, which spans across nv-gpu architectures.
Provides cpp/python interfaces and user-defined plugin methods, covering the main aspects of AI inference engine technology.

topic	主题	notes
overview	概述
layout	内存布局
compute_graph_optimize	计算图优化
dynamic_shape	动态shape
plugin	插件
calibration	标定
asp	稀疏
qat	量化感知训练
trtexec	OSS辅助工具
tool	辅助脚本
runtime	运行时
inferflow	模型调度
mps	MPS
deploy	基于onnx部署流程， trt 工具使用
py-tensorrt	python tensorrt封装	解析 tensorrt `__init__`
cookbook	食谱
incubator	孵化器
developer_guide	开发者指导
triton-inference-server	triton
cuda	cuda编程
onnxruntime op	onnxrt 自定义op	辅助图优化，layer输出对齐

Reference

https://docs.nvidia.com/deeplearning/tensorrt/archives/
https://developer.nvidia.com/search?page=1&sort=relevance&term=
https://github.com/HeKun-NVIDIA/TensorRT-Developer_Guide_in_Chinese/tree/main
https://docs.nvidia.com/deeplearning/tensorrt/migration-guide/index.html
https://developer.nvidia.com/zh-cn/blog/nvidia-gpu-fp8-training-inference/

lix19937 / tensorrt-insight

readme

Reference