NVIDIA TensorRT-Model-Optimizer issues

NVIDIA / TensorRT-Model-Optimizer

TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques, including quantization, pruning, and distillation. It compresses deep learning models for downstream deployment frameworks such as TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.

https://nvidia.github.io/TensorRT-Model-Optimizer

Other

576 stars, 43 forks

issues

Quantize custom SD1.5 checkpoint

#8 bobbyohyeah closed 6 months ago
1
Is there a plan to support Jetson AGX Orin?

#7 Vaderpucong closed 5 months ago
4
What does calib_size mean? Is 512x512 supported by default by the engine? Does calib_size mean the max shape?

#6 bigmover closed 6 months ago
2
How does Model Optimizer compare with the default TensorRT 10 optimizations?

#5 yuvraj108c closed 5 months ago
5
What is the difference between nvidia-ammo and TensorRT-Model-Optimizer?

#4 lix19937 closed 6 months ago
1
Quantized TensorRT engine doesn't achieve an inference speed advantage over FP16 on A100

#3 bigmover closed 5 months ago
10
[E] Uncaught exception detected: Unable to open library: libnvinfer_plugin.so.9 due to libnvinfer_plugin.so.9: cannot open shared object file

#2 bigmover closed 6 months ago
0
Update README.md

#1 kevalmorabia97 closed 6 months ago
0