NVIDIA / TensorRT-Model-Optimizer

TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, and distillation. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
https://nvidia.github.io/TensorRT-Model-Optimizer
Other
576 stars, 43 forks

amaxs_values KeyError #47

Open liyandong001 opened 4 months ago

liyandong001 commented 4 months ago

image

cjluo-omniml commented 3 months ago

Could you please provide more info in case we need to reproduce this?

hsjkdjj commented 1 month ago

Have you solved it?