TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, and distillation. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
That script evaluates the input ONNX model after compiling it to a TensorRT engine. It looks like your system does not have TensorRT installed. Try using the Docker container as mentioned here.
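If you want to confirm whether TensorRT's Python bindings are visible before switching to the container, a minimal check like the following works (the tensorrt module name is the standard one shipped with the TensorRT Python wheel; how evaluate_vit.py itself detects TensorRT may differ):

```python
# Minimal sanity check: the import succeeds only when the TensorRT
# Python bindings are installed and loadable on this system.
try:
    import tensorrt as trt
    print("TensorRT version:", trt.__version__)
except ImportError:
    print("TensorRT not found; run inside the TensorRT Docker container instead.")
```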
Running onnx_ptq/evaluate_vit.py fails with: ValueError: Runtime TRT is not supported.
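For context on why this surfaces as a ValueError rather than an import failure: the script appears to treat TRT as a supported runtime only when TensorRT is importable, so on a machine without TensorRT the TRT runtime is simply absent from the supported set. A hypothetical sketch of that kind of guard (illustrative only; the names here are invented, not ModelOpt's actual code):

```python
# Hypothetical illustration of a runtime-availability guard; the function
# and variable names are invented for this sketch, not ModelOpt's API.
def resolve_runtime(name: str) -> str:
    supported = ["ORT"]  # assume the ONNX Runtime path is always available
    try:
        import tensorrt  # noqa: F401  # only importable when TensorRT is installed
        supported.append("TRT")
    except ImportError:
        pass
    if name not in supported:
        raise ValueError(f"Runtime {name} is not supported.")
    return name
```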