TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
Hi,

I have just installed the TensorRT Model Optimizer using pip install "nvidia-modelopt[all]" --no-cache-dir --extra-index-url https://pypi.nvidia.com. I was then using it to quantize an ONNX model following the documentation listed here. I have also installed TensorRT 10 using pip install tensorrt. However, when I run
import modelopt.onnx.quantization as moq

moq.quantize(
    onnx_path=onnx_path,
    calibration_data=calibration_data,
    output_path="quant.onnx",
    quantize_mode="int8",
)
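In case the input format matters: calibration_data is prepared as in the documentation's example, i.e. a NumPy array loaded from disk (the file name below is only a placeholder for my actual data):

import numpy as np

# Calibration inputs saved ahead of time as a NumPy array.
# "calib_data.npy" is a placeholder path, not my real file name.
calibration_data = np.load("calib_data.npy")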
I got AttributeError: module 'modelopt.onnx.quantization' has no attribute 'quantize'. Could you please advise? Thanks!