NVIDIA / TensorRT-Model-Optimizer

TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
https://nvidia.github.io/TensorRT-Model-Optimizer

Why "Smoothed 0 modules" when quantizing SD1.5 #61

Closed 2524378044 closed 2 weeks ago

2524378044 commented 3 months ago

[Screenshot: 截图20240826112225]

2524378044 commented 3 months ago

=lora_scale)
  File "/root/anaconda3/envs/qdiff/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/anaconda3/envs/qdiff/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/anaconda3/envs/qdiff/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/root/anaconda3/envs/qdiff/lib/python3.8/site-packages/diffusers/models/resnet.py", line 225, in forward
    hidden_states = self.conv1(hidden_states, scale) if not USE_PEFT_BACKEND else self.conv1(hidden_states)
  File "/root/anaconda3/envs/qdiff/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/anaconda3/envs/qdiff/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/anaconda3/envs/qdiff/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/root/anaconda3/envs/qdiff/lib/python3.8/site-packages/modelopt/torch/quantization/nn/modules/quant_module.py", line 85, in forward
    return super().forward(input, *args, **kwargs)
  File "/root/anaconda3/envs/qdiff/lib/python3.8/site-packages/modelopt/torch/quantization/nn/modules/quant_module.py", line 39, in forward
    input = self.input_quantizer(input)
  File "/root/anaconda3/envs/qdiff/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/anaconda3/envs/qdiff/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/anaconda3/envs/qdiff/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/root/anaconda3/envs/qdiff/lib/python3.8/site-packages/modelopt/torch/quantization/nn/modules/tensor_quantizer.py", line 548, in forward
    self._check_onnx_readiness(inputs)
  File "/root/anaconda3/envs/qdiff/lib/python3.8/site-packages/modelopt/torch/quantization/nn/modules/tensor_quantizer.py", line 377, in _check_onnx_readiness
    assert hasattr(self, "_amax"), (
AssertionError: Quantizer has not been calibrated. ONNX export requires the quantizer to be calibrated. Calibrate and load amax before exporting to ONNX.
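For context on what the assertion is demanding: it fires because the quantizer's `_amax` buffer was never populated before export. "Calibration" means running representative inputs through the model and recording the maximum absolute value seen per tensor, from which the INT8 scale is derived. A minimal illustrative sketch of that idea (a toy stand-in, not the modelopt implementation; the class and method names here are hypothetical):

```python
# Toy illustration of amax calibration, assuming per-tensor absolute-max
# calibration and a symmetric INT8 scale. Not modelopt code.

class ToyQuantizer:
    def __init__(self):
        self.amax = None  # plays the role of modelopt's `_amax` buffer

    def calibrate(self, batches):
        # Track the running maximum of |x| over all calibration data.
        for batch in batches:
            batch_amax = max(abs(x) for x in batch)
            self.amax = batch_amax if self.amax is None else max(self.amax, batch_amax)

    def export_scale(self):
        # Mirrors the readiness check in the traceback: no amax, no export.
        assert self.amax is not None, "Quantizer has not been calibrated."
        return 127.0 / self.amax  # symmetric INT8 scale from amax

q = ToyQuantizer()
q.calibrate([[0.5, -2.0], [1.5, 0.25]])
print(q.export_scale())  # 127 / 2.0 = 63.5
```

In the real library, this state is filled in by running the quantization/calibration step over calibration data before attempting ONNX export; skipping it leaves `_amax` unset and triggers exactly this assertion.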

leya516 commented 3 months ago

@2524378044 Updating nvidia-modelopt to 0.15.1 solved my problem.
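If it helps anyone hitting this later, a quick way to check whether the installed nvidia-modelopt is older than the 0.15.1 release mentioned above (the `needs_upgrade` helper here is just an illustration, comparing dotted version strings numerically):

```python
# Hypothetical helper: returns True if `installed` is older than `required`.
# 0.15.1 is the nvidia-modelopt version reported above to fix this assertion.
def needs_upgrade(installed: str, required: str = "0.15.1") -> bool:
    to_tuple = lambda v: tuple(int(x) for x in v.split("."))
    return to_tuple(installed) < to_tuple(required)

# Example: check the version reported by `pip show nvidia-modelopt`.
print(needs_upgrade("0.15.0"))  # True  -> upgrade needed
print(needs_upgrade("0.15.1"))  # False -> already on the fixed release
```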

2524378044 commented 3 months ago

@2524378044 Updating nvidia-modelopt to 0.15.1 solved my problem.

It does work! Thanks a lot!

jingyu-ml commented 2 weeks ago

Closing this issue; feel free to reopen if necessary.