Closed chandan-labelfuse closed 1 year ago
Hi! Can you please attach the full stack trace, so we can diagnose it better?
Hi,
Thank you for your reply. I have attached the full stack trace below. Let me know if you need anything else.
/usr/local/lib/python3.9/site-packages/pytorch_quantization/nn/modules/tensor_quantizer.py:284: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if amax.numel() == 1:
/usr/local/lib/python3.9/site-packages/pytorch_quantization/nn/modules/tensor_quantizer.py:286: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  inputs, amax.item() / bound, 0,
/usr/local/lib/python3.9/site-packages/pytorch_quantization/utils/reduce_amax.py:61: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if not keepdims or output.numel() == 1:
/usr/local/lib/python3.9/site-packages/pytorch_quantization/nn/modules/tensor_quantizer.py:292: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  quant_dim = list(amax.shape).index(list(amax_sequeeze.shape)[0])
Traceback (most recent call last):
File "/code/app/quantization/yolo_nas_quantization.py", line 29, in <module>
export_quantized_module_to_onnx(
File "/usr/local/lib/python3.9/site-packages/super_gradients/training/utils/quantization/export.py", line 53, in export_quantized_module_to_onnx
torch.onnx.export(export_model, dummy_input, onnx_filename, verbose=False, opset_version=13, do_constant_folding=True, training=training_mode)
File "/usr/local/lib/python3.9/site-packages/torch/onnx/utils.py", line 504, in export
_export(
File "/usr/local/lib/python3.9/site-packages/torch/onnx/utils.py", line 1529, in _export
graph, params_dict, torch_out = _model_to_graph(
File "/usr/local/lib/python3.9/site-packages/torch/onnx/utils.py", line 1111, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args)
File "/usr/local/lib/python3.9/site-packages/torch/onnx/utils.py", line 987, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "/usr/local/lib/python3.9/site-packages/torch/onnx/utils.py", line 891, in _trace_and_get_graph_from_model
trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
File "/usr/local/lib/python3.9/site-packages/torch/jit/_trace.py", line 1184, in _get_trace_graph
outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.9/site-packages/torch/jit/_trace.py", line 127, in forward
graph, out = torch._C._create_graph_by_tracing(
File "/usr/local/lib/python3.9/site-packages/torch/jit/_trace.py", line 118, in wrapper
outs.append(self.inner(*trace_inputs))
File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1182, in _slow_forward
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.9/site-packages/super_gradients/training/models/detection_models/customizable_detector.py", line 85, in forward
x = self.neck(x)
File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1182, in _slow_forward
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.9/site-packages/super_gradients/training/models/detection_models/yolo_nas/panneck.py", line 59, in forward
x_n1_inter, x = self.neck1([c5, c4, c3])
File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1182, in _slow_forward
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.9/site-packages/super_gradients/training/models/detection_models/yolo_nas/yolo_stages.py", line 269, in forward
x = self.upsample(x_inter)
File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1182, in _slow_forward
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.9/site-packages/pytorch_quantization/nn/modules/quant_conv.py", line 342, in forward
output_padding = self._output_padding(input, output_size, self.stride, self.padding, self.kernel_size)
TypeError: _output_padding() missing 1 required positional argument: 'num_spatial_dims'
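The failure mode in this traceback can be reproduced without torch. The sketch below uses hypothetical stand-in classes (`BaseOld`, `BaseNew` and `QuantLayer` are illustration names, not real torch or pytorch_quantization classes): a base-class helper gains a required positional parameter, while a subclass keeps calling it with the old argument list.

```python
class BaseOld:
    """Stand-in for a base layer whose helper takes five arguments."""

    def _output_padding(self, input, output_size, stride, padding, kernel_size):
        return (0, 0)


class BaseNew:
    """Stand-in for a newer base layer: one extra required argument."""

    def _output_padding(self, input, output_size, stride, padding, kernel_size,
                        num_spatial_dims):
        return (0,) * num_spatial_dims


def make_quant_layer(base):
    """Build a subclass that still calls the helper the old way."""

    class QuantLayer(base):
        stride = (2, 2)
        padding = (0, 0)
        kernel_size = (2, 2)

        def forward(self, input, output_size=None):
            # Five arguments only: fine for BaseOld, one short for BaseNew.
            return self._output_padding(
                input, output_size, self.stride, self.padding, self.kernel_size
            )

    return QuantLayer()


def try_forward(layer):
    """Return the result of forward(), or the TypeError text if it fails."""
    try:
        return layer.forward(input=None)
    except TypeError as err:
        return f"TypeError: {err}"


old_result = try_forward(make_quant_layer(BaseOld))
new_result = try_forward(make_quant_layer(BaseNew))
print(old_result)
print(new_result)
```

The second call fails in the same way as the last frame of the traceback: the caller was written against the shorter signature.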
It has something to do with the torch/pytorch-quantization internals, as mentioned in #issues/2964. I also had this problem, but I got it working with torch==1.11.0; I had to keep downgrading torch until it worked. Hopefully they can solve it in a new release soon.
@haritsahm Downgrading the torch version worked. Closing the issue for now, thank you.
Please keep it open. I just came here to say I was hit by this bug while trying to QAT YOLO-NAS-S.
Could be related to ONNX opset version that is set in pytorch when exporting ONNX. That could explain why downgrading torch helps.
In short, this is not a bug on SG side. For now, you can downgrade torch to 1.11, but as a long-term solution we will be making a PR to pytorch_quantization to support newer versions of pytorch.
We have merged a fix to master to patch pytorch_quantization on the fly, so in the 3.2 release you will be able to use it without issue.
We just released https://github.com/Deci-AI/super-gradients/releases/tag/3.2.0 and you are more than welcome to try out the new export API we made for object detection models: https://github.com/Deci-AI/super-gradients/blob/master/documentation/source/models_export.md
Hi, I was facing the same issue when trying to QAT yolo_nas_s on a custom dataset in YOLO format, and I fixed it as follows. Uninstall all the libraries in your virtual environment and reinstall them from requirements.txt. After that, run:
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113 &> /dev/null
pip install pytorch-quantization==2.1.2 --extra-index-url https://pypi.ngc.nvidia.com &> /dev/null
This should solve your problems.
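Before pinning versions, it can help to check whether the installed torch is likely to hit the error at all. The following is a minimal sketch, not an official check: `parse_major_minor` and `likely_affected` are hypothetical helper names, and the 1.12 threshold is an assumption inferred from reports in this thread (1.11.0 works unpatched, while 1.12.0, 1.13.1 and 2.3.0 either fail or need a patch).

```python
import re


def parse_major_minor(version):
    """Extract (major, minor) from strings such as '1.13.1+cu117'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


# Assumption based on the versions reported in this thread, not on an
# official compatibility statement.
FIRST_AFFECTED_TORCH = (1, 12)


def likely_affected(torch_version):
    """Guess whether this torch version needs the downgrade or a patch."""
    return parse_major_minor(torch_version) >= FIRST_AFFECTED_TORCH


for reported in ("1.11.0+cu113", "1.13.1+cu117", "2.3.0"):
    print(reported, likely_affected(reported))
```

In a real environment the argument would be `torch.__version__`.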
Hello, I was facing the same issue and saw that downgrading the PyTorch version to 1.11.0 resolves the issue. However, due to the need to maintain a higher version in my case, this solution was not feasible for me. Instead, I was able to resolve the problem using two different methods:
1. Directly modifying the 'pytorch_quantization' module: in 'pytorch_quantization/nn/modules/quant_conv.py', line 341, change
output_padding = self._output_padding(input, output_size, self.stride, self.padding, self.kernel_size)
to
output_padding = self._output_padding(input, output_size, self.stride, self.padding, self.kernel_size, input.dim() - 2)
2. Patching within the ONNX conversion script: add the following part to patch the '_output_padding' method:
from pytorch_quantization.nn.modules.quant_conv import QuantConvTranspose2d

original_output_padding = QuantConvTranspose2d._output_padding

def patched_output_padding(self, input, output_size, stride, padding, kernel_size):
    num_spatial_dims = len(kernel_size)
    return original_output_padding(self, input, output_size, stride, padding, kernel_size, num_spatial_dims)

QuantConvTranspose2d._output_padding = patched_output_padding

model = models.get("yolo_nas_m", pretrained_weights="coco").cuda()
model = model.eval()

q_util = SelectiveQuantizer(
    default_quant_modules_calibrator_weights="max",
    default_quant_modules_calibrator_inputs="histogram",
    default_per_channel_quant_weights=True,
    default_learn_amax=False,
    verbose=True,
)
q_util.quantize_module(model)
...
These solutions worked well with versions 1.12.0 and 2.3.0. My pytorch_quantization version is 2.1.0.
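A variant of the second method that tolerates both old and new signatures might look like the sketch below. `make_compat_output_padding` is a hypothetical helper written for this example; it inspects the wrapped function and only adds `num_spatial_dims` when the function declares that parameter. The two stand-in functions exist only to exercise the wrapper without torch installed.

```python
import inspect


def make_compat_output_padding(original):
    """Wrap `original` so callers may omit `num_spatial_dims`."""
    params = inspect.signature(original).parameters
    if "num_spatial_dims" not in params:
        # Older signature: nothing to patch.
        return original

    def patched(self, input, output_size, stride, padding, kernel_size,
                num_spatial_dims=None, *args, **kwargs):
        if num_spatial_dims is None:
            # Same fallback as the patch above: one spatial dim per kernel axis.
            num_spatial_dims = len(kernel_size)
        return original(self, input, output_size, stride, padding,
                        kernel_size, num_spatial_dims, *args, **kwargs)

    return patched


# Stand-ins (not torch code) for the old and new helper signatures.
def old_style(self, input, output_size, stride, padding, kernel_size):
    return "old"


def new_style(self, input, output_size, stride, padding, kernel_size,
              num_spatial_dims, dilation=None):
    return num_spatial_dims


wrapped_old = make_compat_output_padding(old_style)
wrapped_new = make_compat_output_padding(new_style)
print(wrapped_old is old_style)
print(wrapped_new(None, None, None, (2, 2), (0, 0), (3, 3)))
```

In the export script this would be applied as `QuantConvTranspose2d._output_padding = make_compat_output_padding(QuantConvTranspose2d._output_padding)`, before quantizing and exporting the model. Callers that already pass `num_spatial_dims` are forwarded unchanged.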
🐛 Describe the bug
The yolo-nas model is quantized using the documentation given at https://docs.deci.ai/super-gradients/documentation/source/ptq_qat.html#post-training-quantization
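After the quantization, the quantized model is converted into ONNX format using export_quantized_module_to_onnx (from super_gradients/training/utils/quantization/export.py).
However, this gives an error:
TypeError: _output_padding() missing 1 required positional argument: 'num_spatial_dims'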
Seems like it is an issue with the underlying tracing library of pytorch. I am not sure how to correct this or patch it.
Versions
Collecting environment information...
PyTorch version: 1.13.1+cu117
Is debug build: False
CUDA used to build PyTorch: 11.7
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.2 LTS (x86_64)
GCC version: (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
Clang version: Could not collect
CMake version: version 3.26.3
Libc version: glibc-2.35
Python version: 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0] (64-bit runtime)
Python platform: Linux-5.15.0-1031-aws-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: Tesla V100-SXM2-16GB
Nvidia driver version: 530.30.02
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
CPU family: 6
Model: 79
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
Stepping: 1
CPU max MHz: 3000.0000
CPU min MHz: 1200.0000
BogoMIPS: 4600.02
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx xsaveopt
Hypervisor vendor: Xen
Virtualization type: full
L1d cache: 128 KiB (4 instances)
L1i cache: 128 KiB (4 instances)
L2 cache: 1 MiB (4 instances)
L3 cache: 45 MiB (1 instance)
NUMA node(s): 1
NUMA node0 CPU(s): 0-7
Vulnerability Itlb multihit: KVM: Mitigation: VMX unsupported
Vulnerability L1tf: Mitigation; PTE Inversion
Vulnerability Mds: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Mmio stale data: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Vulnerability Retbleed: Not affected
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Versions of relevant libraries:
[pip3] numpy==1.23.0
[pip3] pytorch-quantization==2.1.2
[pip3] torch==1.13.1
[pip3] torchmetrics==0.8.0
[pip3] torchvision==0.14.1
[pip3] triton==2.0.0
[pip3] tritonclient==2.33.0
[conda] Could not collect