Open lulululichuan opened 1 year ago
Index 2 is the first None in "return grad_weight, grad_alpha, None, None, None, grad_beta". It corresponds to g in the forward inputs "[weight, alpha, g, Qn, Qp, beta]". g has no gradient, so its gradient should be None.
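To illustrate the positional rule described above, here is a minimal sketch, not the repo's ALSQPlus. `ToyQuant` and its arithmetic are hypothetical; only the argument order `[weight, alpha, g, Qn, Qp, beta]` is taken from this thread. It assumes g, Qn and Qp are passed as plain Python numbers, so backward returns None in their slots.

```python
import torch
from torch.autograd import Function


class ToyQuant(Function):
    """Hypothetical stand-in for ALSQPlus; only the argument layout matches."""

    @staticmethod
    def forward(ctx, weight, alpha, g, Qn, Qp, beta):
        ctx.save_for_backward(weight, alpha, beta)
        ctx.g = g  # non-tensor argument, kept on ctx rather than saved as a tensor
        return weight * alpha + beta

    @staticmethod
    def backward(ctx, grad_output):
        weight, alpha, beta = ctx.saved_tensors
        grad_weight = grad_output * alpha
        # Scaling by g only mimics an LSQ-style gradient scale (an assumption here).
        grad_alpha = (grad_output * weight).sum().reshape(alpha.shape) * ctx.g
        grad_beta = grad_output.sum().reshape(beta.shape) * ctx.g
        # One return value per forward input, in the same order:
        # weight, alpha, g, Qn, Qp, beta
        return grad_weight, grad_alpha, None, None, None, grad_beta


weight = torch.tensor([1.0, 2.0, 3.0, 4.0], requires_grad=True)
alpha = torch.tensor([0.5], requires_grad=True)
beta = torch.tensor([0.1], requires_grad=True)

out = ToyQuant.apply(weight, alpha, 0.5, -8, 7, beta)
out.sum().backward()
```

Each returned gradient has the same shape as the input in the same position, and the non-tensor inputs get None.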
That is the answer regarding backward. RuntimeError: function ALSQPlusBackward returned a gradient different than None at position
Yes, I used the code from your repo, and the ALSQPlus function there originally has "return grad_weight, grad_alpha, None, None, None, grad_beta".
Hi author, thank you for your great work! I ran into a problem when using Lsq+_V2:
Traceback (most recent call last):
  File "train.py", line 273, in <module>
    train(args)
  File "train.py", line 201, in train
    scaler.scale(loss).backward()
  File "/home/work/ssd1/anaconda3/envs/py38/lib/python3.8/site-packages/torch/_tensor.py", line 396, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/work/ssd1/anaconda3/envs/py38/lib/python3.8/site-packages/torch/autograd/__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: Function ALSQPlusBackward returned an invalid gradient at index 2 - got [1] but expected shape compatible with [0]
It seems that the backward function of ALSQPlus has an error. Any advice on how to solve this problem? Thanks in advance!
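For anyone trying to read the message above: "got [1] but expected shape compatible with [0]" says that the gradient returned for the input at that index has shape [1], while the tensor passed to forward at that position had shape [0], i.e. it was empty. The sketch below reproduces the same kind of error on CPU with a deliberately wrong toy function. `BadGrad` is hypothetical and is not the repo's code, so it shows what the message means rather than why it happens in Lsq+_V2.

```python
import torch
from torch.autograd import Function


class BadGrad(Function):
    """Toy function whose backward returns a gradient of the wrong shape."""

    @staticmethod
    def forward(ctx, x, s):
        return x.clone()

    @staticmethod
    def backward(ctx, grad_output):
        # Wrong on purpose: s was empty (shape [0]) in forward,
        # but the gradient returned for it has shape [1].
        return grad_output, torch.ones(1)


x = torch.ones(3, requires_grad=True)
s = torch.zeros(0, requires_grad=True)  # empty tensor, shape [0]

try:
    BadGrad.apply(x, s).sum().backward()
    message = ""
except RuntimeError as err:
    message = str(err)

print(message)
```

If the real run behaves the same way, printing the shape of each tensor argument inside ALSQPlus.forward on every device should show which input arrives empty.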
Hi lulululichuan, have you solved this problem?
This problem occurs when training on multiple GPUs; it does not occur on a single GPU.