Open smermet opened 2 years ago
This code was picked up in different projects available on GitHub (e.g. in RobustMFSRforEO) and I still faced the same error. I modified the code so that the input data corresponds to what the conv1d function expects, hoping not to have introduced an error... Changes are commented with ##.
```python
def lanczos_shift(img, shift, p=3, a=3):
    '''
    Shifts an image by convolving it with a Lanczos kernel.
    Lanczos interpolation is an approximation to ideal sinc interpolation,
    by windowing a sinc kernel with another sinc function extending up to a
    few of its lobes (typically a=3).

    Args:
        img : tensor (batch_size, channels, height, width), the images to be shifted
        shift : tensor (batch_size, 2) of translation parameters (dy, dx)
        p : int, padding width prior to convolution (default=3)
        a : int, number of lobes in the Lanczos interpolation kernel (default=3)
    Returns:
        I_s: tensor (batch_size, channels, height, width), shifted images
    '''
    ## These lines are taken from RobustMFSRforEO
    B, C, H, W = img.shape
    # Because examples and channels are interleaved in dim 1.
    shift = shift.repeat(C, 1).reshape(B * C, 2)
    img = img.view(1, B * C, H, W)

    dtype = img.dtype

    if len(img.shape) == 2:
        img = img[None, None].repeat(1, shift.shape[0], 1, 1)  # batch of one image
    elif len(img.shape) == 3:  # one image per shift
        assert img.shape[0] == shift.shape[0]
        img = img[None, ]

    # Apply padding
    padder = torch.nn.ReflectionPad2d(p)  # reflect pre-padding
    I_padded = padder(img)
    I_padded_reshape = I_padded.view(I_padded.shape[0], I_padded.shape[1], -1)  ## image flattened to work with conv1d

    # Create 1D shifting kernels
    y_shift = shift[:, [0]]
    x_shift = shift[:, [1]]

    k_y = (lanczos_kernel(y_shift, a=a, N=None, dtype=dtype)
           .flip(1)  # flip axis of convolution
           )[:, None, :, None].squeeze(3)  ## squeezed to shape (batch, channels, y_kernel) instead of (batch, channels, y_kernel, 1)
    k_x = (lanczos_kernel(x_shift, a=a, N=None, dtype=dtype)
           .flip(1)
           )[:, None, None, :].squeeze(2)  ## shape (batch, channels, x_kernel) instead of (batch, channels, 1, x_kernel)

    # Apply kernels
    I_s = torch.conv1d(I_padded_reshape,
                       groups=k_y.shape[0],
                       weight=k_y,
                       padding=k_y.shape[2] // 2)  ## previously: [k_y.shape[2] // 2, 0]
    I_s = torch.conv1d(I_s,
                       groups=k_x.shape[0],
                       weight=k_x,
                       padding=k_x.shape[2] // 2)  ## previously: [0, k_x.shape[3] // 2]

    I_s = I_s.view(B, C, H + 2 * p, W + 2 * p)  ## result reshaped to image format
    I_s = I_s[..., p:-p, p:-p]  # remove padding

    return I_s
```
Also: in DeepNetworks/ShiftNet.py I removed the .transpose(0, 1), whose resulting layout does not correspond to the documented (batch_size, channels, height, width) format.
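A sketch of that change, based on the call as it appears in the traceback quoted at the end of this thread (the exact surrounding code and argument names in ShiftNet.transform may differ, so treat this as illustrative):

```python
# In ShiftNet.transform, as quoted in the traceback below:
# before: new_I = lanczos.lanczos_shift(img=I.transpose(0, 1), ...)
# after, keeping the (batch_size, channels, height, width) layout the docstring expects:
new_I = lanczos.lanczos_shift(img=I, shift=thetas)  # 'shift=thetas' is an assumption
```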
Hi, have you tested your solution since then? I encountered the same error; I'm currently running the training process with your changes, but there's nothing to compare the results with, since there's no pretrained model, sadly.
Indeed, this repository does not contain pre-trained models. After a quick review, I see that this competing architecture also uses ShiftNet, with the same code for lanczos_shift, and this time with pre-trained models that can be used for comparison: https://github.com/rarefin/MISR-GRU
I had tested my proposal, but the results were worse with the ShiftNet than without... and below the values reported in the publication... I had given up, hoping for some interaction on this forum to help me move forward!
On rereading my proposal, it seems obvious that the shift cannot be applied twice, for x and y, to the same flattened matrix. I therefore complete my previous proposal with a slight modification between the two 1d convolutions: the image is reconstructed after the first correction, then flattened again, this time transposed, so as to accommodate the correction along the other axis.
... Unfortunately the results are not improved; some confusion must have been introduced somewhere!
```python
def lanczos_shift(img, shift, p=3, a=3):
    '''
    Shifts an image by convolving it with a Lanczos kernel.
    Lanczos interpolation is an approximation to ideal sinc interpolation,
    by windowing a sinc kernel with another sinc function extending up to a
    few of its lobes (typically a=3).

    Args:
        img : tensor (batch_size, channels, height, width), the images to be shifted
        shift : tensor (batch_size, 2) of translation parameters (dy, dx)
        p : int, padding width prior to convolution (default=3)
        a : int, number of lobes in the Lanczos interpolation kernel (default=3)
    Returns:
        I_s: tensor (batch_size, channels, height, width), shifted images
    '''
    B, C, H, W = img.shape
    ## Because examples and channels are interleaved in dim 1.
    shift = shift.repeat(C, 1).reshape(B * C, 2)
    img = img.view(1, B * C, H, W)

    dtype = img.dtype

    if len(img.shape) == 2:
        img = img[None, None].repeat(1, shift.shape[0], 1, 1)  # batch of one image
    elif len(img.shape) == 3:  # one image per shift
        assert img.shape[0] == shift.shape[0]
        img = img[None, ]

    # Apply padding
    padder = torch.nn.ReflectionPad2d(p)  # reflect pre-padding
    I_padded = padder(img)
    I_padded_reshapeX = I_padded.view(I_padded.shape[0], I_padded.shape[1], -1)  ## the images are flattened

    # Create 1D shifting kernels
    y_shift = shift[:, [0]]
    x_shift = shift[:, [1]]

    k_y = (lanczos_kernel(y_shift, a=a, N=None, dtype=dtype)
           .flip(1)  # flip axis of convolution
           )[:, None, :, None].squeeze(3)  # squeezed to shape (batch, channels, y_kernel)
    k_x = (lanczos_kernel(x_shift, a=a, N=None, dtype=dtype)
           .flip(1)
           )[:, None, None, :].squeeze(2)  # shape (batch, channels, x_kernel)

    # Apply kernels
    I_s_reshapeX = torch.conv1d(I_padded_reshapeX,
                                groups=k_y.shape[0],
                                weight=k_y,
                                padding=k_y.shape[2] // 2)  # same padding; previously [k_y.shape[2] // 2, 0]
    I_s = I_s_reshapeX.view(1, B * C, H + 2 * p, W + 2 * p)  ## reconstruction of the padded image
    I_s_reshapeY = I_s.transpose(2, 3).reshape(I_s.shape[0], I_s.shape[1], -1)  ## flattened again after swapping width and height
    I_s_reshapeY = torch.conv1d(I_s_reshapeY,
                                groups=k_x.shape[0],
                                weight=k_x,
                                padding=k_x.shape[2] // 2)  # previously [0, k_x.shape[3] // 2]
    I_s = I_s_reshapeY.reshape(B, C, W + 2 * p, H + 2 * p).transpose(2, 3)  ## image rebuilt, width and height swapped back

    I_s = I_s[..., p:-p, p:-p]  # remove padding

    return I_s
```
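For what it's worth, here is the kind of quick sanity check one could run on any candidate version (my own sketch, not from the repository; it assumes the lanczos_shift and lanczos_kernel definitions above are in scope). For integer shifts the Lanczos kernel reduces to a delta, so a correct implementation should agree with torch.roll away from the borders:

```python
import torch

# Integer (dy, dx) shifts through lanczos_shift should match torch.roll,
# up to border effects from the reflection padding.
img = torch.rand(2, 3, 32, 32, dtype=torch.float64)
shift = torch.tensor([[1., 2.], [1., 2.]], dtype=torch.float64)  # (dy, dx) per example
out = lanczos_shift(img, shift)
ref = torch.roll(img, shifts=(1, 2), dims=(2, 3))  # sign convention is an assumption; flip if needed
m = 4  # ignore a border margin wider than the largest shift
print(torch.allclose(out[..., m:-m, m:-m], ref[..., m:-m, m:-m], atol=1e-6))  # expect True
```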
I think perhaps I have solved it. In my opinion, the author wants to convolve along a single dimension, which is why a list like [0, k_x.shape[3] // 2] was passed as padding. The answer is to use 2d conv, not 1d conv. You also need to fix an in-place operation in the network, which causes backpropagation to fail.
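Concretely, the idea is to keep the original 4D kernels and tuple paddings but call conv2d, which accepts them, instead of conv1d. A minimal sketch of the kernel/convolution section rewritten this way (variable names as in the function above; a sketch, not a tested fix):

```python
# 4D kernels, as in the original repository code
k_y = (lanczos_kernel(y_shift, a=a, N=None, dtype=dtype)
       .flip(1))[:, None, :, None]  # shape (B*C, 1, k, 1)
k_x = (lanczos_kernel(x_shift, a=a, N=None, dtype=dtype)
       .flip(1))[:, None, None, :]  # shape (B*C, 1, 1, k)

# conv2d accepts tuple paddings, unlike conv1d
I_s = torch.conv2d(I_padded, weight=k_y, groups=k_y.shape[0],
                   padding=(k_y.shape[2] // 2, 0))  # shift along y (height)
I_s = torch.conv2d(I_s, weight=k_x, groups=k_x.shape[0],
                   padding=(0, k_x.shape[3] // 2))  # shift along x (width)
```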
Can you show us how you solved it? Cause I did change it to 2d conv, but I got this issue:
```
  0%|          | 0/400 [00:00<?, ?it/s]
C:\Users\Юнсок\AppData\Local\Programs\Python\Python310\lib\site-packages\imageio\plugins\pillow.py:320: UserWarning: Loading 16-bit (uint16) PNG as int32 due to limitations in pillow's PNG decoder. This will be fixed in a future version of pillow which will make this warning dissapear.
  warnings.warn(
torch.Size([1, 8, 202, 202])
C:\Users\Юнсок\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\autograd\__init__.py:251: UserWarning: Error detected in MeanBackward1. Traceback of forward call that caused the error:
  File "C:\Users\Юнсок\Desktop\Research\MISR\HighRes-net-master\src\train.py", line 309, in <module>
    main(config)
  File "C:\Users\Юнсок\Desktop\Research\MISR\HighRes-net-master\src\train.py", line 295, in main
    trainAndGetBestModel(fusion_model, regis_model, optimizer, dataloaders, baseline_cpsnrs, config)
  File "C:\Users\Юнсок\Desktop\Research\MISR\HighRes-net-master\src\train.py", line 177, in trainAndGetBestModel
    shifts = register_batch(regis_model,
  File "C:\Users\Юнсок\Desktop\Research\MISR\HighRes-net-master\src\train.py", line 40, in register_batch
    theta = shiftNet(torch.cat([reference, lrs[:, i : i + 1]], 1))
  File "C:\Users\Юнсок\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\Юнсок\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\Юнсок\Desktop\Research\MISR\HighRes-net-master\src\DeepNetworks\ShiftNet.py", line 66, in forward
    x[:, 1] = x[:, 1] - torch.mean(x[:, 1], dim=(1, 2)).view(-1, 1, 1)
 (Triggered internally at ..\torch\csrc\autograd\python_anomaly_mode.cpp:119.)
  Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  0%|          | 0/131 [00:10<?, ?it/s]
  0%|          | 0/400 [00:10<?, ?it/s]
Traceback (most recent call last):
  File "C:\Users\Юнсок\Desktop\Research\MISR\HighRes-net-master\src\train.py", line 309, in <module>
    main(config)
  File "C:\Users\Юнсок\Desktop\Research\MISR\HighRes-net-master\src\train.py", line 295, in main
    trainAndGetBestModel(fusion_model, regis_model, optimizer, dataloaders, baseline_cpsnrs, config)
  File "C:\Users\Юнсок\Desktop\Research\MISR\HighRes-net-master\src\train.py", line 190, in trainAndGetBestModel
    loss.backward()
  File "C:\Users\Юнсок\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\_tensor.py", line 492, in backward
    torch.autograd.backward(
  File "C:\Users\Юнсок\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\autograd\__init__.py", line 251, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [8, 128, 128]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
```
This is because of an in-place operation in PyTorch. I remember it appeared in ShiftNet.py; you can try to fix it. But I got a bad result, so I gave up.
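If it helps, the line quoted in the traceback above can be rewritten without in-place indexing; a minimal sketch, assuming x holds the two input channels that ShiftNet.forward de-means (shape (batch, 2, H, W)):

```python
# Out-of-place de-meaning of both channels; the original does
# x[:, 0] = ... and x[:, 1] = ..., which mutates x in place and breaks autograd.
x = x - x.mean(dim=(2, 3), keepdim=True)
```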
I want to use your code with the Proba-V dataset, but I'm facing the following error.
```
$ python src/train.py --config config/config.json
  0%|          | 0/261 [00:00<?, ?it/s]
  0%|          | 0/400 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "[...]/HighRes-net/src/train.py", line 308, in <module>
    main(config)
  File "[...]/HighRes-net/src/train.py", line 294, in main
    trainAndGetBestModel(fusion_model, regis_model, optimizer, dataloaders, baseline_cpsnrs, config)
  File "[...]/HighRes-net/src/train.py", line 180, in trainAndGetBestModel
    srs_shifted = apply_shifts(regis_model, srs, shifts, device)[:, 0]
  File "[...]/HighRes-net/src/train.py", line 61, in apply_shifts
    new_images = shiftNet.transform(thetas, images, device=device)
  File "[...]/HighRes-net/src/DeepNetworks/ShiftNet.py", line 96, in transform
    new_I = lanczos.lanczos_shift(img=I.transpose(0, 1),
  File "[...]/HighRes-net/src/lanczos.py", line 96, in lanczos_shift
    I_s = torch.conv1d(I_padded,
RuntimeError: Expected 2D (unbatched) or 3D (batched) input to conv1d, but got input of size: [1, 4, 202, 202]
```
Here are the different values or shapes passed to the conv1d function:

- input (`I_padded`): `torch.Size([1, 4, 202, 202])`
- groups (`k_y.shape[0]` and `k_x.shape[0]`): `4`
- weights (`k_y` and `k_x`): `torch.Size([4, 1, 7, 1])` and `torch.Size([4, 1, 1, 7])`
- padding (`[k_y.shape[2] // 2, 0]` and `[0, k_x.shape[3] // 2]`): `[3, 0]` and `[0, 3]`
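These shapes make the error easy to reproduce in isolation (a minimal snippet of my own, not from the repository):

```python
import torch

x = torch.rand(1, 4, 202, 202)  # same shape as I_padded above
w = torch.rand(4, 1, 7)         # a 3D kernel, as conv1d expects
# torch.conv1d(x, w, groups=4)  # raises: expected 2D/3D input to conv1d, got 4D
out = torch.conv1d(x.flatten(2), w, groups=4, padding=3)  # fine on (1, 4, 202*202)
print(out.shape)  # torch.Size([1, 4, 40804])
```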
I used the default config.json, except for the following parameters.
I tried squeezing the 1st dim of img and the 2nd dim of the weights, and specifying a simple int value for padding, to get past the various error messages, but all I finally got was this new RuntimeError: 'Given groups=4, weight of size [4, 7, 1], expected input[4, 202, 202] to have 28 channels, but got 202 channels instead'.
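If I understand that message, with groups=4 conv1d reads the squeezed weight of size [4, 7, 1] as (out_channels=4, in_channels_per_group=7, kernel_size=1), so it expects 7 * 4 = 28 input channels; a minimal repro of my own:

```python
import torch

x = torch.rand(4, 202, 202)     # read as (batch=4, channels=202, length=202)
w = torch.rand(4, 7, 1)         # squeezed along the wrong axis
# torch.conv1d(x, w, groups=4)  # raises: expected input to have 28 channels, got 202
```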
Any clue to help me?