f-dangel / unfoldNd

(N=1,2,3)-dimensional unfold (im2col) and fold (col2im) in PyTorch

[BUG] Padding != 0 leads to inconsistent results for fold #30

Closed: avdhoeke closed this issue 7 months ago

avdhoeke commented 8 months ago

It seems like changing the padding argument in the example makes the outputs of torch.nn.functional.fold and FoldNd diverge:

import torch
from unfoldNd import FoldNd

torch.manual_seed(0)

# random output of an im2col operation
inputs = torch.randn(64, 3 * 2 * 2, 5 * 9)
output_size = (4, 8)

# other module hyperparameters
kernel_size = 2
dilation = 1
padding = 1
stride = 1

fold = FoldNd(
    output_size, kernel_size, dilation=dilation, padding=padding, stride=stride
)

torch_outputs = torch.nn.functional.fold(
    inputs, output_size, kernel_size, dilation=dilation, padding=padding, stride=stride
)
fold_outputs = fold(inputs)

# check
if torch.allclose(torch_outputs, fold_outputs):
    print("✔ Outputs of torch.nn.Fold and unfoldNd.FoldNd match.")
else:
    raise AssertionError("❌ Outputs don't match")
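For reference, the number of columns in inputs matches these hyperparameters: per the torch.nn.Unfold documentation, each spatial dimension yields floor((size + 2 * padding - dilation * (kernel_size - 1) - 1) / stride) + 1 block positions, so 5 * 9 = 45 columns here. A quick sanity check (the helper name is mine, not part of either library):

def num_blocks(size, kernel_size, dilation=1, padding=0, stride=1):
    # number of sliding-block positions along one spatial dimension,
    # following the formula from the torch.nn.Unfold documentation
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

assert num_blocks(4, 2, padding=1) == 5 and num_blocks(8, 2, padding=1) == 9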

What is even more surprising is that the two tensors agree everywhere except for the top-left entry of the first channel:

torch_outputs[0][0]
tensor([[-0.2195, -1.2093, -1.1665,  1.4212,  2.1039,  4.7070, -5.2776, -2.8981],
        [-2.6310, -1.5485, -0.6194,  2.2140,  3.6784, -0.2645,  2.1187, -0.3100],
        [-1.8775, -0.0824, -0.3296, -2.2665, -3.8009,  1.2310, -0.2820,  2.9838],
        [ 0.1796, -0.3610,  0.8178, -0.2229,  3.0753,  0.9655,  0.8486,  1.6550]])
fold_outputs[0][0]
tensor([[-1.3622, -1.2093, -1.1665,  1.4212,  2.1039,  4.7070, -5.2776, -2.8981],
        [-2.6310, -1.5485, -0.6194,  2.2140,  3.6784, -0.2645,  2.1187, -0.3100],
        [-1.8775, -0.0824, -0.3296, -2.2665, -3.8009,  1.2310, -0.2820,  2.9838],
        [ 0.1796, -0.3610,  0.8178, -0.2229,  3.0753,  0.9655,  0.8486,  1.6550]])
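One way to enumerate the disagreeing entries programmatically (using the tensors from the script above):

# indices of all entries where the two outputs disagree
mismatch = (~torch.isclose(torch_outputs, fold_outputs)).nonzero()
print(mismatch)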

Any idea where this may come from?

f-dangel commented 8 months ago

Hi,

thanks for reporting this. I've successfully reproduced your bug on this branch.

I did a quick check of the code to rule out the possibility that the padding argument is not being passed down correctly.

I need to think about what causes this bug; any help would be appreciated. It seems like the differing outputs from PyTorch's fold are much smaller in magnitude.
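While debugging, it may help to compare against a fold written directly from its definition. The sketch below (my own reference code, not the library's implementation; it assumes square integer hyperparameters) scatter-adds each column onto a canvas of size output_size + 2 * padding and then crops the padded border, which is how torch.nn.functional.fold treats padding:

import torch

def fold2d_reference(inputs, output_size, kernel_size, dilation=1, padding=0, stride=1):
    # Scatter-add each sliding block onto a padded canvas, then crop the padding.
    N, CK, L = inputs.shape
    C = CK // (kernel_size**2)
    H, W = output_size
    Hp, Wp = H + 2 * padding, W + 2 * padding
    out = torch.zeros(N, C, Hp, Wp, dtype=inputs.dtype, device=inputs.device)

    blocks = inputs.view(N, C, kernel_size, kernel_size, L)
    span = dilation * (kernel_size - 1) + 1  # spatial extent of one (dilated) kernel
    idx = 0
    for i in range(0, Hp - span + 1, stride):  # block positions, row-major like unfold
        for j in range(0, Wp - span + 1, stride):
            out[:, :, i : i + span : dilation, j : j + span : dilation] += blocks[..., idx]
            idx += 1

    # cropping discards whatever fold accumulated in the padded border
    return out[:, :, padding : padding + H, padding : padding + W]

On the snippet from the report, this agrees with PyTorch:

ref = fold2d_reference(inputs, output_size, kernel_size, dilation=dilation, padding=padding, stride=stride)
assert torch.allclose(ref, torch_outputs)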

f-dangel commented 7 months ago

I think I've found and fixed the problem. Could you install from the bug30-fold-with-padding branch and verify that this fixes it?
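For reference, installing from that branch should work with pip's standard git syntax:

pip install git+https://github.com/f-dangel/unfoldNd.git@bug30-fold-with-padding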

Cheers, Felix

avdhoeke commented 7 months ago

Problem fixed. I also manually checked using another configuration:

import torch
from unfoldNd import FoldNd

# random output of an im2col operation
inputs = torch.randn(64, 3 * 2 * 2, 7 * 11)
output_size = (4, 8)

# other module hyperparameters
kernel_size = 2
dilation = 1
padding = 2
stride = 1
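followed by the same comparison as in the original snippet:

fold = FoldNd(output_size, kernel_size, dilation=dilation, padding=padding, stride=stride)
torch_outputs = torch.nn.functional.fold(
    inputs, output_size, kernel_size, dilation=dilation, padding=padding, stride=stride
)
assert torch.allclose(torch_outputs, fold(inputs))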

and both outputs match.

Thanks for the quick and efficient fix!

Cheers,

Arthur

f-dangel commented 7 months ago

No worries, thanks for the clean report! Gonna create a new release today.