MegEngine / MegDiffusion

MegEngine implementation of Diffusion Models.
Apache License 2.0

About padding in Downsample #7

Closed. ChaiByte closed this issue 2 years ago

ChaiByte commented 2 years ago

I'm willing to upload my conversion code, but it doesn't work well after converting: the error between the MegEngine and PyTorch implementations is high for the same input. This is because the convolution padding in Downsample differs; the PyTorch implementation uses asymmetric padding. After I modified the MegEngine implementation, the result:

import megengine.functional as F
import megengine.module as M
import megengine.module.init as init


class DownSample(M.Module):
    """A downsampling layer with an optional convolution.

    Args:
        in_ch: channels in the inputs and outputs.
        with_conv: if ``True``, apply a stride-2 convolution to do downsampling;
            otherwise use average pooling.
    """

    def __init__(self, in_ch, with_conv=True):
        super().__init__()
        self.with_conv = with_conv
        if with_conv:
            # No built-in padding here: forward() pads asymmetrically instead.
            self.main = M.Conv2d(in_ch, in_ch, 3, stride=2)
        else:
            self.main = M.AvgPool2d(2, stride=2)

    def _initialize(self):
        for module in self.modules():
            if isinstance(module, M.Conv2d):
                init.xavier_uniform_(module.weight)
                init.zeros_(module.bias)

    def forward(self, x, temb):  # unused temb param is accepted just for convenience
        if self.with_conv:
            # Pad only the bottom and right edges of the two spatial dims;
            # leading (batch, channel) dims are left unpadded.
            pad_width = [(0, 0)] * (x.ndim - 2) + [(0, 1), (0, 1)]
            x = F.nn.pad(x, pad_width)
        return self.main(x)
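
To illustrate why the padding choice changes the numbers and not just the shape, here is a minimal pure-Python sketch in one dimension. The helper `conv1d_windows` is hypothetical (it is not part of MegEngine or PyTorch), and the symmetric case assumes a layer built with `padding=1`, which the thread does not show. It lists which input indices each output position reads for a kernel of 3 and a stride of 2.

```python
def conv1d_windows(n, kernel, stride, pad_before, pad_after):
    """Return, for each output position, the input indices its window covers.

    Indices outside [0, n) fall into the zero padding.
    """
    padded = n + pad_before + pad_after
    n_out = (padded - kernel) // stride + 1
    windows = []
    for o in range(n_out):
        start = o * stride - pad_before  # shift back into input coordinates
        windows.append(list(range(start, start + kernel)))
    return windows


# Symmetric padding (1, 1) versus the asymmetric (0, 1) used in forward() above.
symmetric = conv1d_windows(8, kernel=3, stride=2, pad_before=1, pad_after=1)
asymmetric = conv1d_windows(8, kernel=3, stride=2, pad_before=0, pad_after=1)

print(symmetric)   # [[-1, 0, 1], [1, 2, 3], [3, 4, 5], [5, 6, 7]]
print(asymmetric)  # [[0, 1, 2], [2, 3, 4], [4, 5, 6], [6, 7, 8]]
```

Both variants produce four outputs, but every window is shifted by one element, so the same weights are applied to different pixels. That would be consistent with a large error after weight conversion even though the output shapes agree.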

[image]

Btw, I'm also a beginner in DDPM; your blog helps me a lot!

Originally posted by @Asthestarsfalll in https://github.com/MegEngine/MegDiffusion/issues/5#issuecomment-1193254961

ChaiByte commented 2 years ago

@Asthestarsfalll pesser's repo shows how to do the conversion from TF to PyTorch in convert.py, but its inference steps are not verified. Actually, this model is the source code I referred to, and asymmetric padding is not used there. I also cannot find any asymmetric padding logic in Ho's implementation (TensorFlow).

It's so confusing. Can you understand the author's reason for doing this?

ChaiByte commented 2 years ago

You are right. In TensorFlow's design, Conv2D's padding behavior is different. I will fix it soon.
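
For reference, the difference can be sketched with TensorFlow's documented rule for `padding="SAME"`, under the assumption that the TensorFlow downsampling convolution uses `"SAME"` (the thread does not quote that code). The helper name `same_padding` is hypothetical. It computes the padding added before and after one spatial dimension.

```python
import math


def same_padding(n, kernel, stride):
    """Padding (before, after) along one dimension under TensorFlow's SAME rule."""
    out = math.ceil(n / stride)  # SAME fixes the output size first
    total = max((out - 1) * stride + kernel - n, 0)
    before = total // 2          # any odd remainder goes to the end
    after = total - before
    return before, after


print(same_padding(32, kernel=3, stride=2))  # (0, 1)
print(same_padding(32, kernel=3, stride=1))  # (1, 1)
```

With stride 1 the padding is symmetric, so no explicit logic is needed. With stride 2 on an even-sized input only one padded element is required, and it is placed at the end, which matches the `(0, 1)` padding in the modified `DownSample` without any asymmetric padding code appearing in the TensorFlow source.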

ChaiByte commented 2 years ago

Fixed in ddpm model.