How can I use the Encoding Layer directly? Installing it with pip fails for me

RSMung commented 3 years ago

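Hello, my environment is Python 3.7, PyTorch 1.7, torchvision 0.8.1. Running pip install git+https://github.com/zhanghang1989/PyTorch-Encoding/ produces a large number of errors that I cannot resolve, so I am asking for help. I tried copying the Encoding Layer code into my own project, but the code under encoding/lib/ cannot be imported. What should I do?

In addition, I implemented the Encoding module with PyTorch tensor operations, but the values blow up (NaN) while computing eik. Do you know of a way to fix this? My code is below.

Any guidance would be much appreciated. Thank you very much.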

class CodeBookBlock(nn.Module):
    def __init__(self, in_channels, c2, out_channels):
        super(CodeBookBlock, self).__init__()
        self.c2 = c2
        self.conv1 = nn.Sequential(
            nn.Conv2d(in_channels, c2, kernel_size=1),
            nn.BatchNorm2d(c2),
            nn.LeakyReLU()
        )
        self.codebook = nn.Parameter(torch.Tensor(c2, Config.K), requires_grad=True)
        self.scale = nn.Parameter(torch.Tensor(Config.K), requires_grad=True)
        self.dp = nn.Dropout(0.5)  # BatchNorm2d cannot be used here, otherwise every value after fc becomes NaN
        self.relu = nn.ReLU6()
        self.leakyRelu = nn.LeakyReLU()
        self.fc = nn.Linear(self.c2, out_channels)
        self.sigmoid = nn.Sigmoid()
        self.init_params()  # initialize the parameters
        torch.autograd.set_detect_anomaly(True)

    def init_params(self):
        std1 = 1. / ((Config.K * self.c2) ** (1 / 2))
        self.codebook.data.uniform_(-std1, std1)
        self.scale.data.uniform_(-1, 0)

    def forward(self, z):
        """
        :param z: (Batch, c, h, w)
        :return: (Batch, out_channels)
        """
        batch, c, h, w = z.shape
        N = h * w
        z1 = self.conv1(z)
        z1 = z1.flatten(start_dim=2, end_dim=-1)  # Batch, c2, N
        # print("this is z1")
        # print(z1.shape)
        # print(z1)
        # -------------- compute the scaling factor gama --------------
        # --- prepare the feature vector z1
        z1 = z1.unsqueeze(2)  # Batch, c2, 1, N
        z1 = z1.repeat(1, 1, Config.K, 1)  # Batch, c2, K, h*w
        # swap the K and N (i.e. h*w) dimensions of z1
        z1 = z1.transpose(2, 3)  # Batch, c2, N, K
        # print("z1")
        # print(z1)
        # --- prepare the codebook
        d = self.codebook.unsqueeze(1)  # c2, 1, K
        d = d.repeat(1, N, 1)  # c2, N, K
        d = d.unsqueeze(0)  # 1, c2, N, K
        d = d.repeat(batch, 1, 1, 1)  # batch, c2, N, K
        # print("d")
        # print(d.shape)
        # --- compute rik
        rik = z1 - d  # batch, c2, N, K
        # --- compute the numerator
        rik = torch.pow(torch.abs(rik), 2)  # absolute value of rik, squared   batch, c2, N, K
        # print(rik.shape)
        # expand scale from (1, K) to (batch, c2, N, K)
        scale = self.scale.repeat(N, 1)  # N, K
        scale = scale.unsqueeze(0).unsqueeze(0)  # 1, 1, N, K
        scale = scale.repeat(batch, self.c2, 1, 1)  # batch, c2, N, K
        # print(scale.shape)
        # obtain the numerator
        # print("this is -scale * rik")
        # print(torch.max(-scale * rik))
        # Using exp here makes the numerator very large, which then produces NaN in later variables; without it, Rei may be 0, which breaks the division below. So this is replaced with adding a constant or with leakyRelu.
        numerator = self.leakyRelu(-scale * rik)  # batch, c2, N, K
        # print("this is numerator")
        # print(torch.max(numerator))
        Rei = numerator.sum(3)  # denominator in the eik formula   batch, c2, N
        # print(Rei.shape)
        # --- compute eik; this must come after Rei has been computed
        numerator = numerator * rik  # batch, c2, N, K
        # expand Rei from (batch, c2, N) to (batch, c2, K, N)
        Rei = Rei.unsqueeze(2)  # batch, c2, 1, N
        # print(Rei.shape)
        Rei = Rei.repeat(1, 1, Config.K, 1)  # batch, c2, K, N
        # print(Rei.shape)
        # swap the K and N dimensions of Rei
        Rei = Rei.transpose(2, 3)  # batch, c2, N, K
        # print("this is Rei")
        # print(Rei.shape)
        # # print(Rei)
        # print("minimum of Rei")
        # print(torch.min(Rei))
        # obtain eik
        # print("this is Rei")
        # print(torch.min(Rei))
        eik = numerator / Rei  # batch, c2, N, K
        # print("eik")
        # print(torch.max(eik))
        # obtain ek
        ek = eik.sum(2)  # batch, c2, K
        # print("ek")
        # print(ek.shape)
        # print(ek)
        # print(ek.shape)
        # obtain e
        e = ek.sum(2)  # batch, c2
        e = self.dp(e)
        e = self.relu(e)
        e = self.fc(e)
        # print("this is e")
        # print(torch.max(e))
        gama = self.sigmoid(e)
        # print("this is gama")
        # print(torch.max(gama))
        return gama  # batch, out_channels

zhanghang1989 commented 3 years ago

If you only need the Encoding Layer, there is no need to install this toolkit. There is a pure Python version here: https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/ops/encoding.py#L6
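
For readers hitting the same NaN: in the code posted above, scale is initialised in (-1, 0) and the logits are -scale * rik, which is non-negative, so exp can overflow for large residuals. The block below is a minimal sketch of a pure-PyTorch Encoding layer that avoids this by feeding scale * distance into F.softmax, which is computed in a numerically stable way. It is written from the general description of the layer and is not a copy of the linked file; the class name EncodingSketch and the argument names are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class EncodingSketch(nn.Module):
    """Hypothetical minimal Encoding layer: (B, C, H, W) -> (B, K, C)."""

    def __init__(self, channels, num_codes):
        super().__init__()
        self.channels, self.num_codes = channels, num_codes
        std = 1.0 / (num_codes * channels) ** 0.5
        # K codewords of dimension C, and one smoothing factor per codeword
        self.codewords = nn.Parameter(
            torch.empty(num_codes, channels).uniform_(-std, std))
        self.scale = nn.Parameter(torch.empty(num_codes).uniform_(-1, 0))

    def forward(self, x):
        # Flatten the spatial dimensions: (B, C, H, W) -> (B, N, C)
        x = x.flatten(2).transpose(1, 2)
        # Residuals via broadcasting, no repeat(): (B, N, 1, C) - (K, C) -> (B, N, K, C)
        resid = x.unsqueeze(2) - self.codewords
        # Squared L2 distance to each codeword: (B, N, K)
        dist = resid.pow(2).sum(dim=3)
        # Soft assignment over the K codewords; softmax replaces exp / sum(exp)
        weights = F.softmax(self.scale * dist, dim=2)
        # Aggregate weighted residuals over the N positions: (B, K, C)
        return (weights.unsqueeze(3) * resid).sum(dim=1)
```

Broadcasting also replaces the chains of unsqueeze and repeat in the original post, so no expanded copies of the codebook or scale are materialised.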

RSMung commented 3 years ago

Thank you for your reply, I really appreciate it!

RSMung commented 3 years ago

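Hello, I used the source code from the link you gave. I would like to check whether my understanding is correct: the output of this Encoding Layer is a (batch, channel) matrix, and I would like to ask whether the operations below are correct. I also noticed that your code has a norm_layer that uses SyncBatchNorm. I am training on a single GPU; can I replace it with BatchNorm?

Definition of fc in __init__: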
self.fc = nn.Sequential(
            nn.Linear(channels, channels),
            nn.Sigmoid()
        )

Operations on the Encoding Layer output (encoded_feat) in forward:

        # ReLU, then average:   bt, K, c ---> bt, c
        encoded_feat = F.relu(encoded_feat)
        encoded_feat = encoded_feat.mean(1)  # bt, c
        # fc
        gamma = self.fc(encoded_feat)
        gamma = gamma.view(batch_size, self.channels, 1, 1)  # bt, c, 1, 1
        # channel-wise multiplication
        output = F.relu(x + x * gamma)  # bt, c, h, w

zhanghang1989 commented 3 years ago

On a single GPU you can use ordinary BatchNorm.
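
A minimal sketch of the gating step from the question with a plain nn.BatchNorm1d swapped in for SyncBatchNorm. It assumes the encoded features have shape (bt, K, c) and that the norm layer sits between the Encoding output and the ReLU; both are assumptions about the surrounding code, and the class name ChannelGateSketch is hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelGateSketch(nn.Module):
    """Hypothetical gating head: scales x channel-wise using encoded features."""

    def __init__(self, channels, num_codes):
        super().__init__()
        self.channels = channels
        # Plain BatchNorm1d instead of SyncBatchNorm (single-GPU training).
        # For a (B, K, C) input it keeps one set of statistics per codeword.
        self.norm = nn.BatchNorm1d(num_codes)
        self.fc = nn.Sequential(nn.Linear(channels, channels), nn.Sigmoid())

    def forward(self, x, encoded_feat):
        # x: (B, C, H, W), encoded_feat: (B, K, C)
        batch_size = x.size(0)
        encoded_feat = F.relu(self.norm(encoded_feat))
        # Average over the K codewords: (B, K, C) -> (B, C)
        gamma = self.fc(encoded_feat.mean(dim=1))
        gamma = gamma.view(batch_size, self.channels, 1, 1)  # (B, C, 1, 1)
        # Channel-wise re-weighting with a residual connection
        return F.relu(x + x * gamma)
```

The two normalization layers differ only in whether batch statistics are synchronized across devices, so on one GPU nothing else in the module needs to change.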

RSMung commented 3 years ago

Understood, thank you, Professor.