NJUVISION / NIC

End-to-End Learnt Image Compression Codec

Problem encountered during testing, requesting help QAQ #4

Open eecoder-dyf opened 2 years ago

eecoder-dyf commented 2 years ago

When testing with your code, I ran into the following problem (test command: python -W ignore inference.py -i example.png -o 1.bin -m_dir ./ckpts -m 3 --encode):

Traceback (most recent call last):
  File "inference.py", line 363, in <module>
    encode(args.input, args.output, args.model_dir, args.model, args.block_width, args.block_height)
  File "/home/dyf/anaconda3/envs/c2f/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "inference.py", line 89, in encode
    xp3, params_prob = context(y_main_q, hyper_dec)
  File "/home/dyf/anaconda3/envs/c2f/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/d/dev/NIC/code/Model/context_model.py", line 112, in forward
    p3 = self.gaussin_entropy_func(torch.squeeze(x, dim=1), output)
  File "/home/dyf/anaconda3/envs/c2f/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/d/dev/NIC/code/Model/gaussian_entropy_model.py", line 98, in forward
    m0 = torch.distributions.normal.Normal(mean0, scale0)
  File "/home/dyf/anaconda3/envs/c2f/lib/python3.8/site-packages/torch/distributions/normal.py", line 50, in __init__
    super(Normal, self).__init__(batch_shape, validate_args=validate_args)
  File "/home/dyf/anaconda3/envs/c2f/lib/python3.8/site-packages/torch/distributions/distribution.py", line 56, in __init__
    raise ValueError(
ValueError: Expected parameter scale (Tensor of shape (1, 192, 32, 48)) of distribution Normal(loc: torch.Size([1, 192, 32, 48]), scale: torch.Size([1, 192, 32, 48])) to satisfy the constraint GreaterThan(lower_bound=0.0), but found invalid values:

Analyzing the error, I found the following in gaussian_entropy_model:

class Distribution_for_entropy2(nn.Module):
    def __init__(self):
        super(Distribution_for_entropy2, self).__init__()

    def forward(self, x, p_dec):
        # you can use use 3 gaussian
        prob0, mean0, scale0, prob1, mean1, scale1, prob2, mean2, scale2 = [
            torch.chunk(p_dec, 9, dim=1)[i].squeeze(1) for i in range(9)]
        # keep the weight  summation of prob == 1
        probs = torch.stack([prob0, prob1, prob2], dim=-1)
        probs = f.softmax(probs, dim=-1)
        # process the scale value to non-zero
        scale0[scale0 == 0] = 1e-6
        scale1[scale1 == 0] = 1e-6
        scale2[scale2 == 0] = 1e-6
        # 3 gaussian distribution

        m0 = torch.distributions.normal.Normal(mean0, scale0)
        m1 = torch.distributions.normal.Normal(mean1, scale1)
        m2 = torch.distributions.normal.Normal(mean2, scale2)

        likelihood0 = torch.abs(m0.cdf(x + 0.5)-m0.cdf(x-0.5))
        likelihood1 = torch.abs(m1.cdf(x + 0.5)-m1.cdf(x-0.5))
        likelihood2 = torch.abs(m2.cdf(x + 0.5)-m2.cdf(x-0.5))

        likelihoods = Low_bound.apply(
            probs[:, :, :, :, 0]*likelihood0+probs[:, :, :, :, 1]*likelihood1+probs[:, :, :, :, 2]*likelihood2)

        return likelihoods

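Here scale0, scale1, and scale2 contain values less than 0. As is well known, a variance (and therefore the standard deviation passed as scale) cannot be negative. Tracing the input p_dec, I found that it corresponds to output in context_model.py: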

class Weighted_Gaussian(nn.Module):
    def __init__(self, M):
        super(Weighted_Gaussian, self).__init__()
        self.conv1 = MaskConv3d('A', 1, 24, 11, 1, 5)
        self.conv2 = nn.Sequential(nn.Conv3d(25, 48, 1, 1, 0), nn.ReLU(), nn.Conv3d(48, 96, 1, 1, 0), nn.ReLU(),
                                   nn.Conv3d(96, 9, 1, 1, 0))
        self.conv3 = nn.Conv2d(M*2, M, 3, 1, 1)

        self.gaussin_entropy_func = Distribution_for_entropy2()

    def forward(self, x, hyper):
        x = torch.unsqueeze(x, dim=1)
        hyper = torch.unsqueeze(self.conv3(hyper), dim=1)
        x1 = self.conv1(x)
        output = self.conv2(torch.cat((x1, hyper), dim=1))
        p3 = self.gaussin_entropy_func(torch.squeeze(x, dim=1), output)
        return p3, output

How can I get the test to run?
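The analysis above can be checked on a toy example. The sketch below uses plain Python lists in place of tensors and made-up scale values; `patch_zero_only` and `satisfies_positive_constraint` are hypothetical helper names, not functions from the NIC code or from PyTorch. It only illustrates why a patch that replaces exact zeros still leaves values that fail the `GreaterThan(lower_bound=0.0)` constraint named in the error message.

```python
def patch_zero_only(scales, eps=1e-6):
    """Mimic `scale[scale == 0] = 1e-6`: only exact zeros are replaced."""
    return [eps if s == 0 else s for s in scales]


def satisfies_positive_constraint(scales):
    """Stand-in for the GreaterThan(lower_bound=0.0) check (assumption: every value must be > 0)."""
    return all(s > 0 for s in scales)


# Made-up raw outputs of the last convolution: nothing forces them to be positive.
raw_scales = [0.7, 0.0, -0.3]

patched = patch_zero_only(raw_scales)
still_valid = satisfies_positive_constraint(patched)

print(patched)      # the zero is replaced, the negative value is untouched
print(still_valid)  # False: the negative entry still violates the constraint
```

Because the last layer of `conv2` is a plain `nn.Conv3d` with no activation after it, its output can take either sign, which is consistent with the negative scale values reported here.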

tongxyh commented 2 years ago

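Hello, newer versions of PyTorch indeed do not accept scale values less than 0. I will take care of this! The fix is to take the absolute value. I hope this solves your problem.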

# scale2[scale2 == 0] = 1e-6
Low_bound.apply(torch.abs(scale0), 1e-6)

KippQin commented 2 years ago

Hello, if the absolute value is taken during training to work around the issue above, will it affect the network model?

scale2[scale2 == 0] = 1e-6

Low_bound.apply(torch.abs(scale0), 1e-6)
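One way to reason about this question, as a sketch rather than a statement about the trained models: the `forward` shown earlier already wraps each CDF difference in `torch.abs`, and the Gaussian CDF in closed form is 0.5 * (1 + erf((x - mean) / (scale * sqrt(2)))). Since erf is an odd function, a negative scale only flips the sign of the CDF difference, which the `abs` then removes. The stdlib example below checks this numerically; `normal_cdf`, `bin_likelihood`, and `fixed_scale` are hypothetical names, the input values are made up, and `max(abs(scale), floor)` is used as a stand-in for the suggested `Low_bound.apply(torch.abs(scale0), 1e-6)`.

```python
import math


def normal_cdf(x, mean, scale):
    """Closed-form Gaussian CDF; the formula itself also evaluates for a negative scale."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))


def bin_likelihood(x, mean, scale):
    """Mass of [x - 0.5, x + 0.5], wrapped in abs like the likelihood lines in forward()."""
    return abs(normal_cdf(x + 0.5, mean, scale) - normal_cdf(x - 0.5, mean, scale))


def fixed_scale(scale, floor=1e-6):
    """Stand-in for the suggested fix: absolute value, then a small lower bound."""
    return max(abs(scale), floor)


# Made-up values: a quantized symbol, a mean, and a negative raw scale.
x, mean, raw_scale = 2.0, 1.3, -0.8

old = bin_likelihood(x, mean, raw_scale)               # negative scale, abs on the difference
new = bin_likelihood(x, mean, fixed_scale(raw_scale))  # scale made positive first

print(math.isclose(old, new, rel_tol=1e-9))
```

Under these assumptions the likelihood formula gives the same value either way, except where the magnitude of the scale is below the 1e-6 floor. This covers only the formula in `forward`; it says nothing about the separate encode/decode path in inference.py, which is not shown in this thread.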

sxd0071 commented 2 years ago

Hello, if the absolute value is taken during training to work around the issue above, will it affect the network model?

scale2[scale2 == 0] = 1e-6

Low_bound.apply(torch.abs(scale0), 1e-6)

I found that decoding does not work correctly with this change, but I am still looking into the cause.

tzayuan commented 1 year ago

@sxd0071, did you manage to resolve the decoding error?

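I currently have two options: 1. install the Docker image; 2. modify the scale-related code. With option 2, I am worried that decoding will run into problems. Do you have any new suggestions? Many thanks.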