About norm operation of input

GuHuangAI / DiffusionEdge

Code for AAAI 2024 paper: "DiffusionEdge: Diffusion Probabilistic Model for Crisp Edge Detection"

Apache License 2.0

About norm operation of input #8

Closed by iris0329 5 months ago

iris0329 commented 5 months ago

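Hi, sorry to bother you again.

I see that the VAE input is always normalized to [-1, 1], but the output is not constrained to that range, because the last layer of the decoder is torch.nn.Conv2d. One of the terms in the loss is: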

torch.abs(inputs.contiguous() - reconstructions.contiguous()) + \
          F.mse_loss(inputs, reconstructions, reduction="none")

Won't this cause the corresponding pixel values to mismatch even when the prediction is correct?

Also, why is the input normalized to [-1, 1]?

GuHuangAI commented 5 months ago

Following previous work, we normalize the input to [-1, 1], which helps model training. I'm sorry, but I'm a little confused by the question about the loss function. In practice, the loss does not affect the training.
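For readers unfamiliar with the convention, here is a minimal sketch of mapping pixels to [-1, 1] and back. It assumes 8-bit images with values in [0, 255]; the function names are hypothetical and the repository's actual preprocessing may differ.

```python
def normalize(pixel: int) -> float:
    """Map an 8-bit pixel value in [0, 255] to [-1.0, 1.0]."""
    return pixel / 127.5 - 1.0


def denormalize(value: float) -> int:
    """Map a value back to an 8-bit pixel, clamping anything outside [-1.0, 1.0]."""
    clamped = max(-1.0, min(1.0, value))
    return round((clamped + 1.0) * 127.5)


if __name__ == "__main__":
    for pixel in (0, 128, 255):
        print(pixel, "->", normalize(pixel))
```

The clamp in `denormalize` is one way to handle an unbounded decoder output at display time without changing the network itself.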

iris0329 commented 5 months ago

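Thank you for your reply. I see that LDM itself also normalizes the input to [-1, 1].

What I meant about the loss is this: because the last layer of the decoder is torch.nn.Conv2d, the range of the reconstructed image is not [-1, 1]; it could be, say, [2, 6]. When the loss is computed, suppose the input pixel (x1, y1) has the value 0.5 while the reconstructed pixel (x1, y1) has the value 2. After rescaling, 2 might still map to 0.5, meaning the prediction is actually correct, yet the torch.abs loss still yields a positive value of 1.5. That penalizes the network in the backward pass and changes the model weights, and such a change seems unjustified. So should something be added after torch.nn.Conv2d to normalize the reconstructed image into the range [-1, 1]?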
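The arithmetic in this example can be checked with a small pure-Python sketch of the per-pixel loss quoted earlier (absolute error plus squared error). The function name is hypothetical. Note that the quoted loss subtracts the raw values directly, with no rescaling of the decoder output in between.

```python
def reconstruction_loss(target: float, prediction: float) -> float:
    """Per-pixel L1 + squared-error loss, mirroring the quoted expression."""
    l1 = abs(target - prediction)
    l2 = (target - prediction) ** 2
    return l1 + l2


if __name__ == "__main__":
    # Input pixel 0.5, decoder output 2.0: L1 term 1.5, squared term 2.25.
    print(reconstruction_loss(0.5, 2.0))
    # The loss is zero only when the raw output equals the input.
    print(reconstruction_loss(0.5, 0.5))
```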

GuHuangAI commented 5 months ago

The loss function pushes the reconstructed image toward the input, so we do not need to add a sigmoid function at the end.
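A toy illustration of this point, under strong simplifying assumptions: a single unbounded output value is treated as a free parameter and updated by plain gradient descent on the same L1 + squared-error loss. The names, learning rate, and step count are hypothetical and unrelated to the repository's training setup.

```python
def loss_gradient(target: float, prediction: float) -> float:
    """Gradient of |p - t| + (p - t)^2 with respect to the prediction p."""
    diff = prediction - target
    sign = (diff > 0) - (diff < 0)  # subgradient of the absolute value
    return sign + 2.0 * diff


def fit_pixel(target: float, start: float, lr: float = 0.01, steps: int = 500) -> float:
    """Run gradient descent on one unbounded output value."""
    prediction = start
    for _ in range(steps):
        prediction -= lr * loss_gradient(target, prediction)
    return prediction


if __name__ == "__main__":
    # Start well outside [-1, 1]; the loss pulls the output toward the target.
    print(fit_pixel(target=0.5, start=2.0))
```

With a fixed step size the L1 term makes the value hover within roughly `lr` of the target rather than land on it exactly, which is why the sketch only claims approximate convergence.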

iris0329 commented 5 months ago

Ah, I was confused about the setting. Sorry for the weird question, and thanks for your reply.

GuHuangAI commented 5 months ago

I understand your question. You may think that the range of the model output is not the same as that of the input, but that is fine for training. Hope this helps.