GuHuangAI / DiffusionEdge

Code for AAAI 2024 paper: "DiffusionEdge: Diffusion Probabilistic Model for Crisp Edge Detection"
Apache License 2.0

About norm operation of input #8

Closed: iris0329 closed this issue 5 months ago

iris0329 commented 5 months ago

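Hi, sorry to bother you again.

As far as I can see, the VAE input is always normalized to [-1, 1], but the output is not constrained to that range, because the last layer of the decoder is torch.nn.Conv2d. When the loss is computed, one of the terms used is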

torch.abs(inputs.contiguous() - reconstructions.contiguous()) + \
          F.mse_loss(inputs, reconstructions, reduction="none")

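Wouldn't this mean that the values of corresponding pixels fail to match even when the prediction is correct?

Also, may I ask why the input is normalized to [-1, 1]?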

GuHuangAI commented 5 months ago

Following previous work, we normalize the input to [-1, 1], which helps model training. I'm sorry, but I'm a little confused by the question about the loss function. In practice, the loss does not cause a problem for training.
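For readers unfamiliar with this convention, here is a minimal sketch of the scaling, assuming 8-bit pixel values in [0, 255]. The helper names `to_unit_range` and `from_unit_range` are hypothetical and are not taken from the DiffusionEdge code, which may implement the normalization differently.

```python
def to_unit_range(pixel: int) -> float:
    """Map an 8-bit pixel value in [0, 255] to the range [-1, 1]."""
    return pixel / 127.5 - 1.0


def from_unit_range(value: float) -> int:
    """Invert the mapping, clamping to the valid 8-bit range."""
    pixel = round((value + 1.0) * 127.5)
    return max(0, min(255, pixel))


# The endpoints of the 8-bit range land on the endpoints of [-1, 1].
low, high = to_unit_range(0), to_unit_range(255)
print(low, high)
```

Centering the data around zero in this way is a common choice for image autoencoders; the clamp in the inverse is only there because an unbounded decoder output can overshoot slightly.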

iris0329 commented 5 months ago

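Thanks for your reply. I see that LDM itself also normalizes the input to [-1, 1].

What I meant about the loss is this: because the last layer of the decoder is torch.nn.Conv2d, the range of the reconstructed image is not [-1, 1]; it might be [2, 6]. When the loss is computed, suppose the input pixel (x1, y1) has the value 0.5, while pixel (x1, y1) of the reconstructed image has the value 2. After rescaling, that 2 might still map to 0.5, which would mean the prediction is correct, yet the torch.abs loss still produces a positive value of 1.5. This penalizes the network during the backward pass and changes the model weights, but such a change seems unjustified. So should something be added after torch.nn.Conv2d to normalize the reconstructed image so that it falls within [-1, 1]?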
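The numbers in this example can be worked through directly. The sketch below is a scalar, pure-Python version of the per-pixel term from the loss snippet quoted earlier in the thread (absolute error plus squared error); `pixel_loss` and `pixel_loss_grad` are hypothetical names introduced only for illustration. It assumes, as the quoted snippet suggests, that the raw decoder output is compared with the input without any rescaling step in between.

```python
def pixel_loss(target: float, recon: float) -> float:
    """Per-pixel absolute error plus squared error."""
    diff = recon - target
    return abs(diff) + diff * diff


def pixel_loss_grad(target: float, recon: float) -> float:
    """Derivative of pixel_loss with respect to the reconstructed value."""
    diff = recon - target
    sign = (diff > 0) - (diff < 0)
    return sign + 2.0 * diff


# Input pixel 0.5, reconstructed pixel 2.0, as in the example above.
loss = pixel_loss(0.5, 2.0)       # 1.5 (absolute) + 2.25 (squared)
grad = pixel_loss_grad(0.5, 2.0)  # 1 (sign) + 3 (squared-error slope)
print(loss, grad)
```

Under that assumption the positive gradient pushes the raw output down from 2 toward 0.5, so an output of 2 is treated as an error of 1.5 and not as a rescaled version of the right answer.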

GuHuangAI commented 5 months ago

The loss function pushes the reconstructed image toward the input, so we do not need to add a sigmoid function at the end.
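A toy illustration of this point, under strong simplifying assumptions: each output pixel is treated as a free, unbounded parameter (standing in for the output of a final convolution with no activation) and is updated by plain gradient descent on the absolute-plus-squared error. The function `fit_outputs`, the learning rate, and the starting values in [2, 6] are all made up for the sketch and do not come from the repository.

```python
def sign(x: float) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def fit_outputs(targets, outputs, lr=0.01, steps=500):
    """Gradient descent on |y - t| + (y - t)^2 for each unbounded output y."""
    outputs = list(outputs)
    for _ in range(steps):
        outputs = [
            y - lr * (sign(y - t) + 2.0 * (y - t))
            for t, y in zip(targets, outputs)
        ]
    return outputs


targets = [-1.0, -0.25, 0.5, 1.0]  # inputs normalized to [-1, 1]
initial = [2.0, 3.5, 5.0, 6.0]     # unbounded outputs that start far off
fitted = fit_outputs(targets, initial)
print([round(y, 2) for y in fitted])
```

Nothing in the sketch clamps the outputs, yet they settle close to the targets and therefore close to [-1, 1]; the range is learned from the loss and is not enforced by the architecture.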

iris0329 commented 5 months ago

Ah, I was confused about the setting. Sorry for the weird question, and thanks for your reply.

GuHuangAI commented 5 months ago

I understand your question. You may be concerned that the range of the model output is not the same as that of the input, but this is fine for training. Hope this helps.