Closed iris0329 closed 5 months ago
We follow previous work and normalize the input to [-1, 1], which helps model training. I'm sorry, but I'm a little confused by the question about the loss function. In practice, the loss as it stands does not cause a problem for training.
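For readers following along, here is a minimal sketch of what "normalize the input to [-1, 1]" usually means. It assumes 8-bit images with pixel values in [0, 255]; the helper names `normalize` and `denormalize` are made up for illustration and are not taken from this repository.

```python
def normalize(pixel: int) -> float:
    """Map an 8-bit pixel value in [0, 255] to [-1, 1] (assumed convention)."""
    return pixel / 127.5 - 1.0


def denormalize(value: float) -> int:
    """Map a value back to [0, 255], clamping anything outside [-1, 1] first."""
    clamped = max(-1.0, min(1.0, value))
    return round((clamped + 1.0) * 127.5)


pixels = [0, 64, 128, 255]
normalized = [normalize(p) for p in pixels]      # values in [-1, 1]
restored = [denormalize(v) for v in normalized]  # back to 8-bit values
print(normalized)
print(restored)
```

The clamp in `denormalize` is only applied when converting back for display, so an unbounded decoder output is not a problem at that stage either.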
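Thanks for your reply. I see that LDM itself also normalizes the input to [-1, 1].
What I want to say about the loss is this: because the last layer of the decoder is torch.nn.Conv2d, the range of the reconstructed image is not [-1, 1]; it could be, say, [2, 6]. When computing the loss, suppose the input pixel (x1, y1) has the value 0.5, but the reconstructed image's pixel (x1, y1) has the value 2. After rescaling, 2 might still map to 0.5, which means the prediction is actually correct, yet the torch.abs loss still yields a positive value of 1.5. This penalizes the network during the backward pass and changes the model weights, but that change seems unjustified. So should something be added after torch.nn.Conv2d to normalize the reconstructed image so that it lies in [-1, 1]?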
The loss function pushes the reconstructed image toward the input, so we do not need to add a sigmoid function at the end.
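A toy sketch may make this concrete. It is not the repository's training code: the free per-pixel values below stand in for an unbounded Conv2d output, and `l1_loss`, `sign`, and `train` are hypothetical helpers written in plain Python. The point is that the absolute-difference loss compares the raw output with the input directly, with no rescaling step in between, so the loss is zero only when the output already sits at the input's values.

```python
def l1_loss(recon, target):
    """Mean absolute difference, compared directly with no rescaling."""
    return sum(abs(r - t) for r, t in zip(recon, target)) / len(target)


def sign(x):
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def train(outputs, target, lr=0.03, steps=1000):
    """Subgradient descent on the L1 loss over unbounded per-pixel outputs."""
    outputs = list(outputs)
    n = len(target)
    for _ in range(steps):
        # d/dr of mean |r - t| is sign(r - t) / n for each pixel
        grads = [sign(r - t) / n for r, t in zip(outputs, target)]
        outputs = [r - lr * g for r, g in zip(outputs, grads)]
    return outputs


target = [0.5, -0.25, 1.0]   # input pixels, already in [-1, 1]
initial = [2.0, 4.0, 6.0]    # unbounded outputs starting in [2, 6]

print(l1_loss(initial, target))   # large at the start
final = train(initial, target)
print(final, l1_loss(final, target))
```

Under these assumptions the outputs drift from [2, 6] into the target's range on their own, because an output of 2 for an input of 0.5 is simply counted as an error of 1.5 rather than as a correct prediction in a different scale.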
Ah, I was confused about the setting. Sorry for the weird question, and thanks for your reply.
I understand your question. You may think it is a problem that the range of the model output is not the same as that of the input, but this is fine for training. Hope this helps.
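Hi, sorry to bother you again,
I see that the VAE input is always normalized to [-1, 1], but the output range is not, because the last layer of the decoder is torch.nn.Conv2d; and when computing the loss, one of the losses used is
Won't this mean that the corresponding pixel values fail to match even when the prediction is correct?
Also, may I ask why the input is normalized to [-1, 1]?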