About the MSE loss - Githubissues

YiyanXu / DiffRec

Diffusion Recommender Model

About the MSE loss #20

Closed akajinchen closed 2 months ago

akajinchen commented 2 months ago

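https://github.com/YiyanXu/DiffRec/blob/d605bc9178f338f2f16a084367d859d72ff0608d/L-DiffRec/models/gaussian_diffusion.py#L140 For the MSE loss at this line, are gradients also propagated back into target? The goal should be to pull output toward target, so if gradients flow into target as well, would that affect training?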

YiyanXu commented 2 months ago

target does not contain any learnable parameters, so this has no effect.

akajinchen commented 2 months ago

Isn't the target here latent_recon?

YiyanXu commented 2 months ago

target = {
    ModelMeanType.START_X: x_start,
    ModelMeanType.EPSILON: noise,
}[self.mean_type]

akajinchen commented 2 months ago

Thanks for your reply. However, I noticed that in main, x_start is batch_latent. Wouldn't that mean that, when the MSE loss is backpropagated, the parameters behind target get updated as well? If I have misunderstood something, I would really appreciate it if you could point it out, because this has confused me for a long time. Thanks.

batch_cate, batch_latent, vae_kl = Autoencoder.Encode(batch)
terms = diffusion.training_losses(model, batch_latent, args.reweight)
elbo = terms["loss"].mean()  # loss from diffusion
batch_latent_recon = terms["pred_xstart"]
batch_recon = Autoencoder.Decode(batch_latent_recon)
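The concern can be checked on a toy example. The sketch below is not taken from the DiffRec code: `output` and `target` are hypothetical stand-in tensors, and the loss is a plain mean of squared differences. It only illustrates that autograd sends a gradient to the target whenever the target requires grad, and that calling `detach()` on the target stops it.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-ins: `output` plays the role of model_output,
# `target` plays the role of an x_start produced by a trainable encoder.
output = torch.randn(4, requires_grad=True)
target = torch.randn(4, requires_grad=True)

# Case 1: the target is used as-is, so it receives a gradient too.
loss = ((target - output) ** 2).mean()
loss.backward()
grad_output = output.grad.clone()
grad_target = target.grad.clone()  # equal to -grad_output

# Case 2: the target is detached, so only the output receives a gradient.
output.grad = None
target.grad = None
loss_detached = ((target.detach() - output) ** 2).mean()
loss_detached.backward()
target_grad_after_detach = target.grad  # stays None
```

In the first case the gradient on the target is the exact negative of the gradient on the output, so the loss pulls both sides toward each other. Note that in a real latent diffusion setup the noised input is also computed from x_start, so detaching only the target would not by itself cut every gradient path back to the encoder.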

YiyanXu commented 2 months ago

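Sorry, I misunderstood your question earlier. In L-DiffRec, x_start does indeed depend on learnable parameters, and these overlap with the learnable parameters involved in model_output, which may make training unstable. Thanks for pointing this out. The best strategy would be to train the autoencoder on its own first, then freeze the autoencoder's parameters and train the diffusion part separately.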
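The two-stage strategy described above could look roughly like the following. This is a minimal sketch under loud assumptions: `TinyAutoencoder`, `denoiser`, the fixed 0.9/0.1 noising step, and the plain reconstruction loss are all hypothetical stand-ins, not the repository's VAE, diffusion model, or noise schedule.

```python
import torch
from torch import nn

torch.manual_seed(0)

class TinyAutoencoder(nn.Module):
    """Hypothetical stand-in for the real autoencoder."""
    def __init__(self, n_items=8, latent_dim=4):
        super().__init__()
        self.encoder = nn.Linear(n_items, latent_dim)
        self.decoder = nn.Linear(latent_dim, n_items)

    def forward(self, x):
        return self.decoder(self.encoder(x))

autoencoder = TinyAutoencoder()
denoiser = nn.Linear(4, 4)   # toy stand-in for the diffusion network
batch = torch.rand(16, 8)    # toy interaction batch

# Stage 1: train the autoencoder alone on a reconstruction loss.
opt_ae = torch.optim.Adam(autoencoder.parameters(), lr=1e-2)
for _ in range(5):
    opt_ae.zero_grad()
    recon_loss = ((autoencoder(batch) - batch) ** 2).mean()
    recon_loss.backward()
    opt_ae.step()

# Stage 2: freeze the autoencoder and train only the diffusion part.
autoencoder.requires_grad_(False)
autoencoder.eval()
frozen_weight = autoencoder.encoder.weight.clone()

opt_diff = torch.optim.Adam(denoiser.parameters(), lr=1e-2)
for _ in range(5):
    opt_diff.zero_grad()
    with torch.no_grad():
        x_start = autoencoder.encoder(batch)  # fixed target, no graph attached
    noise = torch.randn_like(x_start)
    x_t = 0.9 * x_start + 0.1 * noise         # toy noising, not a real schedule
    diff_loss = ((x_start - denoiser(x_t)) ** 2).mean()
    diff_loss.backward()
    opt_diff.step()
```

Because x_start is computed under `torch.no_grad()` and the second optimizer only holds the denoiser's parameters, the MSE target stays fixed throughout stage 2 and the encoder weights are not touched.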

akajinchen commented 2 months ago

Thanks for your reply, this clears up my confusion.