duanyiqun / DiffusionDepth

PyTorch implementation introducing a diffusion approach to 3D depth perception (ECCV 2024)
https://arxiv.org/abs/2303.05021
Apache License 2.0
306 stars 17 forks

About depth mask in training #8

Closed HencyChen closed 1 year ago

HencyChen commented 1 year ago

Hi @erjanmx and @duanyiqun ,

Thanks for the great work. It's fantastic to see work that combines currently popular diffusion models with depth estimation.

I would like to reproduce your results. However, while tracing how the depth mask is used, I found that it is commented out in the model architecture (the Swin Transformer backbone and its corresponding HAHI head), and the loss appears to be computed over all regions rather than only the masked ones.

Isn't the depth mask necessary during training? And without it, how is noise added to the GT depth maps in the diffusion process?

Thanks again!

duanyiqun commented 1 year ago

Hi there. At an earlier stage, I applied depth masks in both the latent space and the depth space. The commented-out part should be the latent-space mask. As we mentioned in another issue, we found that this double constraint is not necessary; applying a mask in depth space alone is enough. In that case, the mask only takes effect when the loss is computed. As for adding noise: as we mention in the paper, the noise is added to the refined depth map itself, not to the GT depth.
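
For readers following along, here is a minimal sketch of the two ideas in this reply: a loss restricted to valid pixels in depth space, and noise added to the refined depth map rather than to the GT depth. It is not the repository's code. The function names, the convention that invalid GT pixels are stored as 0, and the use of the standard DDPM forward step are all assumptions made for illustration.

```python
import torch


def masked_l1_loss(pred, gt, min_depth=1e-3):
    """L1 loss averaged only over pixels with valid ground truth (hypothetical helper)."""
    # Assumption: invalid ground-truth pixels are stored as 0 (common for sparse depth).
    mask = gt > min_depth
    if not mask.any():
        # No valid pixels: return a zero that still participates in autograd.
        return pred.sum() * 0.0
    return (pred - gt).abs()[mask].mean()


def add_noise(refined_depth, alpha_bar_t):
    """Standard DDPM forward step, applied to the refined depth map (assumed form)."""
    noise = torch.randn_like(refined_depth)
    # x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps
    noisy = alpha_bar_t ** 0.5 * refined_depth + (1.0 - alpha_bar_t) ** 0.5 * noise
    return noisy, noise
```

Because the noise is applied to a dense refined depth map, no mask is needed at the noising step in this sketch; the mask only matters where predictions are compared against sparse GT.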

HencyChen commented 1 year ago

I see. Thanks for your reply.

lhiceu commented 1 year ago

Hi @duanyiqun! I am still confused about depth_mask. I traced the code in the 'Diffusion_DCbase_Loss' and 'DDIMDepthEstimate_Swin_ADDHAHI' classes, but found that depth_mask is not used in ddim_loss, the L1 loss, or the L2 loss. Could you point out where depth_mask takes effect?
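
One generic way to test whether a mask takes effect in a given loss is to change the predictions only at invalid pixels and see whether the loss moves. The sketch below assumes a loss callable of the form `loss_fn(pred, gt)` and a boolean `valid_mask`; both names are hypothetical and do not correspond to the repository's classes.

```python
import torch


def loss_ignores_invalid_pixels(loss_fn, pred, gt, valid_mask):
    """Return True if changing predictions at invalid pixels leaves the loss unchanged."""
    perturbed = pred.clone()
    # Large change applied only where the ground truth is invalid.
    perturbed[~valid_mask] += 10.0
    return torch.isclose(loss_fn(pred, gt), loss_fn(perturbed, gt)).item()
```

A plain `torch.nn.functional.l1_loss` would fail this check, while a loss that indexes with the mask before averaging would pass it.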