duanyiqun / DiffusionDepth

PyTorch implementation introducing a diffusion approach to 3D depth perception
https://arxiv.org/abs/2303.05021
Apache License 2.0
293 stars · 16 forks

Question about DDIM loss #26

Open zyp-byte opened 1 year ago

zyp-byte commented 1 year ago

Hi, thanks for your wonderful work! When I was reading the code, I noticed that you apply the time embedding to the features extracted from the RGB images. I am wondering whether it would be better to apply the time embedding to the depth output by the decoder (namely `refined_depth` in your code), or to the annotated depth with masks. Thanks again for your work and code!

duanyiqun commented 1 year ago

Hi, thanks for your question. At the time, our thinking was to add the time embedding to a dense and consistent feature, which is closer to the original diffusion model. I haven't tried putting the time embedding directly on the depth map. Have you made any attempts at that? If the results are positive, I'd be keen to work on an improved version with you.
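For readers following along, the design being described is the standard diffusion-model pattern: compute a sinusoidal timestep embedding and broadcast-add it to a dense feature map rather than to the sparse/annotated depth. The sketch below is a minimal, framework-agnostic illustration in numpy; the function names and the `(C, H, W)` feature shape are assumptions for illustration, not the repo's actual API.

```python
import numpy as np

def timestep_embedding(t, dim, max_period=10000):
    """Standard sinusoidal timestep embedding (DDPM/Transformer style)."""
    half = dim // 2
    freqs = np.exp(-np.log(max_period) * np.arange(half) / half)
    args = t * freqs                                     # (half,)
    return np.concatenate([np.cos(args), np.sin(args)])  # (dim,)

def add_time_to_feature(feat, t):
    """Broadcast the embedding over the spatial dims of a dense feature map.

    feat: (C, H, W) feature extracted from the RGB image (hypothetical shape).
    """
    emb = timestep_embedding(t, feat.shape[0])  # (C,)
    return feat + emb[:, None, None]            # broadcast over H and W

# Toy usage: every spatial location of a channel receives the same offset.
feat = np.zeros((8, 4, 4))
out = add_time_to_feature(feat, t=10)
```

Because the embedding is constant over the spatial dimensions, it conditions every location of the dense feature consistently, which is harder to guarantee on a sparse or masked depth map.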

VLadImirluren commented 12 months ago

> Hi, thanks for your question. At the time, our thinking was to add the time embedding to a dense and consistent feature, which is closer to the original diffusion model. I haven't tried putting the time embedding directly on the depth map. Have you made any attempts at that? If the results are positive, I'd be keen to work on an improved version with you.

Hi! Thanks for your nice work. I noticed that you choose to predict x0 instead of the noise, as DDPM does. Can you share the reason with me? Thanks again~
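For context on the question: the two parameterizations are mathematically interchangeable, since x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps lets you recover one prediction from the other; the choice mainly affects what the network finds easier to regress (a depth map is a natural x0 target). A minimal numpy sketch of a deterministic DDIM step written in terms of a predicted x0 (variable names and scalar alpha-bar values are illustrative, not from the repo):

```python
import numpy as np

def ddim_step_from_x0(x_t, x0_pred, ab_t, ab_prev):
    """Deterministic DDIM update (eta = 0) parameterized by a predicted x0.

    ab_t / ab_prev are the cumulative alpha-bar at the current and previous
    timesteps. Recover the noise implied by x0_pred, then re-noise x0_pred
    down to the previous noise level.
    """
    eps = (x_t - np.sqrt(ab_t) * x0_pred) / np.sqrt(1.0 - ab_t)
    return np.sqrt(ab_prev) * x0_pred + np.sqrt(1.0 - ab_prev) * eps

# Toy check: build x_t from a known (x0, eps) pair, step with the x0 form.
x0_true = np.array([1.0, -2.0])
eps_true = np.array([0.5, 0.3])
ab_t, ab_prev = 0.5, 0.8
x_t = np.sqrt(ab_t) * x0_true + np.sqrt(1.0 - ab_t) * eps_true
x_prev = ddim_step_from_x0(x_t, x0_true, ab_t, ab_prev)
```

With a perfect x0 prediction, this lands exactly where an eps-parameterized DDIM step would, which is why predicting x0 is a design choice rather than a change to the sampler itself.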