duanyiqun / DiffusionDepth

PyTorch Implementation of introducing diffusion approach to 3D depth perception ECCV 2024
https://arxiv.org/abs/2303.05021
Apache License 2.0
306 stars, 17 forks

Question about model train #20

Closed. ipwjincheng closed this issue 10 months ago

ipwjincheng commented 1 year ago

Hi, I would like to ask whether, during training, the initial input xT of the network is obtained by random sampling or by adding noise to the ground-truth depth.

I noticed that you use a combination of losses in the code. Is it possible to restore pure noise through L1 and L2 losses? I really don't understand this part.
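
For reference, the two options in the question can be written in the usual DDPM/DDIM notation. This is only a sketch of the standard formulation: the symbols \bar{\alpha}_t (cumulative noise schedule) and \epsilon (Gaussian noise) come from the diffusion literature and are assumptions here, not names taken from this repository's code.

```latex
% Option 1: start from pure noise
x_T \sim \mathcal{N}(0, I)

% Option 2: ground-truth depth x_0 plus noise (forward diffusion at step t)
x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I)
```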

ipwjincheng commented 1 year ago

It looks like in your code you sample random noise and then step through the discrete timestep sequence 0, 50, ..., 950 to predict x0. You then add noise to the x0 obtained this way and run the denoising process again to get x'0. The loss is calculated as the L1 and L2 losses between x0 and the true depth, plus the DDIM loss between x0 and x'0. I don't know if I'm getting that right.
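
The process described above can be sketched in plain Python as a toy example. It is not the repository's implementation: the linear beta schedule, the `oracle_eps` stand-in for the trained network, the 4-value "depth map", and the use of mean squared error for both the L2 and the DDIM term are all assumptions made for illustration.

```python
import math
import random

def make_alpha_bars(num_train_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bars, prod = [], 1.0
    for t in range(num_train_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_train_steps - 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars

def oracle_eps(x_t, ab_t, target):
    """Hypothetical stand-in for the network: the exact noise that maps target to x_t."""
    return [(x - math.sqrt(ab_t) * d) / math.sqrt(1.0 - ab_t) for x, d in zip(x_t, target)]

def predict_x0(x_t, eps, ab_t):
    """Recover the x0 estimate implied by a noise prediction."""
    return [(x - math.sqrt(1.0 - ab_t) * e) / math.sqrt(ab_t) for x, e in zip(x_t, eps)]

def ddim_sample(target, alpha_bars, timesteps, rng):
    """Deterministic DDIM (eta = 0) from pure noise down the given timesteps."""
    x = [rng.gauss(0.0, 1.0) for _ in target]  # xT: pure Gaussian noise
    x0 = x
    for i, t in enumerate(timesteps):
        ab_t = alpha_bars[t]
        eps = oracle_eps(x, ab_t, target)
        x0 = predict_x0(x, eps, ab_t)
        # After the last timestep there is no noise left (alpha_bar = 1).
        ab_prev = alpha_bars[timesteps[i + 1]] if i + 1 < len(timesteps) else 1.0
        x = [math.sqrt(ab_prev) * p + math.sqrt(1.0 - ab_prev) * e for p, e in zip(x0, eps)]
    return x0

def add_noise(x0, ab_t, rng):
    """Forward diffusion: noise x0 to the level of timestep t."""
    return [math.sqrt(ab_t) * v + math.sqrt(1.0 - ab_t) * rng.gauss(0.0, 1.0) for v in x0]

def l1(a, b):
    return sum(abs(u - v) for u, v in zip(a, b)) / len(a)

def mse(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)

rng = random.Random(0)
alpha_bars = make_alpha_bars()
timesteps = list(range(0, 1000, 50))[::-1]  # 950, 900, ..., 0
gt_depth = [1.5, 2.0, 3.25, 0.75]           # toy ground-truth depth values

# 1) Denoise from pure noise to an x0 estimate.
x0 = ddim_sample(gt_depth, alpha_bars, timesteps, rng)

# 2) Re-noise x0 at a random timestep and denoise once more to get x'0.
t = rng.choice(timesteps)
x_t = add_noise(x0, alpha_bars[t], rng)
x0_prime = predict_x0(x_t, oracle_eps(x_t, alpha_bars[t], gt_depth), alpha_bars[t])

# 3) Combine the losses: L1 + L2 against the true depth, plus a term between x0 and x'0.
loss = l1(x0, gt_depth) + mse(x0, gt_depth) + mse(x0, x0_prime)
```

Because `oracle_eps` already knows the target, the loss here is essentially zero. With a real network, the noise prediction would be learned, and the three terms would supply the training signal.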

duanyiqun commented 1 year ago

Sorry for the late reply. I didn't fully understand the first question earlier, but the process you described looks right. Please refer to the DDIM paper for more details. Cheers.