duanyiqun / DiffusionDepth

PyTorch Implementation of introducing diffusion approach to 3D depth perception
https://arxiv.org/abs/2303.05021
Apache License 2.0

About loss function #28

HencyChen closed this issue 1 year ago

HencyChen commented 1 year ago

Hi @erjanmx and @duanyiqun,

Thanks for the great work.

According to the paper, the loss function is composed of a pixel loss, a latent loss, and the regular DDIM loss. However, in the code I can only find the DDIM loss and an L1/L2 loss between the predicted depth and the GT. Am I misunderstanding something, or have I just failed to find the corresponding loss function?

Thanks again !

duanyiqun commented 1 year ago

Hi Chen, In our later experiments, using the latent loss gave a small improvement, but it can make the training weights collapse badly if the model has not been pre-trained. So we kept only the pixel loss and the DDIM loss in the public release. Within the scope of our experiments, the pixel loss is enough to supervise the latent space. Best regards
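
For readers comparing the paper with the code, the difference described above can be written as a weighted sum. This is only a sketch: the symbols and the weights $\lambda_i$ are illustrative notation assumed here, not copied from the paper or the repository.

```latex
\mathcal{L}_{\text{paper}}
  = \lambda_{1}\,\mathcal{L}_{\text{ddim}}
  + \lambda_{2}\,\mathcal{L}_{\text{pixel}}
  + \lambda_{3}\,\mathcal{L}_{\text{latent}},
\qquad
\mathcal{L}_{\text{released}}
  = \lambda_{1}\,\mathcal{L}_{\text{ddim}}
  + \lambda_{2}\,\mathcal{L}_{\text{pixel}}
```

In this notation, the public code corresponds to dropping the latent term, which is the same as setting $\lambda_{3} = 0$.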

HencyChen commented 1 year ago

Hi @duanyiqun,

Thanks for the reply. In that case, what do you mean by "without pre-training"? Do you initialize from any pre-trained model weights?

duanyiqun commented 1 year ago

We use a pre-trained Swin Transformer Large (384) as the backbone. Here, however, "pre-training" refers to depth pre-training with the pixel loss.

HencyChen commented 1 year ago

We have seen swin-large used as the pre-trained weights in the code. But what do you mean by "depth pre-training" with the pixel loss? I could not find any description of depth pre-training in the paper. Did I miss something?

Thanks!

duanyiqun commented 1 year ago

It just means training this model with the current loss first, and then adding the latent loss once the diffusion depth model has been roughly trained.
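
A minimal sketch of that two-stage schedule, for illustration only. It is not taken from the repository: the function names, `warmup_steps`, and the default `latent_weight` are hypothetical, and the three loss values are assumed to be computed elsewhere (plain floats or PyTorch scalar tensors both work).

```python
def loss_weights(step, warmup_steps, latent_weight=0.1):
    """Return (pixel_w, ddim_w, latent_w) for a training step.

    Hypothetical schedule: the latent term is switched off during the
    first `warmup_steps` steps (the "depth pre-training" stage) and
    switched on with a fixed weight afterwards.
    """
    if step < 0 or warmup_steps < 0:
        raise ValueError("step and warmup_steps must be non-negative")
    latent_w = latent_weight if step >= warmup_steps else 0.0
    return 1.0, 1.0, latent_w


def total_loss(pixel_loss, ddim_loss, latent_loss, step, warmup_steps,
               latent_weight=0.1):
    """Combine the losses according to the staged schedule above."""
    pixel_w, ddim_w, latent_w = loss_weights(step, warmup_steps, latent_weight)
    total = pixel_w * pixel_loss + ddim_w * ddim_loss
    # Skip the latent term entirely while it is disabled, so an unstable
    # (e.g. NaN) latent loss cannot leak into the total during warm-up.
    if latent_w > 0:
        total = total + latent_w * latent_loss
    return total
```

During the first stage the total is just the pixel loss plus the DDIM loss, which matches the public code. How long the first stage should last is not stated in this thread, so `warmup_steps` would have to be chosen empirically.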

HencyChen commented 1 year ago

Got it. Thanks for the quick reply^^