chenhsuanlin / signed-distance-SRN

SDF-SRN: Learning Signed Distance 3D Object Reconstruction from Static Images 🎯 (NeurIPS 2020)

Further detail on ray_intersection_loss and ray_freespace_loss #8

Closed · albertotono closed this issue 3 years ago

albertotono commented 3 years ago

Great work on the paper; it has been a pleasure to read it and test the code. While perusing the code, I wanted to better understand the loss functions you used.

Specifically these (Reference 7 in the paper, implemented as ray_intersection_loss and ray_freespace_loss):

[Screenshot of the corresponding loss formulation from the paper]

Copying the code below for reference:

    def ray_intersection_loss(self,opt,var,level_eps=0.01):
        batch_size = len(var.idx)
        level_in = var.level_all[...,-1:] # level set value at the last traced step, [B,HW,1]
        # importance weighting by inverse distance transform; 1e-8 guards against division by zero
        weight = 1/(var.dt_input+1e-8) if opt.impl.importance else None
        if opt.impl.occup:
            loss = self.BCE_loss(level_in,var.mask_input,weight=weight)
        else:
            # inside the mask: penalize level_in > -level_eps; outside the mask: penalize level_in < level_eps
            loss = self.L1_loss((level_in+level_eps).relu_(),weight=weight,mask=var.mask_input.bool()) \
                  +self.L1_loss((-level_in+level_eps).relu_(),weight=weight,mask=~var.mask_input.bool())
        return loss

    def ray_freespace_loss(self,opt,var,level_eps=0.01):
        level_out = var.level_all[...,:-1] # level set values at all intermediate steps, [B,HW,N-1]
        if opt.impl.occup:
            loss = self.BCE_loss(level_out,torch.tensor(0.,device=opt.device))
        else:
            # all intermediate points should remain in free space: penalize level_out < level_eps
            loss = self.L1_loss((-level_out+level_eps).relu_())
        return loss

I am not able to fully understand how this compares to the formula in the paper:

  1. Why do you add 1e-8? Is it supposed to be the epsilon from the paper? I thought level_eps was that one.
  2. What exactly are level_in and level_eps, and how do they map to the formula?

If you could expand on this topic, it would be great.

PS: Why do some shapes show that kind of wavy pattern? Is it because of the eikonal term? And why did you use MSE_loss there?

Thanks in advance for your help and availability.

chenhsuanlin commented 3 years ago

Sorry for the confusion. The 1e-8 term in the denominator is for numerical stability (it avoids division by zero, since the distance transform is 0 at the silhouette boundary). It's common practice, so it wasn't explicitly mentioned in the paper. The epsilon in the paper corresponds to level_eps. level_in is the level set prediction at the last traced step (i.e. the last slice of var.level_all). Basically we want

    level_in <= -level_eps   for pixels inside the mask (the ray should end up inside the object)
    level_in >=  level_eps   for pixels outside the mask (the ray should stay in free space)

This is the same as minimizing

    max(level_in+level_eps, 0)    inside the mask
    max(-level_in+level_eps, 0)   outside the mask

which is exactly what the two ReLU'd L1 terms in ray_intersection_loss compute.
Please also see Fig 2(b) in the paper -- these correspond to the red and green dots.
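
To make the equivalence concrete, here is a minimal self-contained sketch (plain PyTorch with made-up values; this is an illustration, not code from the repo):

    import torch

    # hypothetical SDF values at the last traced point along three rays
    level_in = torch.tensor([-0.5, 0.03, 0.4])
    mask = torch.tensor([True, True, False])  # rays 0,1 inside the silhouette; ray 2 outside
    eps = 0.01

    inside_penalty = (level_in+eps).relu()    # nonzero only where level_in > -eps
    outside_penalty = (-level_in+eps).relu()  # nonzero only where level_in < eps

    loss = inside_penalty[mask].mean() + outside_penalty[~mask].mean()
    print(inside_penalty)   # tensor([0.0000, 0.0400, 0.4100]) -> ray 1 violates the "inside" constraint
    print(outside_penalty)  # tensor([0.5100, 0.0000, 0.0000]) -> ray 2 satisfies the "outside" constraint
    print(loss)             # tensor(0.0200)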

The wavy artifacts are believed to come from the positional encoding component; please see more discussion in #3.

The MSE loss for the eikonal term is just something that tries to bring the gradient norm to 1. I guess other loss functions would serve similar purposes, but MSE loss is usually the default choice unless there are specific reasons not to use it.
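
For reference, an eikonal regularizer of this kind typically looks like the following minimal sketch (sdf_net and points are placeholder names, not the repo's actual interface):

    import torch

    def eikonal_loss(sdf_net, points):
        # points: [N,3] query coordinates; track gradients w.r.t. the inputs
        points = points.detach().requires_grad_(True)
        sdf = sdf_net(points)  # [N,1] predicted signed distances
        grad, = torch.autograd.grad(sdf.sum(), points, create_graph=True)
        # a true SDF has unit gradient norm everywhere, so supervise the norm toward 1
        return ((grad.norm(dim=-1)-1)**2).mean()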

Hope these help!

albertotono commented 3 years ago

Thanks for the quick and kind reply. Yes, this confirms my assumptions, as mentioned in the SIREN paper:

"Further, we show how Sirens can be leveraged to solve challenging boundary value problems, such as particular Eikonal equations (yielding signed distance functions), the Poisson equation, and the Helmholtz and wave equations. Lastly, we combine Sirens with hypernetworks to learn priors over the space of Siren functions."

Thank you so much for pointing to that discussion.

Regarding the MSE, I am confused: doesn't it usually bring the value to 0? Am I wrong?

chenhsuanlin commented 3 years ago

It's basically supervising the prediction with the value 1 using MSE. Please see Equation 10 in the paper and also https://github.com/chenhsuanlin/signed-distance-SRN/blob/main/model/sdf_srn.py#L200 🙂
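
In other words, the target of the MSE is a tensor of ones rather than zeros; schematically (a sketch, not the exact line from the repo):

    # grad_norm: per-point norm of the predicted SDF gradient
    eikonal = torch.nn.functional.mse_loss(grad_norm, torch.ones_like(grad_norm))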

albertotono commented 3 years ago

Thank you for confirming my assumptions, much appreciated.