HengyiWang / Co-SLAM

[CVPR'23] Co-SLAM: Joint Coordinate and Sparse Parametric Encodings for Neural Real-Time SLAM
https://hengyiwang.github.io/projects/CoSLAM.html
Apache License 2.0

About smoothness term #31

Open shaoxiang777 opened 1 year ago

shaoxiang777 commented 1 year ago

Hi Hengyi,

Thanks for your excellent work! I have the following questions about the smoothness term.

  1. The explanation in the paper is not very clear. Can you explain in more detail why it works? Why do we want the feature-metric difference between adjacent sampled vertices on the hash grid to tend to zero? Suppose one sample point is actually on the surface; its adjacent points inside and outside the surface could be empty air. If we push the feature-metric difference between them to zero, this will damage the reconstruction at this surface point. (I have replaced the feature planes with a hash grid in ESLAM and added this smoothness term with different weights. I find that some reconstructed surface is lost compared with not using this term, especially on ScanNet.)

  2. How do you estimate the bound of the scene? I notice that for Replica room0 the bound differs between papers: NICE-SLAM uses [[-2.9,8.9],[-3.2,5.5],[-3.5,3.3]], ESLAM uses [[-1.9,7.9],[-2.2,4.5],[-2.5,2.3]], and Co-SLAM uses [[-1.0,7.0],[-1.3,3.7],[-1.7,1.4]].

The effect of the smoothness term is influenced by the bound. The following shows the result if I use the bound [[-1.9,7.9],[-2.2,4.5],[-2.5,2.3]]. There are some artifacts outside the room.

image

  3. In Go-Surf you propose an SDF-gradient-based smoothness term. Compared to that, what advantage does this feature-metric smoothness term bring? By the way, why is this term only used in the mapping process?

Thank you very much in advance!

HengyiWang commented 1 year ago

Hi @shaoxiang777, thank you for your interest in our work.

  1. The smoothness regularisation is mostly used to remove floaters in unobserved regions and to make the features more compact. We usually set its weight to be quite small. Generally speaking, the trade-off between data terms and regularisation terms is always as you describe. Also, please note that we use a joint encoding: the decoder decodes the coordinate encoding together with the parametric encoding, and the regularisation is performed in feature space, not directly on the SDF. Thus, we did not observe it affecting the surface reconstruction in our experiments.
  2. The scene bound usually does not affect the reconstruction. There are two bounds, one for marching cubes and one for the representation. Can you check that you set them correctly? Also, you can set the margin to 0 in the smoothness regularisation to remove the artifacts around the scene bound.
  3. Go-Surf regularises the surface normal, which requires second-order derivatives that are expensive for an online SLAM system, and it targets the smoothness of the surface: the regularisation is performed on the gradient of the SDF instead of on features. We did not use the regularisation term in tracking because the feature grid is not updated in tracking, and it may not make sense to involve a randomly sampled sub-grid in the tracking process.
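
As a rough illustration of the feature-space smoothness term discussed above, the following NumPy sketch places a small sub-grid at a random position inside the scene bound (respecting a margin) and penalises squared feature differences between adjacent vertices. It is a minimal sketch under stated assumptions, not the Co-SLAM implementation: `sample_subgrid`, `feature_smoothness`, and `toy_features` are hypothetical names, and `toy_features` merely stands in for the hash-grid feature lookup.

```python
import numpy as np


def sample_subgrid(bound, n, voxel_size, margin, rng):
    """Return an (n, n, n, 3) array of vertex positions of a randomly placed sub-grid.

    bound is a 3x2 list of [min, max] per axis; the sub-grid stays at least
    `margin` away from every face of the bound.
    """
    bound = np.asarray(bound, dtype=float)
    extent = (n - 1) * voxel_size  # side length of the sub-grid
    free = bound[:, 1] - bound[:, 0] - extent - 2 * margin
    if np.any(free < 0):
        raise ValueError("sub-grid plus margin does not fit inside the bound")
    origin = bound[:, 0] + margin + rng.random(3) * free
    idx = np.stack(np.meshgrid(*[np.arange(n)] * 3, indexing="ij"), axis=-1)
    return origin + idx * voxel_size


def feature_smoothness(feats):
    """Sum of squared differences between adjacent vertices, per sampled vertex.

    feats has shape (n, n, n, C): one C-dimensional feature per vertex.
    """
    dx = feats[1:, :, :] - feats[:-1, :, :]
    dy = feats[:, 1:, :] - feats[:, :-1, :]
    dz = feats[:, :, 1:] - feats[:, :, :-1]
    total = (dx ** 2).sum() + (dy ** 2).sum() + (dz ** 2).sum()
    return total / feats[..., 0].size


def toy_features(pts):
    """Hypothetical stand-in for a hash-grid lookup: (..., 3) -> (..., 2)."""
    return np.stack([np.sin(pts[..., 0]), pts[..., 1] * pts[..., 2]], axis=-1)


rng = np.random.default_rng(0)
bound = [[-1.0, 7.0], [-1.3, 3.7], [-1.7, 1.4]]  # Replica room0 bound quoted in this thread
pts = sample_subgrid(bound, n=16, voxel_size=0.1, margin=0.05, rng=rng)
loss = feature_smoothness(toy_features(pts))
print(f"smoothness loss: {loss:.6f}")
```

In a real system this value would be multiplied by a small weight and added to the mapping loss. In this sketch, `margin=0` lets the sub-grid reach the faces of the bound, which is one possible reading of the suggestion above about artifacts near the boundary.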

Feel free to reach out if you have any further questions:)

shaoxiang777 commented 1 year ago

Hi @HengyiWang, thank you for your quick response!

  1. Yes, I totally agree with you. The smoothness regularisation is performed in feature space, not directly on the SDF, but it will influence the SDF indirectly. As you say in the paper, coordinate encoding brings hole-filling ability, which I think is a very important additional feature for SDF estimation.

The following two images show the result when I replace the feature planes with a hash grid in ESLAM (without coordinate encoding). The left is the result without the smoothness term; the right uses the smoothness term. I found that the right one misses some reconstructed surfaces.

This would explain why you did not observe it affecting the surface reconstruction in your experiments: the smoothness term does influence the SDF surface reconstruction, while the coordinate encoding plays an important role in recovering it. Maybe that is a reasonable explanation.

  2. The bound settings I compared:

 bound: [[-0.1,8.6],[-0.1,8.9],[-0.3,3.3]] 
 marching_cubes_bound: [[-0.1,8.6],[-0.1,8.9],[-0.3,3.3]]     # bound size in coslam
 bound: [[-2.0,11.0],[-2.0,11.5],[-2.0,5.5]]
 marching_cubes_bound: [[-2.0,11.0],[-2.0,11.5],[-2.0,5.5]]  # bound size in nice-slam

The left image shows the result when I use the smaller bound for ScanNet scene0000. When I enlarge the bound in Co-SLAM while keeping all other settings the same, some unknown artifacts appear around the scene bound.

Can you point out why this happens? And how did you choose the smaller bound in Co-SLAM compared to NICE-SLAM?
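
The thread does not say how any of these bounds were derived. One common way to get a starting value, shown here purely as an assumption and not as the method used by Co-SLAM, NICE-SLAM, or ESLAM, is to take the axis-aligned bounding box of some 3D points known to lie in the scene (for example back-projected depth samples) and pad it. The helper name `padded_bound` and the padding value are hypothetical.

```python
def padded_bound(points, padding):
    """Axis-aligned bounding box of 3D points, expanded by `padding` on every side.

    Returns [[xmin, xmax], [ymin, ymax], [zmin, zmax]], the same layout as the
    `bound` entries quoted in this thread.
    """
    pts = list(points)
    if not pts:
        raise ValueError("need at least one point")
    bound = []
    for axis in range(3):
        values = [p[axis] for p in pts]
        bound.append([min(values) - padding, max(values) + padding])
    return bound


# Toy points standing in for back-projected depth samples (made up for illustration).
samples = [(0.0, 0.0, 0.0), (6.0, 2.5, 1.0), (3.0, -0.5, -1.5)]
print(padded_bound(samples, padding=0.5))
```

A smaller padding keeps more of the grid inside observed space, at the risk of clipping geometry near the walls; a larger one leaves more unobserved volume around the room.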

Thanks a lot in advance!

HengyiWang commented 1 year ago

Hi @shaoxiang777,

  1. Do you use global BA as in Co-SLAM or local BA as in ESLAM/NICE-SLAM? If you use local BA, then performing regularisation may result in this issue. I would suggest doing some experiments to see if this issue is caused by forgetting.
  2. I see what you mean. This is because when you set up a larger bound while keeping the voxel size of the hash grid, there will be floaters in unobserved regions caused by hash collisions or by the continuity of the representation. For instance, you can see the blue noisy points, which are probably due to hash collisions with the bed area. As we are doing indoor scene reconstruction, we usually do not care about floaters that are outside the inner surface of the room. However, if you really want to remove those floaters outside the actual inner surface, I would suggest tuning the size/weight of the smoothness term to see if it helps. Note that the tracking and surface reconstruction performance should not differ much across different (but reasonable) scene bounds.
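
To make the collision argument above concrete, the sketch below counts the vertices of a single-resolution grid for the two ScanNet bounds quoted earlier in the thread and compares that count with the number of hash-table slots. The voxel size and table size are assumed values chosen for illustration, not settings taken from the Co-SLAM configs, and real multi-resolution hash encodings have several levels rather than one.

```python
import math


def grid_vertex_count(bound, voxel_size):
    """Number of vertices of a regular grid covering `bound` at `voxel_size`."""
    count = 1
    for lo, hi in bound:
        count *= math.ceil((hi - lo) / voxel_size) + 1
    return count


def load_factor(bound, voxel_size, log2_table_size):
    """Average number of grid vertices mapped to each hash-table slot."""
    return grid_vertex_count(bound, voxel_size) / (2 ** log2_table_size)


small = [[-0.1, 8.6], [-0.1, 8.9], [-0.3, 3.3]]     # smaller bound from the thread
large = [[-2.0, 11.0], [-2.0, 11.5], [-2.0, 5.5]]   # larger bound from the thread
VOXEL_SIZE = 0.04      # assumed value, for illustration only
LOG2_TABLE_SIZE = 16   # assumed value, for illustration only

for name, bound in (("small", small), ("large", large)):
    vertices = grid_vertex_count(bound, VOXEL_SIZE)
    factor = load_factor(bound, VOXEL_SIZE, LOG2_TABLE_SIZE)
    print(f"{name}: {vertices} vertices, {factor:.1f} vertices per slot")
```

The larger bound encloses more than four times the volume of the smaller one, so with a fixed voxel size and table size, proportionally more vertices share each slot, and vertices in unobserved space are more likely to alias with observed ones.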

shaoxiang777 commented 1 year ago

Hi @HengyiWang, thank you for your suggestions and explanation! If I find anything new, I will let you know!