ClementPinard / SfmLearner-Pytorch

Pytorch version of SfmLearner from Tinghui Zhou et al.
MIT License

Transforming Depth Map #87

Closed. soulslicer closed this issue 4 years ago.

soulslicer commented 4 years ago

Does this function work exactly the same if we want to transform a depth map from one frame to another (instead of an RGB image)?

ClementPinard commented 4 years ago

You have two problems for depth warping, an easy one and a hard one:

  1. Contrary to pixel color, depth values change when you move. Say you have a point p = (X, Y, Z), corresponding to a depth Z at pixel (u, v). If you move with rotation R and translation T, the new point will be Rp + T, and the new depth will be the last coordinate of this new point, at the new pixel location (u2, v2) (computed with the same warping algorithm as for images; see the sketch after this list).
  2. Occlusion zones are very hard to filter: you will get "ghost zones" where the occluded depth ends up the same as the depth that was occluding it in the first place. This is a common problem in inverse warping, and not filtering may result in even bigger warping problems later if you use these newly created depth maps to make other warpings, or if you use them to ensure depth consistency. I made some comments about this problem in my PhD thesis last year, which you can get here: https://pastel.archives-ouvertes.fr/tel-02285215/document (part 4.5, page 88)
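
A minimal sketch of point 1, assuming a depth map `depth` of shape [H, W], an intrinsics matrix `K`, and a pose (R, T) from source to target camera (the function name and shapes are illustrative, not from this repo):

```python
import torch

def warp_depth(depth, K, R, T):
    # depth: [H, W], K: [3, 3] intrinsics, R: [3, 3], T: [3]
    H, W = depth.shape
    v, u = torch.meshgrid(torch.arange(H, dtype=depth.dtype),
                          torch.arange(W, dtype=depth.dtype),
                          indexing='ij')
    pix = torch.stack([u, v, torch.ones_like(u)], dim=0).reshape(3, -1)
    # back-project each pixel: p = Z * K^-1 * (u, v, 1)
    points = torch.linalg.inv(K) @ pix * depth.reshape(1, -1)
    # rigid motion: p' = R p + T
    points_new = R @ points + T.reshape(3, 1)
    new_depth = points_new[2].reshape(H, W)  # the new Z values
    # project to the new pixel locations (u2, v2)
    proj = K @ points_new
    u2 = (proj[0] / proj[2].clamp(min=1e-6)).reshape(H, W)
    v2 = (proj[1] / proj[2].clamp(min=1e-6)).reshape(H, W)
    return new_depth, u2, v2
```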
soulslicer commented 4 years ago

Thanks for your input.

  1. Yes, you are right. I would then need to write my own warping function that converts my depth map into a point cloud, transforms it, and then projects it into the new frame, right?

  2. As for the problems you describe, they would be there for RGB images too, right? If, say, two pixels in the source image map to the same pixel in the destination image, you want to pick the one that is closer to the camera depth-wise. Does your function handle this?
ClementPinard commented 4 years ago
  1. Yes, it's just an extra step on top of the already existing algorithm, though.
  2. The problem is more or less handled by the explainability mask, but not very well. It turns out it's not that bad on KITTI images (it could be worse on another dataset). But I can say from experience that warping depth is much more error-prone, especially if you use it for inverse warping. I encourage you to try it nevertheless, but be prepared to filter the ghost areas somehow (there are a bunch of techniques in the report I linked above).
soulslicer commented 4 years ago
  1. If I were writing this operation as a for loop, this could easily be solved with an if/else statement: whenever an incoming Z value is smaller than what is already there, it replaces it (since objects in front are what is physically visible). But it doesn't look like this process can be written in a differentiable way. Am I right to say this?
ClementPinard commented 4 years ago

You are talking about direct warping, which is the inverse of inverse warping. But yes, you are right: this would work, but it would not be differentiable.
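
To make the for-loop idea concrete, here is a minimal vectorized z-buffer splat, assuming `new_depth`, `u2`, `v2` come from a projection step like the earlier sketch and PyTorch >= 1.12 for `scatter_reduce_` (all names are illustrative). The hard min over indices is exactly the if/else z-test, hence not differentiable with respect to the pixel positions:

```python
import torch

def zbuffer_splat(new_depth, u2, v2):
    # new_depth, u2, v2: [H, W] warped depths and target pixel coordinates
    H, W = new_depth.shape
    z = new_depth.reshape(-1)
    u = u2.reshape(-1).round().long()
    v = v2.reshape(-1).round().long()
    # drop points behind the camera or projecting outside the image
    keep = (z > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    idx = v[keep] * W + u[keep]
    target = torch.full((H * W,), float('inf'), dtype=new_depth.dtype)
    # 'amin' keeps the smallest (closest) depth per target pixel
    target.scatter_reduce_(0, idx, z[keep], reduce='amin')
    return target.reshape(H, W)
```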

For differentiable direct warping, you might want to have a look at redner or pytorch3d: https://github.com/BachiLi/redner https://github.com/facebookresearch/pytorch3d

soulslicer commented 4 years ago

I am unable to get your function to work on depth maps; for some reason I always get artifacts like this:

[image: warped depth map with artifacts]

The function works perfectly with RGB images, but somehow fails when I pass in a sparse depth map. Any clue why this might be the case? This is very strange, because the only place where the input image gets changed is in the grid_sample function.

I tried manually doing my own projection and everything looks correct, yet when applying this function I get a lot of noise.

EDIT: I have realized that this is happening because the depth we pass in inherently has holes in it (which shouldn't happen during training time).
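
For anyone hitting the same issue, one hedged workaround, assuming holes are encoded as zeros: warp a validity mask alongside the depth and discard any output pixel whose bilinear support touched a hole (the function name and shapes are illustrative):

```python
import torch
import torch.nn.functional as F

def sample_sparse_depth(depth, grid):
    # depth: [B, 1, H, W] with zeros at missing pixels
    # grid: [B, H, W, 2] normalized sampling grid, as used by grid_sample
    valid = (depth > 0).float()
    warped = F.grid_sample(depth, grid, align_corners=False)
    warped_valid = F.grid_sample(valid, grid, align_corners=False)
    # bilinear sampling blends holes (zeros) into real depths;
    # only trust pixels whose entire support was valid
    mask = warped_valid > 0.999
    return warped * mask, mask
```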

ClementPinard commented 4 years ago

What was your depth like in the first place? The ghosting effect surely shouldn't cause that kind of agglutination in front of the lens.