EPFL-VILAB / omnidata

A Scalable Pipeline for Making Steerable Multi-Task Mid-Level Vision Datasets from 3D Scans [ICCV 2021]

What is the coordinate for surface normal vector? #13

Closed jkhong99 closed 2 years ago

jkhong99 commented 2 years ago

Hello, could you let us know which coordinate system the network is using? I have checked the network output, but the surface normal vectors do not seem to be aligned with (left wall, right wall, floor). Also, could you let me know how I can align your surface normal output (x, y, z) with the (left wall, right wall, floor) directions?

alexsax commented 2 years ago

Hi @jkhong99, great question!

The surface normals in omnidata are encoded the same way as in 2D3DS. The normals are stored as RGB images with values in [0, 1] that should be remapped to [-1, 1], and a vector of (0, 0, 1) faces toward the camera. Here's the dataloader transform that I use to make things work with PyTorch3D.

from torchvision import transforms

def transform_normal_cam():
  '''
     2D3DS space: +X right, +Y down, +Z from screen to me
     Pytorch3D space: +X left, +Y up, +Z from me to screen
  '''
  totensor = transforms.ToTensor()
  def _thunk(x):
    # Map [0, 1] RGB values to [-1, 1] and negate every channel
    x2 = -(totensor(x) - 0.5) * 2.0
    # Negate the last (z) channel again, so only x and y end up flipped overall
    x2[-1,...] *= -1
    return x2
  return _thunk
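For reference, here is a minimal usage sketch of the transform above, assuming the normal map is stored as an 8-bit RGB PNG (the file name is a placeholder, not an actual omnidata path):

from PIL import Image

transform = transform_normal_cam()
normal_img = Image.open('normal_example.png')  # placeholder path to a stored normal map
normal = transform(normal_img)                 # 3 x H x W tensor with values in [-1, 1]
print(normal.shape, normal.min().item(), normal.max().item())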

XinyaChen21 commented 1 year ago

Why is the z component of tensor x multiplied by -1 twice?

lcc815 commented 1 year ago

> Why is the z component of tensor x multiplied by -1 twice?

@alexsax I have the same question: why is the z component of tensor x multiplied by -1 twice? If I understand correctly, the +Z directions of 2D3DS and PyTorch3D are opposite, so shouldn't we multiply by -1 only once?
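
For concreteness, a small sketch of what the transform's arithmetic does to a single decoded normal (the sample vector is made up): the net effect of the two negations is (x, y, z) -> (-x, -y, z), i.e. only x and y are flipped.

import torch

n_2d3ds = torch.tensor([0.0, 0.0, 1.0])  # made-up normal facing the camera in the 2D3DS encoding

# Same arithmetic as _thunk, written out on a vector already remapped to [-1, 1]
n = -n_2d3ds   # negate all three components
n[-1] *= -1    # ...then negate z again, leaving z unchanged overall
print(n)       # tensor([-0., -0., 1.]) -> only x and y are flipped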

alexsax commented 10 months ago

Yes, going by the comment in the code snippet above you would be correct, but I think the comment is simply wrong. I visualized the camera locations and point cloud using PyTorch3D here:

[image: visualization of camera locations and point cloud]
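
For anyone who wants to reproduce that sanity check, here is a minimal sketch of this kind of visualization using PyTorch3D's plotly viewer; the point cloud and camera pose below are random placeholders rather than omnidata values.

import torch
from pytorch3d.structures import Pointclouds
from pytorch3d.renderer import PerspectiveCameras
from pytorch3d.vis.plotly_vis import plot_scene

# Placeholder data: a random point cloud and a default camera at the origin
points = torch.rand(1000, 3) * 2.0 - 1.0
pointcloud = Pointclouds(points=[points])
cameras = PerspectiveCameras(R=torch.eye(3)[None], T=torch.zeros(1, 3))

# Plot the point cloud together with the camera to check the coordinate conventions
fig = plot_scene({"normals sanity check": {"pointcloud": pointcloud, "camera": cameras}})
fig.show()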