Hi @jkhong99, great question!

The surface normals in Omnidata are encoded the same way as in 2D3DS: the images are stored as RGB images with values between 0 and 1, which should be remapped to between -1 and 1, and a vector of (0, 0, 1) faces towards the camera. Here's the dataloader transform I use to make things work with PyTorch3D:
```python
# Assumes torchvision is installed; ToTensor converts an RGB image to a
# float tensor with values in [0, 1].
from torchvision import transforms

def transform_normal_cam():
    '''
    2D3DS space:     +X right, +Y down, +Z from screen to me
    PyTorch3D space: +X left,  +Y up,   +Z from me to screen
    '''
    totensor = transforms.ToTensor()
    def _thunk(x):
        # Remap [0, 1] -> [-1, 1]; the leading minus negates all three axes.
        x2 = -(totensor(x) - 0.5) * 2.0
        # Negate the last (Z) channel back.
        x2[-1, ...] *= -1
        return x2
    return _thunk
```
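For context, here is a minimal usage sketch (the image path is hypothetical) showing the transform's net effect: the leading minus negates all three channels, and the final line negates Z back, so relative to the decoded [-1, 1] 2D3DS normals, X and Y end up negated while Z is unchanged.

```python
# Minimal usage sketch; the image path is hypothetical.
from PIL import Image

normal_img = Image.open("point_0_normal.png").convert("RGB")
to_cam = transform_normal_cam()
normals = to_cam(normal_img)  # tensor of shape (3, H, W), values in [-1, 1]

# Net effect per channel, relative to the decoded 2D3DS normals:
#   X -> -X  (negated once: right -> left)
#   Y -> -Y  (negated once: down  -> up)
#   Z -> +Z  (negated twice: unchanged)
```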
Why is the Z component of the tensor x multiplied by -1 twice?
@alexsax I have the same question: why is the Z component of the tensor x multiplied by -1 twice? If I understand correctly, the +Z directions of 2D3DS and PyTorch3D are opposite, so shouldn't we multiply by -1 only once?
Hello, could you let us know which coordinate system the network uses? I have checked the network output, but the surface normal vectors do not seem to be aligned with the (left wall, right wall, floor) directions. Could you also let me know how to align your surface normal output (x, y, z) with the (left wall, right wall, floor) vectors?