cfernandezlab / CFL

Tensorflow implementation of our end-to-end model to recover 3D layouts. Also with equirectangular convolutions!
GNU General Public License v3.0

How to understand the use of K and inv_K? #12

Closed FocusK closed 4 years ago

FocusK commented 4 years ago

Hello! Thanks for your great work. While reading your code, I got confused about one part. I have seen formulas (3) and (5) in 4.1 EquiConvs Details, and in the code you define a matrix K = np.array([ [focal, 0, c_x], [0, focal, c_y], [0,0,1] ]) and compute rays = np.matmul(inv_K, rays.reshape(3, r_h*r_w)). I guess your aim is to put the coordinates on the sphere? But I don't understand the geometric principle behind it. Could you explain it to me? Thank you very much!

cfernandezlab commented 4 years ago

Hello!

The only thing we are doing there is to define the kernel on the unit sphere.

Imagine that you place the standard kernel on the surface of the sphere, leaving it tangent to the sphere. The kernel center location on the tangent plane is at r/2 (spatial resolution). If instead we adapt the kernel to the surface of the sphere, the kernel center location is at alpha/2 (angular resolution). Taking this into account, we can calculate the distance between the center of the sphere (0,0) and the kernel center location. We then use that distance to define the matrix K, which can be seen as the parameters of the spherical camera.

By multiplying the kernel grid by the inverse of the matrix K (inv_K in the code), we get the kernel rays, i.e. the rays that go from the center of the sphere to each kernel location on the surface of the sphere.
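
For readers who want to see the geometry in numbers, here is a minimal sketch of the idea described above. It is not the repository's code. The function name `kernel_rays`, its parameters, the choice `c_x = c_y = 0`, and the formula `focal = (k_w / 2) / tan(alpha / 2)` are assumptions made for illustration, where `alpha` is taken to be the angle the kernel spans on a panorama `pano_w` pixels wide.

```python
import numpy as np

def kernel_rays(pano_w, k_w, k_h):
    """Unit rays from the sphere center to each kernel sample (illustrative sketch)."""
    # Assumption: the kernel spans k_w pixels out of pano_w, i.e. an angle alpha.
    alpha = k_w * 2.0 * np.pi / pano_w
    # Distance from the sphere center to the tangent plane, in pixel units:
    # a half-width of k_w / 2 on the plane must subtend the angle alpha / 2.
    focal = (k_w / 2.0) / np.tan(alpha / 2.0)
    c_x, c_y = 0.0, 0.0  # assumption: the kernel is centered on the optical axis

    K = np.array([[focal, 0.0, c_x],
                  [0.0, focal, c_y],
                  [0.0, 0.0, 1.0]])
    inv_K = np.linalg.inv(K)

    # Kernel sample positions on the tangent plane, centered at zero.
    xs = np.arange(k_w) + 0.5 - k_w / 2.0
    ys = np.arange(k_h) + 0.5 - k_h / 2.0
    x_grid, y_grid = np.meshgrid(xs, ys)  # each has shape (k_h, k_w)

    # Homogeneous coordinates (x, y, 1), back-projected through inv_K.
    grid = np.stack([x_grid, y_grid, np.ones_like(x_grid)], axis=0)
    rays = inv_K @ grid.reshape(3, k_h * k_w)

    # Normalizing puts every ray endpoint on the unit sphere.
    rays /= np.linalg.norm(rays, axis=0, keepdims=True)
    return rays.reshape(3, k_h, k_w)

rays = kernel_rays(pano_w=256, k_w=3, k_h=3)
```

In this sketch, `inv_K` turns each kernel position on the tangent plane into a direction from the sphere center, and the normalization moves the endpoint of that direction onto the unit sphere. For an odd-sized kernel, the central sample maps to the ray (0, 0, 1), the point where the plane touches the sphere.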

FocusK commented 4 years ago

Thank you! Your explanation made it clear.