My example code is not working, and the documentation is scarce. Can you help? Running the code below, I cannot obtain the expected image. I am also not sure where the camera intrinsics should enter in your formulation.
This is an extra feature that is not included in the original project, which is why it is not very well documented right now.
The way it is implemented right now is that P = K*H, where K is the intrinsic camera matrix and H is the projection matrix from world to camera coordinates. There are also some options for lens distortion coefficients, which default to 0.
I haven't tested all possible scenarios, so if you think there is a bug in the implementation, can you come up with a simple reproducible example where we can easily compute the ground truth (i.e. render a plane) to help me fix it?
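For reference, a minimal sketch of this composition might look as follows, assuming K is a 3x3 intrinsic matrix and [R | t] the world-to-camera extrinsics (the names and values here are illustrative, not the ones used in projection.py):

import numpy as np

# hypothetical intrinsics: focal lengths and principal point
fx, fy, cx, cy = 500., 500., 320., 240.
K = np.array([[fx, 0., cx],
              [0., fy, cy],
              [0., 0., 1.]])
# hypothetical extrinsics: rotation R and translation t (world -> camera)
R = np.eye(3)
t = np.array([[0.], [0.], [5.]])
# compose the 3x4 projection matrix P = K * [R | t]
P = K @ np.hstack([R, t])
# project a homogeneous world point and apply the perspective divide
X = np.array([0.5, -0.5, 0., 1.])
u, v, w = P @ X
print(u / w, v / w)  # pixel coordinates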
I think there is a bug, because I also tried multiplying P with the intrinsic camera matrix, and the results are wrong. I'll try to put together such an example in the next few days.
I've performed a simple test with a cube and it seems to be OK, but flipped (I didn't check the actual depth values, just the projection). However, it is not working for my *.obj.
import torch
import numpy as np
import neural_renderer as nr
import matplotlib.pyplot as plt
cuda0 = torch.device('cuda:0')
# 8 vertices of a cube centered at the origin
v = torch.tensor([[-1., -1., -1.],
                  [-1., -1.,  1.],
                  [-1.,  1., -1.],
                  [-1.,  1.,  1.],
                  [ 1., -1., -1.],
                  [ 1., -1.,  1.],
                  [ 1.,  1., -1.],
                  [ 1.,  1.,  1.]], device=cuda0)
# 12 triangular faces, two per cube side
f = torch.tensor([[0, 6, 4],
                  [0, 2, 6],
                  [0, 3, 2],
                  [0, 1, 3],
                  [2, 7, 6],
                  [2, 3, 7],
                  [4, 6, 7],
                  [4, 7, 5],
                  [0, 4, 5],
                  [0, 5, 1],
                  [1, 5, 7],
                  [1, 7, 3]], dtype=torch.int32, device=cuda0)
# 3x4 projection matrix (world -> pixel coordinates)
P = np.array([[ 6.237720e+02, -2.220966e+03, -9.922800e+01,  2.653734e+03],
              [ 1.730178e+03,  3.091560e+02, -1.270062e+03,  1.904076e+03],
              [ 7.797000e-01, -2.391000e-01,  5.787000e-01,  4.259900e+00]])
v = v * 0.5  # just because the original is way too large for this P
# manually project the vertices (in homogeneous coordinates) for comparison
vs = np.pad(v.data.cpu().numpy(), ((0, 0), (0, 1)), mode='constant', constant_values=1)
vs_px = np.matmul(vs, P.transpose())
# render the depth map with the neural renderer in projection mode
t_P = torch.FloatTensor(np.expand_dims(P, 0)).cuda()  # add a batch dimension
renderer = nr.Renderer(camera_mode='projection', P=t_P, image_size=1000)
im = renderer.render_depth(v[None, :, :], f[None, :, :])
# overlay the manually projected vertices (after the perspective divide)
plt.imshow(im.data.cpu().numpy()[0])
plt.scatter(vs_px[:, 0] / vs_px[:, 2], vs_px[:, 1] / vs_px[:, 2], s=200)
plt.show()
# the same overlay on the vertically flipped depth map, which matches better
a = im.data.cpu().numpy()[0]
plt.imshow(np.flip(a, axis=0))
plt.scatter(vs_px[:, 0] / vs_px[:, 2], vs_px[:, 1] / vs_px[:, 2], s=200)
plt.show()
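As an aside, the depth values could be sanity-checked as well. A rough sketch, assuming P is scaled so that the rotation part of its third row has unit norm (which holds for the P above), in which case the homogeneous scale w is each vertex's depth along the camera axis:

# expected depth of each vertex: the homogeneous scale w from the manual projection
w = vs_px[:, 2]
print(w)
# rendered depth at the (rounded) projected pixels, modulo the vertical flip noted above
u_px = np.round(vs_px[:, 0] / w).astype(int)
v_px = np.round(vs_px[:, 1] / w).astype(int)
print(a[v_px, u_px])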
Could you also provide the code along with your .obj file? I am quite busy these few days, but I can test everything later this week.
@nkolot the formula used in projection.py is obviously wrong: if K is the camera intrinsics, then P = K * H maps 3D points into pixel coordinates (e.g., in the range [0, 640]). You cannot apply distortion coefficients to coordinates in that space.
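For context, the standard pinhole model applies distortion to normalized camera coordinates, before the intrinsics. A minimal sketch of the usual radial model (illustrative names, not the actual projection.py code):

import numpy as np

def project_with_distortion(X_world, R, t, K, k1=0., k2=0.):
    # world -> camera coordinates
    X_cam = R @ X_world + t
    # perspective divide to normalized image coordinates
    x, y = X_cam[0] / X_cam[2], X_cam[1] / X_cam[2]
    # radial distortion acts here, on the normalized coordinates
    r2 = x * x + y * y
    d = 1. + k1 * r2 + k2 * r2 * r2
    x_d, y_d = d * x, d * y
    # only then do the intrinsics map to pixel coordinates
    return K[0, 0] * x_d + K[0, 2], K[1, 1] * y_d + K[1, 2]

With k1 = k2 = 0 this reduces to the plain P = K * [R | t] projection, which is why applying distortion after composing with K gives wrong results.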
@nkolot I am trying to fix this and will open a pull request once it is done.
Cool, thanks. I haven't had much time lately to look at the issues.
@liupeidong88 Just letting you know that I fixed this issue today. I need to take care of some additional things and I will merge the changes.
I merged a fix. @pmcrodrigues can you check if you are getting the expected results now?