bmild / nerf

Code release for NeRF (Neural Radiance Fields)
http://tancik.com/nerf
MIT License

about render_path_spiral and viewmatrix #82

Closed ai1361720220000 closed 3 years ago

ai1361720220000 commented 3 years ago

Hello, I'm a novice in this field. I am confused by the two functions below. They seem to transform between coordinate systems; could you give more details?

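import numpy as np

def normalize(x):
    # Helper defined alongside these functions; included so the snippet runs on its own.
    return x / np.linalg.norm(x)

def render_path_spiral(c2w, up, rads, focal, zdelta, zrate, rots, N):
    render_poses = []
    rads = np.array(list(rads) + [1.])  # append 1 so rads can scale a homogeneous point
    hwf = c2w[:,4:5]  # height/width/focal column
    for theta in np.linspace(0., 2. * np.pi * rots, N+1)[:-1]:
        # camera position on the spiral, mapped from the c2w camera frame to world coordinates
        c = np.dot(c2w[:3,:4], np.array([np.cos(theta), -np.sin(theta), -np.sin(theta*zrate), 1.]) * rads)
        # z axis: from the point `focal` in front of the c2w camera (along its -z) towards the new position
        z = normalize(c - np.dot(c2w[:3,:4], np.array([0,0,-focal, 1.])))
        render_poses.append(np.concatenate([viewmatrix(z, up, c), hwf], 1))
    return render_poses

def viewmatrix(z, up, pos):
    # Builds a 3x4 matrix whose columns are the x, y, z axes and the position.
    vec2 = normalize(z)
    vec1_avg = up
    vec0 = normalize(np.cross(vec1_avg, vec2))  # np.cross: cross product
    vec1 = normalize(np.cross(vec2, vec0))
    m = np.stack([vec0, vec1, vec2, pos], 1)
    return m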

I hope you can help me. Thanks!

kwea123 commented 3 years ago

#24 might help with the second question.

The first question is somewhat complicated to explain. You can look at my implementation, where I have added comments:
the first function: https://github.com/kwea123/nerf_pl/blob/master/datasets/llff.py#L83
the second function: https://github.com/kwea123/nerf_pl/blob/master/datasets/llff.py#L17

This is my personal understanding, but I'm fairly confident in it.

ai1361720220000 commented 3 years ago

Thanks, I will read your implementation carefully. I would also like to know the meaning of each column in poses, which is loaded from 'poses_bounds.npy' (generated with COLMAP).

    poses_arr = np.load(os.path.join(basedir, 'poses_bounds.npy'))
    poses = poses_arr[:, :-2].reshape([-1, 3, 5]).transpose([1,2,0])
ai1361720220000 commented 3 years ago

I have read your average_poses(), but it differs slightly from the original project: the author sums vec2 and up instead of averaging them.

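def poses_avg(poses):
    hwf = poses[0, :3, -1:]  # height/width/focal column, taken from the first pose
    center = poses[:, :3, 3].mean(0)  # mean camera position
    vec2 = normalize(poses[:, :3, 2].sum(0))  # summed z axes, normalized to unit length
    up = poses[:, :3, 1].sum(0)  # summed y axes; viewmatrix normalizes the axes derived from it
    c2w = np.concatenate([viewmatrix(vec2, up, center), hwf], 1)
    return c2w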
ai1361720220000 commented 3 years ago

Another question: why does pose_avg_homo have to be inverted? poses_centered = np.linalg.inv(pose_avg_homo) @ poses_homo

kwea123 commented 3 years ago

The definition of poses_bounds.npy is here. In short, for each image it contains the 3x4 camera-to-world matrix, the 3x1 hwf (height, width, and focal length of the camera), and the 2x1 scene bounds (nearest and farthest points).
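As a rough sketch of that layout, the snippet below builds a synthetic array shaped like poses_bounds.npy (one row of 17 values per image) and splits it into its parts. The values and the names split_poses_bounds and fake_poses_bounds are made up for illustration; only the 15 + 2 split follows the description above. Unlike the loader quoted earlier, it keeps the image index first instead of transposing.

```python
import numpy as np

def split_poses_bounds(poses_arr):
    """Split an (N, 17) array into (N, 3, 4) poses, (N, 3) hwf and (N, 2) bounds."""
    mats = poses_arr[:, :-2].reshape(-1, 3, 5)  # first 15 values: a flattened 3x5 block
    c2w = mats[:, :, :4]         # 3x4 camera-to-world matrix
    hwf = mats[:, :, 4]          # last column: height, width, focal
    bounds = poses_arr[:, -2:]   # last 2 values: nearest and farthest bound
    return c2w, hwf, bounds

# Synthetic stand-in for np.load('poses_bounds.npy') with 2 images (hypothetical values).
block = np.concatenate(
    [np.eye(3), np.zeros((3, 1)), np.array([[378.], [504.], [400.]])], axis=1
)  # identity rotation, zero translation, hwf column
row = np.concatenate([block.ravel(), [1.0, 10.0]])
fake_poses_bounds = np.tile(row, (2, 1))

c2w, hwf, bounds = split_poses_bounds(fake_poses_bounds)
print(c2w.shape, hwf.shape, bounds.shape)
```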

Sum and average give the same result here because the vectors are normalized afterwards.
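A quick numerical check of that claim, as a sketch with random stand-in axes (z_axes and y_axes are not real camera data): the sum is just the mean scaled by the number of poses, and normalization removes the scale. The same holds for up, because the cross product is linear and its result is normalized inside viewmatrix.

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

rng = np.random.default_rng(0)
# Stand-in camera axes: 5 vectors scattered around +z and +y (hypothetical data).
z_axes = rng.normal(size=(5, 3)) + np.array([0., 0., 5.])
y_axes = rng.normal(size=(5, 3)) + np.array([0., 5., 0.])

z_from_sum = normalize(z_axes.sum(0))
z_from_mean = normalize(z_axes.mean(0))

# `up` is only used through a normalized cross product, so its scale does not matter either.
x_from_sum = normalize(np.cross(y_axes.sum(0), z_from_sum))
x_from_mean = normalize(np.cross(y_axes.mean(0), z_from_mean))

print(np.allclose(z_from_sum, z_from_mean), np.allclose(x_from_sum, x_from_mean))
```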

Creating the homogeneous (4x4) version just simplifies the computation; otherwise you would need to extract R and t and do many careful multiplications.
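To illustrate, here is a small sketch with two made-up poses (R, t and the helper to_homogeneous are hypothetical, not taken from either repository). The average pose maps average-camera coordinates to world coordinates, so its inverse maps world coordinates back into the average camera's frame. Multiplying every pose by that inverse re-expresses it relative to the average camera, and the average pose itself becomes the identity. The last lines show the equivalent computation with R and t handled by hand.

```python
import numpy as np

def to_homogeneous(pose_3x4):
    """Append the row [0, 0, 0, 1] to a 3x4 pose to get a 4x4 matrix."""
    return np.concatenate([pose_3x4, np.array([[0., 0., 0., 1.]])], axis=0)

# Hypothetical average pose: 90 degree rotation about z, translation (1, 2, 3).
R = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 1.]])
t = np.array([[1.], [2.], [3.]])
pose_avg_homo = to_homogeneous(np.concatenate([R, t], axis=1))

# A second camera: same rotation, shifted by 1 along the world x axis.
t_other = t + np.array([[1.], [0.], [0.]])
other_homo = to_homogeneous(np.concatenate([R, t_other], axis=1))

poses_homo = np.stack([pose_avg_homo, other_homo])           # shape (2, 4, 4)
poses_centered = np.linalg.inv(pose_avg_homo) @ poses_homo   # (4, 4) broadcast over (2, 4, 4)

# The average pose becomes the identity: it is now the origin of the new frame.
print(np.allclose(poses_centered[0], np.eye(4)))

# Same result without the 4x4 inverse, using inv([R|t]) = [R^T | -R^T t]:
manual_R = R.T @ R               # relative rotation
manual_t = R.T @ (t_other - t)   # relative translation
print(np.allclose(poses_centered[1, :3, :3], manual_R),
      np.allclose(poses_centered[1, :3, 3:], manual_t))
```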

ai1361720220000 commented 3 years ago

Thanks for your reply. I have found the answers in the LLFF project.

LZL-CS commented 1 year ago

Hi, I have the same question about poses_centered = np.linalg.inv(pose_avg_homo) @ poses_homo. Do you know how poses_centered is derived? Thanks in advance!