nex-mpi / nex-code

Code release for NeX: Real-time View Synthesis with Neural Basis Expansion
MIT License

Enquiry related to the normalization #14

Open derrick-xwp opened 3 years ago

derrick-xwp commented 3 years ago

Dear author,

Reading your paper was enriching, if somewhat overwhelming.

I have some confusion about your code and would appreciate it if you could explain.

In poses_avg, the code computes the new center of the world and a normalized vector, vec2.

Then, in the viewmatrix function, z (vec2) is normalized a second time, and vec0 and vec1 are each computed as a cross product and then normalized.

Why are these cross products necessary? Could we simply normalize the vectors without taking cross products?

Thanks

def viewmatrix(z, up, pos):
    vec2 = normalize(z)
    vec1_avg = up
    vec0 = normalize(np.cross(vec1_avg, vec2))
    vec1 = normalize(np.cross(vec2, vec0))
    m = np.stack([vec0, vec1, vec2, pos], 1)
    return m

def poses_avg(poses):
    # poses [images, 3, 4] not [images, 3, 5]
    # hwf = poses[0, :3, -1:],
    center = poses[:, :3, 3].mean(0)
    vec2 = normalize(poses[:, :3, 2].sum(0))
    up = poses[:, :3, 1].sum(0)
    c2w = np.concatenate([viewmatrix(vec2, up, center)], 1)

derrick-xwp commented 3 years ago

Hi,

I see that this part of the code calculates the new w2c matrix based on the new world coordinate system (at the center of the scene).

But what geometric principles are used here?

Thanks
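
One possible reading of the geometry, written out as a sketch. It assumes that each pose is a 3x4 matrix whose columns 1, 2 and 3 hold an "up" axis $y_i$, a viewing axis $z_i$ and the camera position $t_i$, which is how the quoted poses_avg indexes them. The symbols $c$, $u$, $x$, $y$, $z$ are introduced here for illustration and do not appear in the code.

```latex
\begin{aligned}
c &= \frac{1}{N}\sum_{i=1}^{N} t_i
  && \text{(mean camera position, \texttt{center})} \\
z &= \frac{\sum_i z_i}{\left\lVert \sum_i z_i \right\rVert}
  && \text{(\texttt{vec2})} \\
u &= \sum_i y_i
  && \text{(unnormalized up hint, \texttt{up})} \\
x &= \frac{u \times z}{\lVert u \times z \rVert}
  && \text{(\texttt{vec0}, perpendicular to both } u \text{ and } z\text{)} \\
y &= z \times x
  && \text{(\texttt{vec1}, already unit length since } z \perp x\text{)} \\
\text{c2w} &= \begin{bmatrix} x & y & z & c \end{bmatrix}
\end{aligned}
```

Under this reading, $u$ only serves as a hint: the summed axes $u$ and $z$ are generally not perpendicular, so $y$ is rebuilt from $z$ and $x$ to make the three axes mutually orthogonal. This is a Gram-Schmidt style orthonormalization carried out with cross products.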

pureexe commented 3 years ago

The view matrix code is taken from LLFF, so I'm not sure whether the cross products are necessary.

Here is the original code: https://github.com/Fyusion/LLFF/blob/c6e27b1ee59cb18f054ccb0f87a90214dbe70482/llff/math/pose_math.py#L14

As for "calculating the new w2c matrix based on the new world coordinate system", can you point me to the code? I'm not sure which part you are referring to.
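
For anyone wanting to check the cross-product question numerically, here is a minimal sketch. The `normalize` helper is assumed (it is not shown in this thread), and the two summed axes are made-up values chosen only to be non-perpendicular. It compares normalizing the vectors alone with the cross-product construction from viewmatrix.

```python
import numpy as np

def normalize(v):
    # Assumed helper (not shown in the thread): scale a vector to unit length.
    return v / np.linalg.norm(v)

def viewmatrix(z, up, pos):
    # Same construction as the snippet quoted in the question.
    vec2 = normalize(z)
    vec0 = normalize(np.cross(up, vec2))
    vec1 = normalize(np.cross(vec2, vec0))
    return np.stack([vec0, vec1, vec2, pos], 1)

# Hypothetical summed axes; the "up" sum is not perpendicular to the "z" sum.
z_sum = np.array([0.1, 0.2, 1.0])
up_sum = np.array([0.0, 1.0, 0.3])

# Normalizing only: both vectors have unit length but are not orthogonal.
dot_normalize_only = float(np.dot(normalize(z_sum), normalize(up_sum)))

# Cross-product construction: the 3x3 part is a proper rotation matrix.
R = viewmatrix(z_sum, up_sum, np.zeros(3))[:, :3]
is_orthonormal = bool(np.allclose(R.T @ R, np.eye(3)))
det_R = float(np.linalg.det(R))

print(dot_normalize_only, is_orthonormal, det_R)
```

With normalization alone the dot product between the two axes stays clearly non-zero, so the 3x3 block would not be a valid rotation. The cross products yield orthonormal axes with determinant +1, which is presumably why the LLFF code is written this way.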

derrick-xwp commented 3 years ago

How much time will the training take from scratch?

pureexe commented 3 years ago

It depends on the number of input images.

Fern (17 training images) takes about 18 hours while CD (265 training images) takes about 4 days.