Chuan-10 opened 1 year ago
Hi, to support unbounded 360 scenes, we need to warp the world coordinates to a bounded range. One example is the contraction operation from MipNerf 360:
# contraction from mipnerf implementation
import torch

def contract(x):
    """
    Contracts points towards the origin (Eq 10 of arxiv.org/abs/2111.12077).
    Args:
        x: A tensor of shape [N, 3].
    """
    eps = torch.tensor(1e-8)
    # Clamping to eps prevents non-finite gradients when x == 0.
    x_mag_sq = torch.maximum(eps, torch.sum(x**2, dim=-1, keepdim=True))  # [N, 1]
    z = torch.where(x_mag_sq <= 1, x, ((2 * torch.sqrt(x_mag_sq) - 1) / x_mag_sq) * x)  # [N, 3]
    return z
which maps the world coordinates to [-2, 2].
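For a quick sanity check of the contraction (illustrative values of my own, not repo code): points inside the unit sphere pass through unchanged, while distant points are pulled towards a radius of 2:

import torch

pts = torch.tensor([
    [0.3, 0.0, 0.0],   # inside the unit sphere -> unchanged
    [10.0, 0.0, 0.0],  # far away -> pulled close to radius 2
    [0.0, 0.0, 0.0],   # origin -> the eps clamp keeps gradients finite
])
print(contract(pts))
# approximately:
# [[0.3, 0.0, 0.0],
#  [1.9, 0.0, 0.0],
#  [0.0, 0.0, 0.0]]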
The sampling of the world coordinates can be implemented as:
def sample_ray_contracted(self, rays_o, rays_d, is_train=True, N_samples=-1):
    '''
    We do not perform contraction here.
    '''
    N_samples = N_samples if N_samples > 0 else self.nSamples
    near, far = self.near_far
    inner_N_samples = N_samples - N_samples // 2
    outer_N_samples = N_samples // 2
    # inner samples: linear in depth on [near, 2.0]
    interpx_inner = (
        torch.linspace(near, 2.0, inner_N_samples + 1).unsqueeze(0).to(rays_o)
    )
    if is_train:
        interpx_inner[:, :-1] += (
            torch.rand_like(interpx_inner).to(rays_o)
            * ((2.0 - near) / inner_N_samples)
        )[:, :-1]
    interpx_inner = (interpx_inner[:, 1:] + interpx_inner[:, :-1]) * 0.5
    # outer samples: linear in inverse depth on [2.0, far]
    rng = torch.arange(outer_N_samples + 1)[None].float()
    if is_train:
        rng[:, :-1] += (torch.rand_like(rng).to(rng))[:, :-1]
    rng = torch.flip(rng, [1])
    rng = (rng[:, 1:] + rng[:, :-1]) * 0.5
    interpx_outer = 1.0 / (
        1 / far + (1 / 2.0 - 1 / far) * rng / outer_N_samples
    ).to(rays_o.device)
    interpx = torch.cat((interpx_inner, interpx_outer), -1)
    rays_pts = rays_o[..., None, :] + rays_d[..., None, :] * interpx[..., None]
    mask_outbbox = torch.zeros_like(rays_pts[..., 0]) > 0  # every ray is valid
    return rays_pts, interpx, ~mask_outbbox
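As a rough illustration of the spacing (a standalone sketch with made-up near/far values, not repo code): the inner half of the samples is linear in depth on [near, 2.0], while the outer half is linear in inverse depth on [2.0, far], so samples become sparser towards the far plane:

import torch

near, far = 0.1, 100.0
outer_N_samples = 4
rng = torch.arange(outer_N_samples + 1)[None].float()
rng = torch.flip(rng, [1])                       # [[4, 3, 2, 1, 0]]
rng = (rng[:, 1:] + rng[:, :-1]) * 0.5           # midpoints [[3.5, 2.5, 1.5, 0.5]]
depths = 1.0 / (1 / far + (1 / 2.0 - 1 / far) * rng / outer_N_samples)
print(depths)  # roughly [[2.28, 3.16, 5.16, 14.04]], increasing from ~2 towards far

Note that this assumes far > 2.0; if the far value from the dataset is smaller, the concatenated depths are no longer monotonic and the resulting dists become negative.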
I haven't tested the performance on unbounded scenes yet, but I think one problem is the complex background, which may not have a suitable text prompt to describe it.
Thank you for the instructions! I will try it later and report back. Besides, I have a small question: how can I manually annotate the segments of test images? Are there any tools or methods? Since this is an open-vocabulary problem, how should I choose the right prompts for segmentation? Thank you for your time!
Hi, sorry to bother you. I want to ask how to render test images in LERF; I noticed that you compared against the LERF results in your paper. Can you give some advice?
Hi, I have tried the code, but I didn't get the right results; the train_psnr and mse were always nan. Can you give some more detailed instructions? For now I just modified it like this, and the dataset is 360v2.
# In class TensorBase
def forward(self, rays_chunk, white_bg=True, is_train=False, ndc_ray=False, N_samples=-1):
    # sample points
    viewdirs = rays_chunk[:, 3:6]
    if ndc_ray:
        if self.is_360:
            xyz_sampled, z_vals, ray_valid = self.sample_ray_contracted(rays_chunk[:, :3], viewdirs, is_train=is_train, N_samples=N_samples)
            xyz_sampled = self.contract(xyz_sampled)
        else:
            xyz_sampled, z_vals, ray_valid = self.sample_ray_ndc(rays_chunk[:, :3], viewdirs, is_train=is_train, N_samples=N_samples)
        dists = torch.cat((z_vals[:, 1:] - z_vals[:, :-1], torch.zeros_like(z_vals[:, :1])), dim=-1)
        rays_norm = torch.norm(viewdirs, dim=-1, keepdim=True)
        dists = dists * rays_norm
        viewdirs = viewdirs / rays_norm
    else:
        xyz_sampled, z_vals, ray_valid = self.sample_ray(rays_chunk[:, :3], viewdirs, is_train=is_train, N_samples=N_samples)
        dists = torch.cat((z_vals[:, 1:] - z_vals[:, :-1], torch.zeros_like(z_vals[:, :1])), dim=-1)
    viewdirs = viewdirs.view(-1, 1, 3).expand(xyz_sampled.shape)
    ....
    if ray_valid.any():
        if self.is_360:
            xyz_sampled = (xyz_sampled + 2) / 4
        else:
            xyz_sampled = self.normalize_coord(xyz_sampled)
        sigma_feature = self.compute_densityfeature(xyz_sampled[ray_valid])
        validsigma = self.feature2density(sigma_feature)
        sigma[ray_valid] = validsigma
Hi, I think you should also use self.normalize_coord(xyz_sampled) when using 360 datasets. The nan may be due to the sampling coordinates being out of [-1, 1].
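To make that concrete, here is a rough sketch (my assumption of a TensoRF-style normalize_coord, not the exact repo code) with the aabb set to cover the contracted range [-2, 2]^3 so the normalized coordinates land in [-1, 1]:

import torch

# assumed TensoRF-style normalization; the real method lives on the model and uses self.aabb
aabb = torch.tensor([[-2.0, -2.0, -2.0], [2.0, 2.0, 2.0]])  # contracted space covers [-2, 2]^3
invaabbSize = 2.0 / (aabb[1] - aabb[0])

def normalize_coord(xyz_sampled):
    # maps [-2, 2] -> [-1, 1]; points outside the aabb leave [-1, 1] and can cause nan
    return (xyz_sampled - aabb[0]) * invaabbSize - 1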
For the segmentation annotation tools, you can refer to this link. For the prompt engineering, you can take a look at this section.
Thank you very much!
I have followed your advice to use self.normalize_coord(xyz_sampled), but I got nan again.
My dataset is 360 v2, and I directly use the poses_bounds.npy in it; I used the llff format to read the data. I found that with the 360 dataset, rays_norm and dists become very large, around 1k. And I got the nan value from the first run of alpha, weight, bg_weight = raw2alpha(sigma, dists * self.distance_scale), specifically at the step T = torch.cumprod(torch.cat([torch.ones(alpha.shape[0], 1).to(alpha.device), 1. - alpha + 1e-10], -1), -1) in the function raw2alpha.
I found that there were negative values in dists, so there were negative values in alpha too, which might make T have nan values.
Could you give me some instructions?
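For reference, this is roughly the kind of check that could be added before raw2alpha to confirm the problem (a hedged sketch reusing the variable names from the snippets above, not code from the repository):

# hypothetical debugging checks, not from the repository
assert torch.isfinite(z_vals).all(), "non-finite sample depths"
# depths along each ray should be monotonically increasing; otherwise dists go negative
decreasing = (z_vals[:, 1:] < z_vals[:, :-1]).any(dim=-1)
print("rays with decreasing depths:", decreasing.sum().item())
print("dists range:", dists.min().item(), dists.max().item())
# negative dists can appear if far <= 2.0 (the outer samples then fall below the
# inner ones), or if the near/far read from poses_bounds.npy are not in the scale
# the contracted sampler expects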
Thank you for the excellent work! I noticed that support for unbounded 360 scenes is in the TODOs; when will it be finished? If the support won't come soon, could you give some tips on how to implement it? Some code snippets would be even better.