Closed Hymwgk closed 3 months ago
I_s(ξ) is the Shannon entropy. In our case, we intuitively understand it as the amount of information that is still unknown about the voxels that would be visible from a viewpoint, and hence it is the information that could be gained from the viewpoint if the camera was moved there.
ray_points_nor = self.normalize_3d_coordinate(ray_points)
ray_points_nor = ray_points_nor.view(1, -1, 1, 1, 3).repeat(3, 1, 1, 1, 1)
# Sample the ROI values, occupancy probabilities, and semantic confidences along each ray
grid = self.voxel_grid[None, ..., 0:3].permute(4, 0, 1, 2, 3)
occ_sem_confs = F.grid_sample(grid, ray_points_nor, align_corners=True)
occ_sem_confs = occ_sem_confs.view(3, -1, self.num_pts_per_ray)
occ_sem_confs = occ_sem_confs.clamp(self.eps, 1.0 - self.eps)
# Weight the entropy of the semantic confidences by transmittance and ROI value along each ray
opacities = torch.sigmoid(1e7 * (occ_sem_confs[1, ...] - 0.51))
transmittance = self.shifted_cumprod(1.0 - opacities)
ray_gains = (
    transmittance * self.entropy(occ_sem_confs[2, ...]) * occ_sem_confs[0, ...]
)
The above lines interpolate the ROI values at the sampled points and then multiply them with the expected gain values. This should effectively zero out the gain at every point outside the ROI. However, I have not tested how effective these changes are, so please check.
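If it helps with checking, here is a minimal, dependency-free sketch of the same computation for a single ray. It is not the repository code: `binary_entropy`, `shifted_cumprod`, and `ray_gains` are hypothetical stand-ins, and it assumes that `self.entropy` is the binary Shannon entropy in nats and that the steep sigmoid can be approximated by a hard threshold at 0.51.

```python
import math

def binary_entropy(p, eps=1e-6):
    """Shannon entropy (nats) of a Bernoulli variable; assumed form of self.entropy."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)

def shifted_cumprod(values):
    """Exclusive cumulative product: out[i] = values[0] * ... * values[i-1], out[0] = 1."""
    out, running = [], 1.0
    for v in values:
        out.append(running)
        running *= v
    return out

def ray_gains(roi, occupancy, sem_conf, threshold=0.51):
    """Per-sample gain along one ray: transmittance * entropy * ROI value."""
    # Hard step instead of sigmoid(1e7 * (occ - 0.51)); a simplification.
    opacities = [1.0 if occ > threshold else 0.0 for occ in occupancy]
    transmittance = shifted_cumprod([1.0 - o for o in opacities])
    return [t * binary_entropy(s) * r
            for t, s, r in zip(transmittance, sem_conf, roi)]

# Four samples along one ray: the first lies outside the ROI,
# the third is occupied and blocks the fourth.
roi = [0.0, 1.0, 1.0, 1.0]
occupancy = [0.1, 0.2, 0.9, 0.3]
sem_conf = [0.5, 0.5, 0.5, 0.5]
gains = ray_gains(roi, occupancy, sem_conf)
```

In this toy case the first sample contributes nothing because its ROI value is zero, and the last contributes nothing because it sits behind an occupied sample. The two samples in between each contribute the full entropy of a 0.5 confidence.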
Thanks again : )
I have some questions regarding the description of "expected semantic information gain" in the paper:
Thank you in advance for your help.