nv-tlabs / lift-splat-shoot

Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3D (ECCV 2020)

voxel_pooling? final[geom_feats[:, 3], :, geom_feats[:, 2], geom_feats[:, 0], geom_feats[:, 1]] = x #29

Open liwuhen opened 2 years ago

liwuhen commented 2 years ago

final = torch.zeros((B, C, self.nx[2], self.nx[0], self.nx[1]), device=x.device)
final[geom_feats[:, 3], :, geom_feats[:, 2], geom_feats[:, 0], geom_feats[:, 1]] = x

Could you please explain why final is initialized to zeros? I'm confused by this assignment.

manueldiaz96 commented 2 years ago

Because you want an empty grid onto which to project the camera features. Creating it with zeros gives you the full grid structure up front; the assignment then fills in only the cells for which the cameras provide features.

If you look at where the projection rays point (see the right-most figure), you will see that not every cell in the grid is reached by a ray: with a low-resolution feature map, the geometric projection of the pixels covers only part of the grid. Cells that no ray reaches keep their initial value of zero.
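To make the assignment concrete, here is a minimal sketch on a toy grid. The sizes, index values, and feature values are made up for illustration and are not the repository's configuration. It assumes geom_feats holds one row per voxel in the order (x, y, z, batch), which is what the indexing in the question implies, and that each cell appears at most once in geom_feats.

```python
import torch

# Hypothetical toy sizes, chosen only for illustration.
B, C = 1, 2            # batch size, feature channels
nx = (4, 3, 1)         # number of grid cells along X, Y, Z

# One row per voxel: (x index, y index, z index, batch index).
# Assumption: every (x, y, z, batch) combination is unique here.
geom_feats = torch.tensor([
    [0, 0, 0, 0],
    [2, 1, 0, 0],
    [3, 2, 0, 0],
])

# One C-dimensional feature vector per row of geom_feats.
x = torch.tensor([
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
])

# Start from an all-zero grid so that cells no ray reaches stay at zero.
final = torch.zeros((B, C, nx[2], nx[0], nx[1]))

# Scatter: each feature row is written to the cell named by its indices.
final[geom_feats[:, 3], :, geom_feats[:, 2], geom_feats[:, 0], geom_feats[:, 1]] = x

# Count the cells that received a feature; all other cells are still zero.
occupied = int((final.abs().sum(dim=1) > 0).sum())
total = B * nx[2] * nx[0] * nx[1]
print(f"{occupied} of {total} cells filled")
```

In this toy case only three of the twelve cells receive features, and the rest keep the zeros from torch.zeros. If the same cell index appeared in several rows, plain index assignment would not add the features together, so any duplicates would need to be combined before this step.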

liwuhen commented 2 years ago

Thank you for your work!