TRAILab / CaDDN

Categorical Depth Distribution Network for Monocular 3D Object Detection (CVPR 2021 Oral)
Apache License 2.0

How do you visualize the BEV feature as in figure 1 of your paper? #100

Closed Jensen-Su closed 2 years ago

Jensen-Su commented 2 years ago

I am building on your method, but so far it is not working as expected. To debug, I would like to visually check the step that converts img_feat to bev_feat. How did you visualize bev_feat as in figure 1 of your paper?

xuyanging commented 2 years ago

Rewrite the forward method of Conv2DCollapse as follows:

    def forward(self, batch_dict):
        """
        Collapses voxel features to BEV via concatenation and channel reduction
        Args:
            batch_dict:
                voxel_features [torch.Tensor(B, C, Z, Y, X)]: Voxel feature representation
        Returns:
            batch_dict:
                spatial_features [torch.Tensor(B, C, Y, X)]: BEV feature representation
        """
        voxel_features = batch_dict["voxel_features"]  # [bs, 64, 25, 376, 280]
        bev_features = voxel_features.flatten(start_dim=1, end_dim=2)  # (B, C, Z, Y, X) -> (B, C*Z, Y, X)
        bev_features = self.block(bev_features)  # (B, C*Z, Y, X) -> (B, C, Y, X)

        import cv2
        # Save every channel of the first sample in the batch as a grayscale image
        for i in range(bev_features[0].shape[0]):
            bev = bev_features[0][i].cpu().detach().numpy()
            cv2.imwrite('test' + str(i) + '.jpg', bev * 255 / bev.max())

        batch_dict["spatial_features"] = bev_features
        return batch_dict
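Scaling by `bev.max()` in the snippet above assumes the channel has non-negative values and a nonzero maximum. A channel that is all zeros would produce NaNs. Below is a minimal sketch of a more defensive per-channel min-max scaling, using NumPy only. The helper name `to_uint8`, the `eps` threshold, and the random stand-in array are assumptions for illustration and are not part of CaDDN.

```python
import numpy as np


def to_uint8(channel: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Min-max scale one 2D feature channel to the 0..255 range."""
    channel = channel.astype(np.float32)
    lo, hi = channel.min(), channel.max()
    if hi - lo < eps:
        # Constant channel: nothing to show, return a black image
        return np.zeros(channel.shape, dtype=np.uint8)
    scaled = (channel - lo) / (hi - lo)  # now in [0, 1]
    return (scaled * 255.0).round().astype(np.uint8)


# Stand-in for bev_features[0].detach().cpu().numpy(), shape (C, Y, X)
rng = np.random.default_rng(0)
bev = rng.standard_normal((4, 6, 5)).astype(np.float32)

# One uint8 image per channel, ready to pass to an image writer
images = [to_uint8(bev[i]) for i in range(bev.shape[0])]
```

Each resulting array could then be written with `cv2.imwrite` as in the snippet above. Because the scaling is per channel, brightness is not comparable across channels.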

Jensen-Su commented 2 years ago

So you mean simply visualizing each channel of the collapsed feature map (B, C, Y, X)?

xuyanging commented 2 years ago

Yes, that is the BEV feature you referred to in your question.
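If one image per channel is too many to inspect, the channels can also be reduced to a single map. The sketch below takes the mean absolute activation over the channel axis. This is only one possible reduction, chosen here as an assumption. Nothing in this thread says figure 1 of the paper was produced this way, and the name `bev_summary` is hypothetical.

```python
import numpy as np


def bev_summary(bev: np.ndarray) -> np.ndarray:
    """Collapse a (C, Y, X) feature map into one (Y, X) uint8 image.

    Uses the mean absolute activation over channels, scaled so that
    the strongest location maps to 255.
    """
    energy = np.abs(bev.astype(np.float32)).mean(axis=0)  # (Y, X)
    peak = energy.max()
    if peak <= 0:
        # All activations are zero: return a black image
        return np.zeros(energy.shape, dtype=np.uint8)
    return (energy / peak * 255.0).round().astype(np.uint8)


# Stand-in for bev_features[0].detach().cpu().numpy(), shape (C, Y, X)
bev = np.zeros((3, 4, 5), dtype=np.float32)
bev[:, 1, 2] = 2.0  # a single active BEV cell
summary = bev_summary(bev)
```

The result is a single grayscale array that can be saved the same way as the per-channel images.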