Open Galaxy-ZRX opened 2 years ago
These two dimensions are permuted in https://github.com/XuyangBai/TransFusion/blob/399bda09a3b6449313ccc302df40651f77ec78bf/mmdet3d/ops/voxel/voxelize.py#L95-L105
You should change the config accordingly, i.e.:
```python
pts_middle_encoder=dict(
    type='SparseEncoder',
    in_channels=5,
    sparse_shape=[1, 704, 800],
    output_channels=128,
    order=('conv', 'norm', 'act'),
    encoder_channels=((16, 16, 32), (32, 32, 64), (64, 64, 128), (128, 128)),
    encoder_paddings=((0, 0, 1), (0, 0, 1), (0, 0, [0, 1, 1]), (0, 0)),
    block_type='basicblock'),
```
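For context, the `sparse_shape` entries follow the (permuted) grid order produced by the voxelizer. Here is a minimal sketch (a hypothetical helper, not the mmdet3d code) of how the grid size falls out of the point-cloud range and voxel size:

```python
# Hypothetical helper, not the mmdet3d implementation: derive the voxel
# grid size from the point-cloud range and voxel size.
def grid_size(pc_range, voxel_size):
    # pc_range: [x_min, y_min, z_min, x_max, y_max, z_max]
    nx = round((pc_range[3] - pc_range[0]) / voxel_size[0])
    ny = round((pc_range[4] - pc_range[1]) / voxel_size[1])
    nz = round((pc_range[5] - pc_range[2]) / voxel_size[2])
    return nx, ny, nz

# Example with a KITTI-like range and 0.1 m voxels (assumed values):
nx, ny, nz = grid_size([0, -40, -3, 70.4, 40, 1], [0.1, 0.1, 4.0])
# nx = 704, ny = 800, nz = 1; with the x/y dimensions permuted as in
# voxelize.py, this matches sparse_shape=[1, 704, 800] above.
```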
Hi Xuyang, first of all, thanks for your work on TransFusion! I am trying to train this model on the KITTI dataset. I noticed that when you calculate the heatmap loss, the ground-truth heatmap is obtained via https://github.com/XuyangBai/TransFusion/blob/53370467c1b88f163cbe7b7300a1f588a6761e35/mmdet3d/models/dense_heads/transfusion_head.py#L1192
As you can see, the gt_heatmap is a rotated (transposed) version of the original feature map; could you please tell me why this rotation is used? When I train the model on KITTI with the point cloud range set to [0, -40, -3.0, 70.0, 40, 1.0], the predicted heatmap has a size of 1x1x200x176, but the gt_heatmap is 1x1x176x200, so the rotation makes them mismatch. For nuScenes and Waymo the heatmap is square, so this is not a problem, but I can't understand the issue in the KITTI case and don't know how to solve it.
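To make the mismatch concrete, a small NumPy sketch using the shapes reported above (the shapes are from this issue; the transpose shown here is an illustration of one possible alignment, not necessarily the author's intended fix):

```python
import numpy as np

# Shapes reported in this issue for the KITTI setup:
# the head predicts a BEV heatmap of [B, C, H, W] = [1, 1, 200, 176]
# (H along y, W along x), while the gt heatmap is drawn on a
# [176, 200] canvas (x-major). A transpose brings them into agreement.
pred = np.zeros((1, 1, 200, 176), dtype=np.float32)
gt = np.zeros((176, 200), dtype=np.float32)   # x-major gt canvas
gt_aligned = gt.T[None, None]                 # transpose, add B and C dims
assert gt_aligned.shape == pred.shape         # now [1, 1, 200, 176]
```

For nuScenes and Waymo the two spatial dimensions are equal, which is why the mismatch never surfaces there.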
Could you please give me some advice? Thank you very much. I am currently stuck on this problem T-T and looking forward to your reply!