Hi all, it seems there is a bug in PointPillars when training on multiple GPUs. I think the problem is the batch index in the voxel coordinates. Say there are 4 samples in a batch, [0, 1, 2, 3], and two GPUs. When the batch is split, the first GPU gets [0, 1] and the second gets [2, 3]. This is fine for the first GPU, but PointPillarsScatter always counts batch indices from 0, so the data on the second GPU is treated as an empty point cloud. A temporary remedy I use now is to add the line

`coors[:, 0] -= coors[:, 0].min()`

after line 370 in voxelnet.py.
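To make the failure mode concrete, here is a toy sketch (NumPy instead of the actual tensors in voxelnet.py; the coordinate values are made up) showing why the shard on the second GPU looks empty to a scatter that indexes canvases from 0, and what the rebase does:

```python
import numpy as np

# Toy voxel coordinates: column 0 is the batch index, the rest are
# spatial coordinates. Assume a global batch of 4 split across 2 GPUs,
# so the shard on the second GPU carries batch indices 2 and 3.
coors = np.array([
    [2, 0, 5],  # voxel belonging to sample 2
    [2, 1, 7],
    [3, 4, 2],  # voxel belonging to sample 3
])

# A scatter that allocates canvases for local indices 0..1 would match
# nothing here: no row has batch index 0 or 1, so both canvases stay
# empty. Rebasing makes each shard's indices start from 0 again:
coors[:, 0] -= coors[:, 0].min()

print(coors[:, 0].tolist())  # -> [0, 0, 1]
```

Note this assumes each GPU's shard holds a contiguous range of batch indices and that at least one voxel from the shard's first sample is present; otherwise subtracting the minimum would not align indices correctly.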