mit-han-lab / spvnas

[ECCV 2020] Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution
http://spvnas.mit.edu/
MIT License

About training on other datasets #97

Closed: Talnex closed this issue 2 years ago

Talnex commented 2 years ago

Thanks for the great work!

I got poor performance on the Semantic3D dataset, which has dense point clouds with RGB information. I made the following changes; any advice is greatly appreciated!

  1. I changed the dimension of the input SparseTensor like this:
    pc = torch.IntTensor(1000, 4)      # coordinates: (x, y, z, batch)
    feat = torch.FloatTensor(1000, 6)  # features: (x, y, z, r, g, b)
    lidar = SparseTensor(feat, pc)
  2. I changed the stem input dimension from 4 to 6

    https://github.com/mit-han-lab/spvnas/blob/69750e900d8687ac9fcc8e042b171cd1f6beffa1/core/models/semantic_kitti/spvcnn.py#L95

  3. I increased the number of input points and changed grid_size.
  4. I downsample the dataset and take a small area for training each time.
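For reference, steps 1 and 3 could be sketched roughly as below. This is a minimal NumPy-only illustration, not the code from this repository: the function name `voxelize_with_rgb`, the `voxel_size` parameter (standing in for grid_size), the meter units, and the choice to scale RGB to [0, 1] are all assumptions made for the example.

```python
import numpy as np


def voxelize_with_rgb(xyz, rgb, voxel_size=0.05):
    """Hypothetical helper: quantize points and build 6-channel features.

    xyz: (N, 3) float coordinates (assumed to be in meters).
    rgb: (N, 3) colors in the range 0..255.
    Returns integer voxel coordinates (M, 3) and float32 features (M, 6)
    laid out as (x, y, z, r, g, b), keeping one point per occupied voxel.
    """
    xyz = np.asarray(xyz, dtype=np.float32)
    rgb = np.asarray(rgb, dtype=np.float32)

    # Shift so that every voxel index is non-negative.
    shifted = xyz - xyz.min(axis=0, keepdims=True)
    coords = np.floor(shifted / voxel_size).astype(np.int32)

    # Keep the first point that falls into each voxel.
    _, keep = np.unique(coords, axis=0, return_index=True)
    keep = np.sort(keep)

    # Assumption: colors are scaled to [0, 1] so they share a similar
    # numeric range with the other input channels.
    feats = np.concatenate([xyz[keep], rgb[keep] / 255.0], axis=1)
    return coords[keep], feats.astype(np.float32)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    xyz = rng.uniform(0.0, 2.0, size=(1000, 3))
    rgb = rng.integers(0, 256, size=(1000, 3))
    coords, feats = voxelize_with_rgb(xyz, rgb, voxel_size=0.05)
    # With dense scans, comparing len(coords) to the raw point count shows
    # how many points collapse into shared voxels at this voxel size.
    print(coords.shape, feats.shape)
```

The returned arrays would still need a batch index column and conversion to tensors before being wrapped in a SparseTensor as in step 1.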

I'm wondering, is SPVCNN inappropriate for this kind of dense scan point cloud, or is there something wrong with my steps?

Thanks for your guidance.

zhijian-liu commented 2 years ago

@Talnex, we haven't tried Semantic3D. It's possible that sparse convolution is not very suitable for this type of point cloud.