mit-han-lab / spvnas

[ECCV 2020] Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution
http://spvnas.mit.edu/
MIT License

Data Preprocessing #93

Closed · tomanick closed this issue 2 years ago

tomanick commented 2 years ago

I wanted to train SPVCNN on my own dataset, which consists of CAD files for part segmentation. First, I converted the CAD files to point clouds with shape [number of points, xyz]. Then I normalized the point clouds and used sparse_quantize to voxelize the data.

Data Preprocessing

import numpy as np
import torch

from torchsparse import SparseTensor
from torchsparse.utils.quantize import sparse_quantize

# pc_coords:   (num_points, 3) xyz coordinates sampled from the CAD model
# coords_norm: (num_points, 3) normalized xyz coordinates
# labels:      (num_points,) per-point part labels
# voxel_size:  quantization resolution (scalar)
point_set = np.concatenate([pc_coords, coords_norm], axis=1)

coords, features = point_set[:, :3], point_set
coords -= np.min(coords, axis=0, keepdims=True)
coords, indices, inverse_inds = sparse_quantize(
    coords, voxel_size, return_index=True, return_inverse=True)

coords = torch.tensor(coords, dtype=torch.int)
features = torch.tensor(features[indices], dtype=torch.float)
labels_inds = torch.tensor(labels[indices], dtype=torch.long)

input_data = SparseTensor(coords=coords, feats=features)          # quantized inputs
target = SparseTensor(coords=coords, feats=labels_inds)           # quantized labels
ori_pc_label = SparseTensor(coords=pc_coords, feats=labels)       # full-resolution labels
inverse_map = SparseTensor(coords=pc_coords, feats=inverse_inds)  # voxel index per original point

feed_dict = {'input': input_data, 'label': target,
             'ori_pc_label': ori_pc_label, 'inverse_map': inverse_map}

Then I imported sparse_collate_fn from torchsparse.utils.collate and set collate_fn=sparse_collate_fn in the PyTorch DataLoader. After collation, the SparseTensor fed to SPVCNN therefore had 6-dimensional features (xyz plus normalized xyz) and 4-dimensional coordinates (x, y, z, and the batch index).
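For clarity, the DataLoader was wired up roughly as follows (a minimal sketch; CADPartDataset and the batch_size / num_workers values are placeholders for illustration, not names from my actual code):

from torch.utils.data import DataLoader
from torchsparse.utils.collate import sparse_collate_fn

# CADPartDataset is a placeholder Dataset whose __getitem__ returns the feed_dict
# built above ('input', 'label', 'ori_pc_label', 'inverse_map').
train_loader = DataLoader(
    CADPartDataset(split='train'),
    batch_size=8,        # placeholder value
    shuffle=True,
    num_workers=4,       # placeholder value
    collate_fn=sparse_collate_fn,  # batches the SparseTensors and adds the batch index to the coordinates
)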

Results

I implemented the data preprocessing and training code, but the results were strange. I used the inverse_map to map the network outputs back to the original point clouds and computed the accuracy against the original points' ground truth. However, the accuracy of SPVCNN was much lower than PVCNN's result.
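Roughly, the mapping back to the original points looks like this (a sketch for a single, un-batched sample; outputs is assumed to be the per-voxel logits tensor returned by the model, and with batched inputs from sparse_collate_fn the inverse indices would have to be applied per batch index):

import torch

# outputs: per-voxel logits from SPVCNN, assumed shape (num_voxels, num_classes)
voxel_preds = outputs.argmax(dim=1)                    # (num_voxels,)
inv = torch.as_tensor(inverse_map.F).long()            # voxel index of each original point
point_preds = voxel_preds[inv]                         # (num_points,) full-resolution predictions
gt = torch.as_tensor(ori_pc_label.F).long().view(-1)   # full-resolution ground truth
accuracy = (point_preds == gt).float().mean().item()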

Is there any mistake in my data preprocessing? I hope someone could help me. Thanks!

zhijian-liu commented 2 years ago

Hi @tomanick, thanks for your detailed summary. One thing to keep in mind is that SPVCNN is not always better than PVCNN. This is detailed in the journal version of our paper (https://arxiv.org/abs/2204.11797, see Table 2). The main takeaway is that SPVCNN is suitable for large scenes, while PVCNN is suitable for small objects.

tomanick commented 2 years ago

OK, thanks for your reply. In addition, I found that when the number of points is not very large (e.g., 2048 or 4096), it is not necessary to apply sparse_quantize during preprocessing. With sparse_quantize, the accuracy of SPVCNN was lower; when I removed it, the accuracy improved, but it was still lower than PVCNN's result.
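To be concrete, by "removing sparse_quantize" I mean roughly the following: the coordinates are still floored to integers so that SparseTensor accepts them, but no points are merged or dropped (this sketch assumes the voxel size is fine enough that coordinate collisions are rare for 2048/4096 points):

import numpy as np
import torch
from torchsparse import SparseTensor

# point_set, labels, voxel_size as in the preprocessing snippet above.
coords = point_set[:, :3] - np.min(point_set[:, :3], axis=0, keepdims=True)
coords = np.floor(coords / voxel_size).astype(np.int32)  # discretize, but keep all points

input_data = SparseTensor(coords=torch.tensor(coords, dtype=torch.int),
                          feats=torch.tensor(point_set, dtype=torch.float))
target = SparseTensor(coords=torch.tensor(coords, dtype=torch.int),
                      feats=torch.tensor(labels, dtype=torch.long))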