Pointcept / Pointcept

Pointcept: a codebase for point cloud perception research. Latest works: PTv3 (CVPR'24 Oral), PPT (CVPR'24), OA-CNNs (CVPR'24), MSC (CVPR'23)
MIT License

Why is KNN still in the code if grid pooling replaces FPS+KNN? #245

Open Stronger-Huang opened 4 months ago

Stronger-Huang commented 4 months ago

Thank you for your great work on PTv2!

    class BlockSequence(nn.Module):
        ...
        def forward(self, points):
            coord, feat, offset = points
            # reference index query of neighbourhood attention
            # for windows attention, modify reference index query method
            # self.neighbours = 16; knn_query returns (idx, torch.sqrt(dist2))
            reference_index, _ = pointops.knn_query(self.neighbours, coord, offset)
            for block in self.blocks:
                points = block(points, reference_index)
            return points

In class BlockSequence, I see that the layer after grid pooling still runs a KNN query. However, the PTv2 paper says that FPS+KNN is replaced by grid pooling and that mIoU improves. So why does the code still use KNN?

I would appreciate it very much if you could reply!

Gofinge commented 4 months ago

Because neighborhood attention still needs KNN to determine its kernel, i.e. the set of reference points each point attends to. This KNN is not part of the pooling layer.
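To make this concrete, here is a minimal, dependency-free sketch of what a KNN reference index provides to an attention block. It is not the `pointops.knn_query` implementation: `knn_reference_index` and `aggregate_over_kernel` are hypothetical names, the search is brute force, and a plain mean stands in for the learned attention weights. The sketch assumes that each point counts as its own nearest neighbour and that every cloud has at least `k` points.

```python
def knn_reference_index(k, coord, offset):
    """Brute-force KNN restricted to each point cloud in a packed batch.

    coord:  list of (x, y, z) tuples, all clouds concatenated.
    offset: cumulative end index of each cloud, e.g. [4, 7].
    Returns, for every point, the indices of its k nearest points
    in the same cloud (the point itself comes first, at distance 0).
    """
    index = []
    start = 0
    for end in offset:
        for i in range(start, end):
            # Rank only the points of the same cloud by squared distance.
            ranked = sorted(
                range(start, end),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(coord[i], coord[j])),
            )
            index.append(ranked[:k])
        start = end
    return index


def aggregate_over_kernel(feat, reference_index):
    """Stand-in for attention: average the features inside each kernel."""
    return [sum(feat[j] for j in nbrs) / len(nbrs) for nbrs in reference_index]


# Two clouds packed together: points 0-3 and points 4-6.
coord = [(0, 0, 0), (1, 0, 0), (3, 0, 0), (7, 0, 0),
         (0, 0, 0), (0, 2, 0), (0, 5, 0)]
offset = [4, 7]
feat = [1.0, 2.0, 4.0, 8.0, 10.0, 20.0, 40.0]

reference_index = knn_reference_index(2, coord, offset)
aggregated = aggregate_over_kernel(feat, reference_index)
print(reference_index)
print(aggregated)
```

Points 0 and 4 share the same coordinates but never appear in each other's kernel, because `offset` keeps the clouds separate. The number of points is unchanged: the index only decides which features each point aggregates, which is why it plays no part in downsampling.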

Stronger-Huang commented 4 months ago

So it is actually the GridPool class below that replaces FPS + KNN?

    class GridPool(nn.Module):
        def __init__(self, in_channels, out_channels, grid_size, bias=False):
            super(GridPool, self).__init__()
            self.in_channels = in_channels
            self.out_channels = out_channels
            self.grid_size = grid_size

            self.fc = nn.Linear(in_channels, out_channels, bias=bias)
            self.norm = PointBatchNorm(out_channels)
            self.act = nn.ReLU(inplace=True)

        def forward(self, points, start=None):
            coord, feat, offset = points
            batch = offset2batch(offset)
            feat = self.act(self.norm(self.fc(feat)))
            start = (
                segment_csr(
                    coord,
                    torch.cat([batch.new_zeros(1), torch.cumsum(batch.bincount(), dim=0)]),
                    reduce="min",
                )
                if start is None
                else start
            )
            cluster = voxel_grid(
                pos=coord - start[batch], size=self.grid_size, batch=batch, start=0
            )
            unique, cluster, counts = torch.unique(
                cluster, sorted=True, return_inverse=True, return_counts=True
            )
            _, sorted_cluster_indices = torch.sort(cluster)
            idx_ptr = torch.cat([counts.new_zeros(1), torch.cumsum(counts, dim=0)])
            coord = segment_csr(coord[sorted_cluster_indices], idx_ptr, reduce="mean")
            feat = segment_csr(feat[sorted_cluster_indices], idx_ptr, reduce="max")
            batch = batch[idx_ptr[:-1]]
            offset = batch2offset(batch)
            return [coord, feat, offset], cluster

And the KNN in BlockSequence is only used for neighbourhood attention? I would appreciate it very much if you could reply!

Gofinge commented 4 months ago

Yes. And in our PTv3, KNN is fully removed from the pipeline.
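For readers comparing the two operations, the pooling side can be sketched the same way. The following is a simplified, dependency-free illustration of the idea behind the GridPool code quoted in this thread: shift each cloud to its minimum corner, bucket points into voxels of size `grid_size`, then take the mean coordinate and the element-wise maximum feature per voxel. The function name `grid_pool_sketch` is hypothetical, the linear layer, normalisation and activation are omitted, and each cloud is assumed to be non-empty.

```python
import math


def grid_pool_sketch(coord, feat, offset, grid_size):
    """Pool all points that share a voxel within the same cloud.

    Returns (pooled_coord, pooled_feat, new_offset, cluster), where
    cluster[i] is the index of the pooled point that input point i fell into.
    """
    # 1. Give every point a key: (cloud id, voxel x, voxel y, voxel z).
    keys = []
    start = 0
    for b, end in enumerate(offset):
        pts = coord[start:end]
        # Per-cloud minimum corner, so voxel indices start at 0 in each cloud.
        origin = [min(p[d] for p in pts) for d in range(3)]
        for p in pts:
            voxel = tuple(math.floor((p[d] - origin[d]) / grid_size) for d in range(3))
            keys.append((b, *voxel))
        start = end

    # 2. Number the occupied voxels; sorting by key keeps each cloud contiguous.
    unique = sorted(set(keys))
    slot = {key: i for i, key in enumerate(unique)}
    cluster = [slot[key] for key in keys]

    # 3. Collect the member points of every voxel.
    members = [[] for _ in unique]
    for i, c in enumerate(cluster):
        members[c].append(i)

    # 4. Reduce: mean coordinate and element-wise max feature per voxel.
    pooled_coord = [
        tuple(sum(coord[i][d] for i in m) / len(m) for d in range(3)) for m in members
    ]
    pooled_feat = [
        [max(feat[i][c] for i in m) for c in range(len(feat[0]))] for m in members
    ]

    # 5. Rebuild cumulative offsets from the cloud id stored in each key.
    counts = [0] * len(offset)
    for key in unique:
        counts[key[0]] += 1
    new_offset, total = [], 0
    for n in counts:
        total += n
        new_offset.append(total)
    return pooled_coord, pooled_feat, new_offset, cluster


coord = [(0.0, 0.0, 0.0), (0.5, 0.25, 0.0), (1.5, 0.0, 0.0),
         (0.0, 0.0, 0.0), (0.0, 2.5, 0.0)]
feat = [[1.0, 5.0], [3.0, 2.0], [0.0, 0.0], [7.0, 7.0], [9.0, 1.0]]
offset = [3, 5]

pooled_coord, pooled_feat, new_offset, cluster = grid_pool_sketch(coord, feat, offset, 1.0)
print(len(coord), "points ->", len(pooled_coord), "pooled points")
```

No neighbour search appears anywhere in this function: the downsampled set comes from voxel membership alone, and the returned `cluster` mapping records which pooled point each input point fell into.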