ma-xu / pointMLP-pytorch

[ICLR 2022 poster] Official PyTorch implementation of "Rethinking Network Design and Local Geometry in Point Cloud: A Simple Residual MLP Framework"
Apache License 2.0

Very slow loss.backward() when running PointMLP on custom task #77

Status: Closed (kaimingkuang closed this issue 1 year ago)

kaimingkuang commented 1 year ago

Hi,

I am trying to use the PointMLP model from classification_ModelNet40/models/pointmlp.py on my own task, with the default hyperparameter setting. However, loss.backward() is extremely slow: around 9 seconds per backward pass for one batch of 64 point clouds with 1024 points each. When I run your own ModelNet40 experiments with the same configs and the same hardware/software environment, the training speed is normal. Here is my code:

    self.model.train()

    for i, sample in enumerate(self.dl_train):
        self.optimizer.zero_grad()

        # Move the point clouds and the image features to the GPU.
        pc = sample["xyz"].cuda()
        img_feats = sample["img"].cuda()

        # Forward pass through PointMLP, then the contrastive loss.
        pc_feats = self.model(pc)
        pc2img_loss = self.criterion(pc_feats, img_feats)

        # This is the call that takes around 9 seconds per batch.
        pc2img_loss.backward()
        self.optimizer.step()
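Because CUDA kernels launch asynchronously, a wall-clock measurement around loss.backward() can be misleading unless the GPU is synchronized before and after the call. The following is a minimal sketch of such a measurement, not the code used in this issue: `timed_backward` is a hypothetical helper, and the small `nn.Sequential` model is a stand-in for PointMLP so the snippet runs on its own (it falls back to CPU when no GPU is present).

```python
import time

import torch
import torch.nn as nn


def timed_backward(loss):
    """Run loss.backward() and return the elapsed wall-clock time in seconds.

    Hypothetical helper: synchronizes the GPU before and after the call so
    that asynchronous CUDA work is included in the measurement.
    """
    on_gpu = loss.is_cuda
    if on_gpu:
        torch.cuda.synchronize()
    start = time.perf_counter()
    loss.backward()
    if on_gpu:
        torch.cuda.synchronize()
    return time.perf_counter() - start


# Stand-in model and data (assumptions, not PointMLP or the real dataset).
device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 32)).to(device)
pc = torch.randn(64, 1024, 3, device=device)  # 64 clouds of 1024 points

loss = model(pc).mean()
elapsed = timed_backward(loss)
print(f"backward took {elapsed:.4f} s on {device}")
```

Timing the forward pass the same way would help show whether the slowdown is specific to the backward pass or only appears there because earlier work is still queued.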

The loss function is a simple contrastive loss:

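import torch
import torch.nn as nn
import torch.nn.functional as F


class ContrastiveLoss(nn.Module):

    def forward(self, feat_0, feat_1, labels=None):
        # L2-normalize both feature sets so dot products are cosine similarities.
        # `labels` is unused; matching pairs are assumed to share the same index.
        feat_0 = F.normalize(feat_0, dim=1)
        feat_1 = F.normalize(feat_1, dim=1)
        # Pairwise similarity matrix of shape (m, n).
        dot_prods = torch.einsum("mi,ni->mn", feat_0, feat_1)
        # Softmax over each column, then over each row; keep the matching pairs.
        loss_0_1 = -F.log_softmax(dot_prods, dim=0).diag().mean()
        loss_1_0 = -F.log_softmax(dot_prods, dim=1).diag().mean()
        loss = 0.5 * (loss_0_1 + loss_1_0)

        return loss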
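For readers checking that the loss itself is not unusual: when both inputs have the same batch size, the two terms above are ordinary cross-entropy losses over the similarity matrix and its transpose, with the diagonal as the target. The sketch below illustrates that equivalence; `contrastive_loss_ce` is a hypothetical name introduced here, and the random inputs are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContrastiveLoss(nn.Module):
    """The loss from the issue, reproduced so this snippet is self-contained."""

    def forward(self, feat_0, feat_1, labels=None):
        feat_0 = F.normalize(feat_0, dim=1)
        feat_1 = F.normalize(feat_1, dim=1)
        dot_prods = torch.einsum("mi,ni->mn", feat_0, feat_1)
        loss_0_1 = -F.log_softmax(dot_prods, dim=0).diag().mean()
        loss_1_0 = -F.log_softmax(dot_prods, dim=1).diag().mean()
        return 0.5 * (loss_0_1 + loss_1_0)


def contrastive_loss_ce(feat_0, feat_1):
    """Hypothetical equivalent form written with F.cross_entropy."""
    feat_0 = F.normalize(feat_0, dim=1)
    feat_1 = F.normalize(feat_1, dim=1)
    logits = feat_0 @ feat_1.t()
    # Sample i in feat_0 is assumed to match sample i in feat_1.
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_1_0 = F.cross_entropy(logits, targets)      # softmax over each row
    loss_0_1 = F.cross_entropy(logits.t(), targets)  # softmax over each column
    return 0.5 * (loss_0_1 + loss_1_0)


torch.manual_seed(0)
a = torch.randn(8, 16)  # placeholder point-cloud features
b = torch.randn(8, 16)  # placeholder image features
reference = ContrastiveLoss()(a, b)
rewritten = contrastive_loss_ce(a, b)
print(torch.allclose(reference, rewritten, atol=1e-6))
```

Both forms build only a batch-by-batch similarity matrix, so for a batch of 64 the loss is cheap relative to the network.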

Here are my hardware/software configs:

CPU: Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz
GPU: NVIDIA A100
CUDA: 11.1
PyTorch: 1.8.1
Python: 3.7.16

Can you help me with it? Many thanks.

kaimingkuang commented 1 year ago

Switched to PyTorch 1.12.1 and it runs so much faster...
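The thread does not identify why the upgrade helped, so when comparing a slow and a fast setup it may be useful to record the exact versions in each environment. A minimal sketch follows; `describe_env` is a hypothetical helper name, and the fields it reports are simply the ones listed in the original post plus cuDNN.

```python
import sys

import torch


def describe_env():
    """Collect the version details relevant to this issue (hypothetical helper)."""
    info = {
        "python": sys.version.split()[0],
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,          # None for CPU-only builds
        "cudnn": torch.backends.cudnn.version(),   # None if cuDNN is unavailable
        "gpu": None,
    }
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
    return info


info = describe_env()
for key, value in info.items():
    print(f"{key}: {value}")
```

Running this in both environments gives a side-by-side record of what actually changed between the slow and the fast run.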