MohamedAfham / CrossPoint

Official implementation of "CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding" (CVPR, 2022)
https://mohamedafham.github.io/CrossPoint/

relatively large performance gap on ScanObjectNN #13

Open auniquesun opened 2 years ago

auniquesun commented 2 years ago

@MohamedAfham Recently, I have run all experiments in the codebase at least 3 times to make sure no errors occurred during my runs.

Some of the results are very encouraging: they are comparable with those reported in the paper, and sometimes even higher, e.g. the reproduced results on ModelNet. But some are not.

Specifically, for the downstream task of few-shot classification on ScanObjectNN, the performance gap is relatively large, e.g.,

  1. for 5 way, 10 shot, I got 72.5 ± 8.33,
  2. for 5 way, 20 shot, I got 82.5 ± 5.06,
  3. for 10 way, 10 shot, I got 59.4 ± 3.95,
  4. for 10 way, 20 shot, I got 67.8 ± 4.41
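For context, few-shot numbers of this form (mean ± std) are typically computed by sampling many N-way K-shot episodes from pre-extracted features and averaging the per-episode accuracy. The sketch below illustrates that protocol on random toy features; `sample_episode`, the toy data, and the nearest-centroid classifier are purely illustrative (the actual evaluation fits a classifier, e.g. a linear SVM, on the support features of each episode):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_episode(feats, labels, n_way, k_shot, n_query=20):
    """Sample one N-way K-shot episode from pre-extracted features.
    Hypothetical helper -- not the repository's actual evaluation code."""
    classes = rng.choice(np.unique(labels), size=n_way, replace=False)
    support_x, support_y, query_x, query_y = [], [], [], []
    for new_label, c in enumerate(classes):
        idx = rng.permutation(np.where(labels == c)[0])
        support_x.append(feats[idx[:k_shot]])
        support_y += [new_label] * k_shot
        q = feats[idx[k_shot:k_shot + n_query]]
        query_x.append(q)
        query_y += [new_label] * len(q)
    return (np.concatenate(support_x), np.array(support_y),
            np.concatenate(query_x), np.array(query_y))

def nearest_centroid_accuracy(sx, sy, qx, qy):
    """Classify each query by its nearest class centroid in feature space
    (a simple stand-in for the per-episode classifier)."""
    centroids = np.stack([sx[sy == c].mean(axis=0) for c in np.unique(sy)])
    pred = np.argmin(((qx[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
    return (pred == qy).mean()

# Toy run: 10 classes, 50 samples each, 256-D "features" with class-dependent means.
feats = rng.normal(size=(500, 256)) + np.repeat(np.arange(10), 50)[:, None]
labels = np.repeat(np.arange(10), 50)
accs = [nearest_centroid_accuracy(*sample_episode(feats, labels, 5, 10))
        for _ in range(10)]
print(f"5-way 10-shot: {np.mean(accs) * 100:.1f} ± {np.std(accs) * 100:.2f}")
```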

For the downstream task of linear SVM classification on ScanObjectNN, the reproduced accuracy is 75.73%. All experiments use the DGCNN backbone and default settings, except for the batch size.

In short, all of these results fall behind the performances reported on ScanObjectNN in the paper, by a large margin.

At this point, I wonder whether there are any precautions to take when experimenting on ScanObjectNN, and what the possible reasons might be. Could you provide some suggestions? Thank you.

MohamedAfham commented 2 years ago

Hey,

Thanks for raising the issue. My schedule is a bit tight these days, so it may take some time to re-run these experiments. However, I'll respond as soon as I'm available.

Having said that, could you clarify a few things:

  1. Did you use the same ScanObjectNN dataset given here?
  2. Did you train the model from scratch and then evaluate the results, or did you evaluate with the pre-trained model provided here?

Thanks.

auniquesun commented 2 years ago

Answer to Q1: Yes, I use the same ScanObjectNN dataset as you provide.

Answer to Q2: I train the model from scratch and then evaluate the results. I do not use the pre-trained model.

More information: I pre-trained the model on 6 GPUs using DistributedDataParallel. After that, few-shot learning was conducted on a single GPU, and the linear SVM classification experiments also ran on a single GPU.
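One setting worth double-checking here is the effective batch size: under DistributedDataParallel each process sees its own mini-batch, so the effective contrastive batch is the per-GPU batch times the world size, and contrastive losses are known to be sensitive to it. The numbers below are hypothetical, since the thread does not state the exact batch sizes used:

```python
# Hypothetical numbers -- the thread does not state the exact batch sizes.
per_gpu_batch = 32                          # batch size passed to each DDP process
world_size = 6                              # number of GPUs in the DDP run
effective_batch = per_gpu_batch * world_size
print(effective_batch)                      # 192: what the contrastive loss "sees"
```

If the single-GPU baseline in the paper used, say, batch size 32, the DDP run above effectively trains with 192, which can shift both the optimal learning rate and the contrastive objective's difficulty.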

Feel free to ask me to add more clarifications. Thanks.

keeganhk commented 2 years ago

@auniquesun @MohamedAfham Hi. Recently, I have also been running this code, on a single RTX 3090 GPU. However, the time cost of the pre-training process (i.e., pre-training the DGCNN model on ShapeNetRender) seems very high: about a week. Is this normal?

auniquesun commented 2 years ago

> @auniquesun @MohamedAfham Hi. Recently, I have also been running this code, on a single RTX 3090 GPU. However, the time cost of the pre-training process (i.e., pre-training the DGCNN model on ShapeNetRender) seems very high: about a week. Is this normal?

I am not sure whether it is normal, but I can share my situation: with 6 RTX 2080Ti GPUs, pre-training on ShapeNetRender takes about 6.5 hours.
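A back-of-the-envelope comparison of the two reports above suggests the single-3090 run is far slower than hardware differences alone would explain (all numbers are taken from this thread; the week is approximated as 7 full days):

```python
# Back-of-the-envelope GPU-hour comparison using the numbers in this thread.
n_gpus = 6                      # RTX 2080Ti GPUs in the multi-GPU run
wall_hours = 6.5                # reported wall-clock pre-training time
gpu_hours = n_gpus * wall_hours
print(gpu_hours)                # 39.0 GPU-hours for the multi-GPU run

week_hours = 7 * 24             # ~a week on one RTX 3090
print(week_hours / gpu_hours)   # roughly 4.3x more GPU-hours
```

A single 3090 is generally faster than a single 2080Ti, so a ~4x blow-up in GPU-hours points at a bottleneck outside raw compute (e.g. data loading) rather than the GPU itself.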