tyhuang0428 / CLIP2Point

[ICCV 2023] CLIP2Point: Transfer CLIP to Point Cloud Classification with Image-Depth Pre-training

How to reproduce the zero-shot result of 49.38% on ModelNet40? #3

Open TangYuan96 opened 2 years ago

TangYuan96 commented 2 years ago

First of all, thanks for sharing your outstanding work!

I tried to train for zero-shot classification (python pretraining.py) on ModelNet40Align and ModelNet40Ply, but the results only reach 36.71% and 32.70%, while the paper reports 49.38%.

Can you retrain and get the 49.38% result? What should I pay attention to during training?

Thanks!

tyhuang0428 commented 2 years ago

You said that you used the command "python pretraining.py", but I am not sure whether you are reproducing zero-shot classification or the image-depth pre-training. If you mean zero-shot classification, use the following command:

python zeroshot.py --ckpt [pre-trained_ckpt_path]

Otherwise, the results you report (36.71% and 32.70%) may come from the validation set of our pre-training, which is slightly different from our zero-shot setting. Validation accuracy reaches 42.83% during our pre-training. We also find that batch size can significantly affect pre-training. See if this helps.
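For intuition, zero-shot classification here follows the standard CLIP recipe: encode one text prompt per category, then pick the class whose prompt embedding is most similar to the depth-map embedding. Below is a minimal sketch using the plain openai/clip package; the class list, prompt template, and image path are illustrative assumptions, and the actual pipeline in zeroshot.py renders multi-view depth maps and uses the pre-trained depth encoder instead of vanilla CLIP:

import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Illustrative subset of category names; ModelNet40 has 40 classes.
classes = ["airplane", "bed", "chair", "desk", "sofa"]
prompts = clip.tokenize([f"a depth map of a {c}" for c in classes]).to(device)

# A rendered depth map saved as an image (hypothetical path).
image = preprocess(Image.open("depth_view_0.png")).unsqueeze(0).to(device)

with torch.no_grad():
    text_feat = model.encode_text(prompts)
    img_feat = model.encode_image(image)
    # Cosine similarity between the depth-map embedding and each prompt.
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    logits = (100.0 * img_feat @ text_feat.T).softmax(dim=-1)

print("predicted class:", classes[logits.argmax(dim=-1).item()])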

tyhuang0428 commented 2 years ago

We found a bug in the pre-training code and have fixed it in the latest commit: we wrongly rotated the CAD models in ShapeNet when rendering depth maps. Rotation is needed only in downstream tasks.
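For intuition, the fix amounts to making the rotation conditional on the task, so pre-training renders the ShapeNet models in their original orientation. A minimal sketch of that idea; the function name, rotation axis, and angle are illustrative assumptions, not the repository's actual code:

import numpy as np

def prepare_points(points, downstream=False):
    # points: (N, 3) array of CAD model vertices.
    # Skip rotation during pre-training; rotate only for downstream tasks.
    if not downstream:
        return points
    theta = np.pi / 2  # illustrative 90-degree rotation about the x-axis
    rot_x = np.array([
        [1, 0, 0],
        [0, np.cos(theta), -np.sin(theta)],
        [0, np.sin(theta), np.cos(theta)],
    ])
    return points @ rot_x.T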

Sorry for the confusion. Please let us know if we can assist with anything else.

Berta911 commented 6 months ago

How can I reproduce the zero-shot result of 35.46% on ScanObjectNN? I can only reach 13%. What changes do I need to make to the code?