hansen7 / OcCo

[ICCV '21] "Unsupervised Point Cloud Pre-training via Occlusion Completion"
https://hansen7.github.io/OcCo/
MIT License

Why mask point cloud by camera view? #19

Closed by JasonNie96 3 years ago

JasonNie96 commented 3 years ago

Hi Hanchen,

Thanks for sharing your work!

I have two questions, and it would be great if you could help me figure them out!

  1. Why do you generate incomplete point clouds by masking the points that are occluded from a camera view? What would be the difference if you just used an ordinary incomplete point cloud?

  2. What's the difference between this work and PCN? I noticed that you directly use the PCN network architecture as your model, and in my understanding you just apply PCN to some downstream tasks. If you directly used PCN pre-trained weights, would they also give improvements on these downstream tasks?

hansen7 commented 3 years ago

Hi Jason. Thanks for your interest and attention :)

Re Q1: We mask the point cloud from camera viewpoints because this is closer to how objects are observed in real life. We therefore believe this setting is better than an incomplete point cloud made by randomly dropping points (it is also the way datasets like ScanObjectNN are constructed). We haven't investigated randomly dropped data, so I am not sure about the empirical difference in experiments.
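To make the contrast between the two kinds of incompleteness concrete, here is a minimal NumPy sketch. It is not the repository's code: the function names `occlude_by_view` and `random_drop`, the orthographic projection, the grid `resolution`, and the depth `tolerance` are all assumptions chosen for illustration. The idea is a coarse z-buffer: project the points onto an image grid seen from the camera and keep only the points closest to the camera in each cell.

```python
import numpy as np

def occlude_by_view(points, camera_pos, resolution=16, tolerance=0.05):
    """Keep points visible from camera_pos (coarse orthographic z-buffer, illustrative only)."""
    points = np.asarray(points, dtype=float)
    camera_pos = np.asarray(camera_pos, dtype=float)

    # Camera frame: look from the camera towards the centroid of the cloud.
    forward = points.mean(axis=0) - camera_pos
    forward /= np.linalg.norm(forward)
    up_hint = np.array([0.0, 0.0, 1.0])
    if abs(forward @ up_hint) > 0.99:  # avoid a degenerate cross product
        up_hint = np.array([0.0, 1.0, 0.0])
    right = np.cross(forward, up_hint)
    right /= np.linalg.norm(right)
    up = np.cross(right, forward)

    # Image-plane coordinates and depth along the viewing direction.
    rel = points - camera_pos
    x, y, depth = rel @ right, rel @ up, rel @ forward

    def to_pixel(v):
        scaled = (v - v.min()) / (v.max() - v.min() + 1e-9)
        return np.minimum((scaled * resolution).astype(int), resolution - 1)

    pixel = to_pixel(x) * resolution + to_pixel(y)

    # Z-buffer: smallest depth seen in each grid cell.
    nearest = np.full(resolution * resolution, np.inf)
    np.minimum.at(nearest, pixel, depth)

    # A point is visible if it lies within `tolerance` of the front-most depth in its cell.
    visible = depth <= nearest[pixel] + tolerance
    return points[visible]

def random_drop(points, keep_ratio, rng):
    """Baseline: drop points uniformly at random, ignoring geometry."""
    keep = rng.random(len(points)) < keep_ratio
    return points[keep]

# Demo on a unit sphere viewed from the +x axis.
rng = np.random.default_rng(0)
sphere = rng.normal(size=(5000, 3))
sphere /= np.linalg.norm(sphere, axis=1, keepdims=True)
seen = occlude_by_view(sphere, camera_pos=[3.0, 0.0, 0.0])
dropped = random_drop(sphere, keep_ratio=0.5, rng=rng)
print(len(seen), len(dropped))
```

In this sketch the view-based mask removes a coherent region (the far side of the sphere), whereas random dropping only thins the cloud and leaves the overall shape intact.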

Re Q2: We do not propose a new network design, just as PointContrast (model: Minkowski ConvNet) and SimCLR (model: ResNet) do not. Instead, we demonstrate that occlusion completion is an effective pretext task for unsupervised pre-training of point cloud models, and we conduct a few probing tests on what the pre-training helps with and how. Actually, at the very beginning, I tried exactly that with PCN pre-trained weights and found them sometimes useful, which motivated us to conduct this work.
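As a rough illustration of what reusing completion pre-trained weights for a downstream task can look like, the sketch below copies encoder parameters from a pre-trained completion model into a downstream model and leaves everything else at its fresh initialisation. This is an assumption-laden toy, not the repository's loading code: the helper `init_from_pretrained`, the `encoder.` / `decoder.` / `classifier.` key names, and the use of plain dictionaries of NumPy arrays in place of real checkpoints are all hypothetical.

```python
import numpy as np

def init_from_pretrained(downstream, pretrained, prefix="encoder."):
    """Copy pre-trained tensors whose name and shape match; return (merged weights, loaded names)."""
    merged = dict(downstream)
    loaded = []
    for name, weight in pretrained.items():
        if not name.startswith(prefix):
            continue  # e.g. skip the completion decoder, which the downstream task does not use
        if name in merged and merged[name].shape == weight.shape:
            merged[name] = weight.copy()
            loaded.append(name)
    return merged, loaded

# Hypothetical checkpoints: a completion model (encoder + decoder)
# and a classifier that shares the same encoder layout.
pretrained = {
    "encoder.conv1.weight": np.ones((64, 3)),
    "encoder.conv2.weight": np.ones((128, 64)),
    "decoder.fc.weight": np.ones((1024, 128)),
}
downstream = {
    "encoder.conv1.weight": np.zeros((64, 3)),
    "encoder.conv2.weight": np.zeros((128, 64)),
    "classifier.fc.weight": np.zeros((40, 128)),
}

weights, loaded = init_from_pretrained(downstream, pretrained)
print(loaded)
```

The name-and-shape check means that only layers shared by both models are transferred; anything specific to the completion decoder or the downstream head is left alone.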

JasonNie96 commented 3 years ago

Thanks for your quick response! This really helps!

You mentioned PointContrast in Re Q2. May I ask whether you tried applying your pre-trained model to the object detection task? If so, how did it perform?

hansen7 commented 3 years ago

I haven't tried detection. I know that many detection frameworks (such as PointPillars) use some MLP (a mini-PointNet), but I haven't tried these because the model architectures do not match exactly.

JasonNie96 commented 3 years ago

I see. Thank you!

hansen7 commented 3 years ago

No worries, hope it helps :)