Official implementation of Geometrically-driven Aggregation for Zero-shot 3D Point Cloud Understanding (GeoZe).
Guofeng Mei, Luigi Riz, Yiming Wang, Fabio Poiesi
Technologies of Vision (TeV), Fondazione Bruno Kessler
{gmei, luriz, ywang, poiesi}@fbk.eu
CVPR 2024 | Project Page | arXiv Paper
We introduce the first training-free aggregation technique that leverages the point cloud’s 3D geometric structure to improve the quality of the transferred VLM representations.
Our approach first clusters the point cloud ${\mathcal{P}}$ into superpoints $\bar{{\mathcal{P}}}$ along with their associated geometric representations $\bar{{\mathcal{G}}}$, VLM representations $\bar{{\mathcal{F}}}$, and anchors ${{\mathcal{C}}}$. For each superpoint $\bar{p}_j$, we identify its $k$-nearest neighbours within the point cloud to form a patch ${\mathcal{P}}^j$ with features ${\mathcal{G}}^j$ and ${\mathcal{F}}^j$. For each patch, we perform local feature aggregation to refine the VLM representations ${{\mathcal{F}}}$. The superpoints then undergo global aggregation, followed by a global-to-local aggregation that updates the per-point features. Lastly, we employ the VLM feature anchors to further refine the per-point features, which are then ready to be used for downstream tasks.
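The pipeline above (cluster into superpoints, aggregate locally within kNN patches, then blend globally back to points) can be sketched in a few lines. This is a minimal NumPy illustration, not the paper's implementation: plain k-means stands in for the geometry-aware clustering, a similarity-weighted mean stands in for the aggregation scheme, and the function names and blend weight are assumptions made for the sketch.

```python
import numpy as np

def knn_indices(points, queries, k):
    """Indices of the k nearest rows of `points` for each row of `queries`."""
    d2 = ((queries[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    return np.argsort(d2, axis=1)[:, :k]

def geoze_sketch(P, F, num_superpoints=4, k=8, seed=0):
    """Toy GeoZe-style flow: cluster -> local aggregation -> global-to-local blend.

    P: (N, 3) point coordinates; F: (N, D) per-point VLM features.
    Returns refined features (N, D) and the superpoint assignment (N,).
    """
    rng = np.random.default_rng(seed)
    # 1) Cluster P into superpoints (plain k-means stands in for the
    #    paper's geometry-driven clustering).
    centers = P[rng.choice(len(P), num_superpoints, replace=False)].copy()
    for _ in range(10):
        assign = knn_indices(centers, P, 1)[:, 0]
        for j in range(num_superpoints):
            members = assign == j
            if members.any():
                centers[j] = P[members].mean(0)
    # Superpoint feature = mean of member features (fallback: global mean).
    F_bar = np.stack([F[assign == j].mean(0) if (assign == j).any() else F.mean(0)
                      for j in range(num_superpoints)])
    # 2) Local aggregation: within each superpoint's kNN patch, replace each
    #    point feature by a similarity-weighted mean over the patch.
    F_ref = F.copy()
    for idx in knn_indices(P, centers, k):
        sim = F[idx] @ F[idx].T
        w = np.exp(sim - sim.max(1, keepdims=True))
        w /= w.sum(1, keepdims=True)
        F_ref[idx] = w @ F[idx]
    # 3) Global-to-local: blend each point with its superpoint feature
    #    (the 0.5/0.5 weighting is an arbitrary choice for this sketch).
    return 0.5 * F_ref + 0.5 * F_bar[assign], assign
```

Because every step is training-free, the refined features can be compared directly against CLIP text embeddings for zero-shot labelling, which is the intended downstream use.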
Prepare environment for part segmentation
conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install open-clip-torch==2.24.0
pip install open3d natsort matplotlib tqdm opencv-python scipy plyfile
Part segmentation on ShapeNet
python part_run.py --datasetpath Your_shapenet_path
We very much welcome all kinds of contributions to the project.
If you find our code or paper useful, please cite
@inproceedings{mei2024geometrically,
title = {Geometrically-driven Aggregation for Zero-shot 3D Point Cloud Understanding},
author = {Mei, Guofeng and Riz, Luigi and Wang, Yiming and Poiesi, Fabio},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2024}
}
This repo benefits from PointCLIPV2, CLIP, and OpenScene. Thanks for their wonderful work.