
OV-PARTS: Towards Open-Vocabulary Part Segmentation

Meng Wei, Xiaoyu Yue, Wenwei Zhang, Xihui Liu, Shu Kong, Jiangmiao Pang*
Shanghai AI Laboratory · The University of Hong Kong · The University of Sydney · University of Macau · Texas A&M University

🏠 About

OV-PARTS is a benchmark for Open-Vocabulary Part Segmentation that leverages the capabilities of large-scale Vision-Language Models (VLMs).

🔥 News

We are organizing the Open Vocabulary Part Segmentation (OV-PARTS) Challenge as part of the Visual Perception via Learning in an Open World (VPLOW) Workshop. Please check out our website!

🛠 Getting Started

Installation

  1. Clone this repository

    git clone https://github.com/OpenRobotLab/OV_PARTS.git
    cd OV_PARTS
  2. Create a conda environment with Python 3.8+ and install the Python requirements

    conda create -n ovparts python=3.8
    conda activate ovparts
    pip install -r requirements.txt

Data Preparation

After downloading the two benchmark datasets, please extract the archives by running the following commands and place the extracted folders under the "Datasets" directory.

  tar -xzf PascalPart116.tar.gz
  tar -xzf ADE20KPart234.tar.gz
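
Alternatively, the archives can be extracted straight into the Datasets directory. The sketch below assumes both .tar.gz files were downloaded to the repository root and unpack into the Pascal-Part-116 and ADE20K-Part-234 folders shown in the structure below:

  # Sketch: extract both benchmark archives directly into Datasets/
  # (assumes the archives sit in the repository root).
  mkdir -p Datasets
  tar -xzf PascalPart116.tar.gz -C Datasets
  tar -xzf ADE20KPart234.tar.gz -C Datasets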

The Datasets folder should follow this structure:

  Datasets/
  ├─Pascal-Part-116/
  │ ├─train_16shot.json
  │ ├─images/
  │ │ ├─train/
  │ │ └─val/
  │ ├─annotations_detectron2_obj/
  │ │ ├─train/
  │ │ └─val/
  │ └─annotations_detectron2_part/
  │   ├─train/
  │   └─val/
  └─ADE20K-Part-234/
    ├─images/
    │ ├─training/
    │ └─validation/
    ├─train_16shot.json
    ├─ade20k_instance_train.json
    ├─ade20k_instance_val.json
    └─annotations_detectron2_part/
      ├─training/
      └─validation/

Create {train/val}_{obj/part}_label_count.json files for Pascal-Part-116.

  python baselines/data/datasets/mask_cls_collect.py Datasets/Pascal-Part-116/annotations_detectron2_{obj/part}/{train/val} Datasets/Pascal-Part-116/annotations_detectron2_part/{train/val}_{obj/part}_label_count.json
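
Because the command above uses the {train/val} and {obj/part} placeholders, it must be run once per combination. A small shell loop such as the sketch below (paths copied from the command above) produces all four label-count files:

  # Run mask_cls_collect.py for every split/level combination.
  for SPLIT in train val; do
    for LEVEL in obj part; do
      python baselines/data/datasets/mask_cls_collect.py \
        Datasets/Pascal-Part-116/annotations_detectron2_${LEVEL}/${SPLIT} \
        Datasets/Pascal-Part-116/annotations_detectron2_part/${SPLIT}_${LEVEL}_label_count.json
    done
  done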

Training

  1. Training the two-stage baseline ZSSeg+.

    Please first download the CLIP model fine-tuned with CPTCoOp.

    Then run the training command:

    python train_net.py --num-gpus 8 --config-file configs/${SETTING}/zsseg+_R50_coop_${DATASET}.yaml
  2. Training the one-stage baselines CLIPSeg and CATSeg.

    Please first download the pre-trained object models of CLIPSeg and CATSeg and place them under the "pretrain_weights" directory.

    Models      Pre-trained checkpoint
    CLIPSeg     download
    CATSeg      download

    Then run the training command (an illustrative way of filling in the ${SETTING} and ${DATASET} placeholders is sketched after this list):

    # For CATseg.
    python train_net.py --num-gpus 8 --config-file configs/${SETTING}/catseg_${DATASET}.yaml
    
    # For CLIPseg.
    python train_net.py --num-gpus 8 --config-file configs/${SETTING}/clipseg_${DATASET}.yaml
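
Both baselines rely on the ${SETTING} and ${DATASET} placeholders. As an illustration only, they can be set as shell variables before launching; the values below are hypothetical and should be replaced with the actual file names under the configs directory:

  # Hypothetical placeholder values; check configs/ for the real setting and dataset names.
  export SETTING=zero_shot
  export DATASET=pascal_part_116
  python train_net.py --num-gpus 8 --config-file configs/${SETTING}/clipseg_${DATASET}.yaml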

Evaluation

We provide the trained weights for the three baseline models reported in the paper.

Models     Setting         Pascal-Part-116 checkpoint    ADE20K-Part-234 checkpoint
ZSSeg+     Zero-shot       download                      download
CLIPSeg    Zero-shot       download                      download
CATSeg     Zero-shot       download                      download
CLIPSeg    Few-shot        download                      download
CLIPSeg    Cross-dataset   -                             download

To evaluate a trained model, add --eval-only and the checkpoint path (MODEL.WEIGHTS) to the training command.

For example:

  python train_net.py --num-gpus 8 --config-file configs/${SETTING}/catseg_${DATASET}.yaml --eval-only MODEL.WEIGHTS ${WEIGHT_PATH}

📝 Benchmark Results

🔗 Citation

If you find our work helpful, please cite:

@inproceedings{wei2023ov,
  title={OV-PARTS: Towards Open-Vocabulary Part Segmentation},
  author={Wei, Meng and Yue, Xiaoyu and Zhang, Wenwei and Kong, Shu and Liu, Xihui and Pang, Jiangmiao},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
  year={2023}
}

👏 Acknowledgements

We would like to express our gratitude to the open-source projects and their contributors, including ZSSeg, CATSeg, and CLIPSeg. Their valuable work has greatly contributed to the development of our codebase.