
[ECCV 2024] SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding

SegVG

Introduction

This repository is the official PyTorch implementation of the ECCV 2024 paper SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding. SegVG transfers box-level annotations into segmentation signals, providing additional pixel-level supervision for visual grounding. In addition, our proposed Triple Alignment module triangularly updates the query, text, and vision tokens to mitigate domain discrepancy. Please cite our paper if the paper or codebase is helpful to you.

@article{kang2024segvg,
    title={SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding},
    author={Kang, Weitai and Liu, Gaowen and Shah, Mubarak and Yan, Yan},
    journal={arXiv preprint arXiv:2407.03200},
    year={2024}
}
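The box-to-segmentation idea can be sketched as follows: a ground-truth box is rasterized into a binary mask so the model can receive pixel-level supervision in addition to box regression. This is an illustrative sketch, not the repository's implementation; the function name and the (x1, y1, x2, y2) pixel-coordinate box format are assumptions:

```python
import numpy as np

def box_to_mask(box, height, width):
    """Convert a bounding-box annotation into a binary segmentation mask.

    box: (x1, y1, x2, y2) in pixel coordinates (assumed format).
    Returns an (height, width) uint8 array with 1 inside the box.
    """
    x1, y1, x2, y2 = (int(round(v)) for v in box)
    mask = np.zeros((height, width), dtype=np.uint8)
    # Clip the box to the image bounds before filling.
    mask[max(y1, 0):min(y2, height), max(x1, 0):min(x2, width)] = 1
    return mask

# Example: a 6x4 box inside a 10x10 image marks 24 foreground pixels.
mask = box_to_mask((2, 3, 8, 7), height=10, width=10)
```

A mask like this can then be supervised with any dense loss (e.g. per-pixel binary cross-entropy) alongside the usual box losses.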

Installation

  1. Clone this repository.

    git clone https://github.com/WeitaiKang/SegVG.git
  2. Prepare for environment.

    Please refer to ReSC for setting up the environment. We use PyTorch 1.12.1+cu116.

  3. Prepare for data.

    Please download the COCO train2014 images.

    Please download the referring expression annotations from the 'annotation' directory of SegVG.

    Please download the ResNet-101 checkpoint of the vision backbone from TransVG.

    You can place them wherever you like; just remember to set the corresponding paths in your train.sh and test.sh.
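Since the assets can live anywhere, a quick sanity check that the paths you wired into the scripts actually exist can save a failed run. A minimal sketch; the directory and file names below are hypothetical placeholders, not a layout the repository requires:

```python
from pathlib import Path

def missing_assets(root):
    """Return the expected asset paths under `root` that do not exist yet.

    The names below are hypothetical placeholders; point them at wherever
    you stored the COCO images, annotations, and backbone checkpoint.
    """
    expected = [
        Path(root) / "train2014",                 # COCO train2014 images
        Path(root) / "annotations",               # referring expression annotations
        Path(root) / "resnet101_checkpoint.pth",  # vision backbone weights
    ]
    return [p for p in expected if not p.exists()]

if __name__ == "__main__":
    for p in missing_assets("./data"):
        print(f"missing: {p}")
```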

Model Zoo

Our model checkpoints are available in the 'ckpt' directory of SegVG.

RefCOCO

| Model | val | testA | testB |
| --- | --- | --- | --- |
| SegVG | 86.84 | 89.46 | 83.07 |

RefCOCO+

| Model | val | testA | testB |
| --- | --- | --- | --- |
| SegVG | 77.18 | 82.63 | 67.59 |

RefCOCOg

| Model | val-g | val-u | test-u |
| --- | --- | --- | --- |
| SegVG | 76.01 | 78.35 | 77.42 |

| Model | test |
| --- | --- |
| SegVG | 75.59 |

Training and Evaluation

  1. Training

    bash train.sh

    Please take a look at train.sh to set the parameters.

  2. Evaluation

    bash test.sh

    Please take a look at test.sh to set the parameters.
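The accuracies in the Model Zoo follow the standard visual grounding protocol: a prediction typically counts as correct when its IoU with the ground-truth box is at least 0.5. A sketch of that metric (not the repository's evaluation code; the (x1, y1, x2, y2) box format is an assumption):

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) in pixel coordinates."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def grounding_accuracy(preds, gts, thresh=0.5):
    """Fraction of predicted boxes whose IoU with the ground truth >= thresh."""
    hits = sum(box_iou(p, g) >= thresh for p, g in zip(preds, gts))
    return hits / len(gts)
```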

Acknowledgement

This codebase is partially based on TransVG.