suoych / TAS

Implementation of the paper Text Augmented Spatial-aware Zero-shot Referring Image Segmentation (Findings of EMNLP 2023)

Apache License 2.0


TAS

Introduction

This repository implements the paper "Text Augmented Spatial-aware Zero-shot Referring Image Segmentation" (Findings of EMNLP 2023).

Preparation

Download the datasets (RefCOCO, RefCOCO+, RefCOCOg) and place them in "../refer".
Prepare the SAM-H, CLIP, and BLIP-2 models.
Generate captions for the images using BLIP-2.
Install the environment requirements (pip install -r requirements.txt). The syntactic parsing tools need a few extensions that must be installed manually (en-core-web-trf for spaCy, WordNet for NLTK).
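
The manual installation of the two parsing extensions can be scripted. The sketch below is only an illustration, not part of this repo: the function names are hypothetical, and it assumes spaCy and NLTK are already installed through requirements.txt. It relies on the standard download entry points of both libraries (python -m spacy download and python -m nltk.downloader).

```python
import subprocess
import sys


def parsing_extension_commands(python=sys.executable):
    """Return the download commands for the two parsing extensions.

    Hypothetical helper, not part of the TAS code base.
    """
    return [
        # spaCy transformer pipeline (the package is named en_core_web_trf)
        [python, "-m", "spacy", "download", "en_core_web_trf"],
        # WordNet corpus for NLTK
        [python, "-m", "nltk.downloader", "wordnet"],
    ]


def install_parsing_extensions():
    """Run each download command and stop at the first failure."""
    for cmd in parsing_extension_commands():
        subprocess.run(cmd, check=True)
```

Calling install_parsing_extensions() once after pip install -r requirements.txt should be enough. Both downloads need network access.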

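The caption step can be sketched with the Hugging Face transformers BLIP-2 classes. This is a minimal sketch under stated assumptions, not the script used by this repo: the checkpoint name Salesforce/blip2-opt-2.7b, the function names, and the JSON layout (image file name mapped to caption) are all assumptions, and the caption format that tas_main.py actually expects may differ.

```python
import json
from pathlib import Path


def caption_images(image_paths, model_name="Salesforce/blip2-opt-2.7b"):
    """Return {image file name: caption} using a BLIP-2 checkpoint.

    The checkpoint name is an assumption; use whichever BLIP-2 model you prepared.
    """
    # Heavy imports stay local so save_captions works without them installed.
    import torch
    from PIL import Image
    from transformers import Blip2ForConditionalGeneration, Blip2Processor

    device = "cuda" if torch.cuda.is_available() else "cpu"
    processor = Blip2Processor.from_pretrained(model_name)
    model = Blip2ForConditionalGeneration.from_pretrained(model_name).to(device)
    model.eval()

    captions = {}
    for path in image_paths:
        image = Image.open(path).convert("RGB")
        inputs = processor(images=image, return_tensors="pt").to(device)
        with torch.no_grad():
            output_ids = model.generate(**inputs, max_new_tokens=30)
        text = processor.batch_decode(output_ids, skip_special_tokens=True)[0]
        captions[Path(path).name] = text.strip()
    return captions


def save_captions(captions, out_path):
    """Write captions as JSON (hypothetical layout: file name -> caption)."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(
        json.dumps(captions, indent=2, sort_keys=True), encoding="utf-8"
    )
    return out_path
```

The model is loaded once and reused for every image, since loading a BLIP-2 checkpoint is far more expensive than generating a single caption.
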
Usage

python tas_main.py --config config/refcoco/refcoco_val.json
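
When launching several evaluations, it can help to confirm that the config file exists and parses as JSON before the models are loaded. The wrapper below is a hypothetical sketch, not part of this repo; it assumes only what the command above shows, namely that tas_main.py accepts a --config path to a JSON file, and it makes no assumption about the keys inside that file.

```python
import json
import subprocess
import sys
from pathlib import Path


def load_config(config_path):
    """Parse the JSON config, failing early if it is missing or malformed."""
    path = Path(config_path)
    if not path.is_file():
        raise FileNotFoundError(f"config not found: {path}")
    with path.open(encoding="utf-8") as f:
        return json.load(f)


def build_command(config_path, python=sys.executable):
    """Build the same command line as shown in the Usage section."""
    return [python, "tas_main.py", "--config", str(config_path)]


def run(config_path):
    """Validate the config, then launch tas_main.py from the repo root."""
    load_config(config_path)
    return subprocess.run(build_command(config_path), check=True)
```

For example, run("config/refcoco/refcoco_val.json") is equivalent to the command above, but stops immediately with a clear error if the path is wrong.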

Acknowledgements

The repo is derived from the Grounded Segment Anything project.
If you have any questions, feel free to send me an e-mail.