| VATEX | Vision-Aware Text Features in Referring Image Segmentation: From Object Understanding to Context Understanding | WACV 2025 | [code] [webpage] |
| Shared-RIS | A Simple Baseline with Single-encoder for Referring Image Segmentation | arXiv 24.08 | [code] |
| ASDA | Adaptive Selection based Referring Image Segmentation | ACM MM 2024 | [code] |
| NeMo | Finding NeMo: Negative-mined Mosaic Augmentation for Referring Image Segmentation | ECCV 2024 | [webpage] [code] |
| ReMamber | ReMamber: Referring Image Segmentation with Mamba Twister | ECCV 2024 | [code] |
| GTMS | GTMS: A Gradient-driven Tree-guided Mask-free Referring Image Segmentation Method | ECCV 2024 | [code] |
| SAM4MLLM | SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation | ECCV 2024 | [code] |
| Pseudo-RIS | Pseudo-RIS: Distinctive Pseudo-supervision Generation for Referring Image Segmentation | ECCV 2024 | [code] |
| SafaRi | SafaRi: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation | ECCV 2024 | [webpage] |
| CM-MaskSD | CM-MaskSD: Cross-Modality Masked Self-Distillation for Referring Image Segmentation | TMM 2024 | |
| Prompt-RIS | Prompt-Driven Referring Image Segmentation with Instance Contrasting | CVPR 2024 | |
| LQMFormer | LQMFormer: Language-aware Query Mask Transformer for Referring Image Segmentation | CVPR 2024 | |
| PPT | Curriculum Point Prompting for Weakly-Supervised Referring Image Segmentation | CVPR 2024 | |
| GSVA | GSVA: Generalized Segmentation via Multimodal Large Language Models | CVPR 2024 | [code] |
| RMSIN | Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation | CVPR 2024 | [code] |
| MRES | Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation | CVPR 2024 | [code] [webpage] |
| MagNet | Mask Grounding for Referring Image Segmentation | CVPR 2024 | [webpage] |
| LISA | LISA: Reasoning Segmentation via Large Language Model | CVPR 2024 | [code] |
| RefSegformer | Towards Robust Referring Image Segmentation | TIP 2024 | [code] |
| JMCELN | Referring Image Segmentation via Joint Mask Contextual Embedding Learning and Progressive Alignment Network | EMNLP 2023 | [code] |
| CVMN | Unsupervised Domain Adaptation for Referring Semantic Segmentation | ACM MM 2023 | [code] |
| CARIS | CARIS: Context-Aware Referring Image Segmentation | ACM MM 2023 | [code] |
| TAS | Text Augmented Spatial-aware Zero-shot Referring Image Segmentation | EMNLP 2023 | |
| BKINet | Bilateral Knowledge Interaction Network for Referring Image Segmentation | TMM 2023 | [code] |
| Group-RES | Advancing Referring Expression Segmentation Beyond Single Image | ICCV 2023 | [code] |
| - | Weakly Supervised Referring Image Segmentation with Intra-Chunk and Inter-Chunk Consistency | ICCV 2023 | |
| - | Shatter and Gather: Learning Referring Image Segmentation with Text Supervision | ICCV 2023 | |
| TRIS | Referring Image Segmentation Using Text Supervision | ICCV 2023 | [code] |
| RIS-DMMI | Beyond One-to-One: Rethinking the Referring Image Segmentation | ICCV 2023 | [code] |
| ETRIS | Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation | ICCV 2023 | [code] |
| SEEM | Segment Everything Everywhere All at Once | arXiv 23.04 | [code] |
| SLViT | SLViT: Scale-Wise Language-Guided Vision Transformer for Referring Image Segmentation | IJCAI 2023 | [code] |
| WiCo | WiCo: Win-win Cooperation of Bottom-up and Top-down Referring Image Segmentation | IJCAI 2023 | |
| M3Att | Multi-Modal Mutual Attention and Iterative Interaction for Referring Image Segmentation | TIP 2023 | |
| X-Decoder | X-Decoder: Generalized Decoding for Pixel, Image and Language | CVPR 2023 | [code] [project] |
| Partial-RES | Learning to Segment Every Referring Object Point by Point | CVPR 2023 | [code] |
| MCRES | Meta Compositional Referring Expression Segmentation | CVPR 2023 | |
| Global-Local CLIP | Zero-shot Referring Image Segmentation with Global-Local Context Features | CVPR 2023 | [code] |
| PolyFormer | PolyFormer: Referring Image Segmentation as Sequential Polygon Generation | CVPR 2023 | [code] [project] |
| GRES | GRES: Generalized Referring Expression Segmentation | CVPR 2023 | [code] [dataset] [project] |
| CGFormer | Contrastive Grouping with Transformer for Referring Image Segmentation | CVPR 2023 | [code] |
| SADLR | Semantics-Aware Dynamic Localization and Refinement for Referring Image Segmentation | AAAI 2023 | |
| R-RIS | Towards Robust Referring Image Segmentation | arXiv 22.09 | [code] [project] |
| - | Learning From Box Annotations for Referring Image Segmentation | TNNLS 2022 | [code] |
| - | Instance-Specific Feature Propagation for Referring Segmentation | TMM 2022 | |
| LAVT | LAVT: Language-Aware Vision Transformer for Referring Image Segmentation | CVPR 2022 | [code] |
| CRIS | CRIS: CLIP-Driven Referring Image Segmentation | CVPR 2022 | [code] |
| ReSTR | ReSTR: Convolution-free Referring Image Segmentation Using Transformers | CVPR 2022 | [project] |
| TV-Net | Two-stage Visual Cues Enhancement Network for Referring Image Segmentation | ACM MM 2021 | [code] |
| VLT | Vision-Language Transformer and Query Generation for Referring Segmentation | ICCV 2021 | [code] |
| MDETR | MDETR - Modulated Detection for End-to-End Multi-Modal Understanding | ICCV 2021 | [code] [project] |
| CEFNet | Encoder Fusion Network with Co-Attention Embedding for Referring Image Segmentation | CVPR 2021 | [code] |
| BUSNet | Bottom-Up Shift and Reasoning for Referring Image Segmentation | CVPR 2021 | [code] |
| LTS | Locate then Segment: A Strong Pipeline for Referring Image Segmentation | CVPR 2021 | |
| CGAN | Cascade Grouped Attention Network for Referring Expression Segmentation | ACM MM 2020 | |
| LSCM | Linguistic Structure Guided Context Modeling for Referring Image Segmentation | ECCV 2020 | [code] |
| CMPC-Refseg | Referring Image Segmentation via Cross-Modal Progressive Comprehension | CVPR 2020 | [code] |
| BRINet | Bi-directional Relationship Inferring Network for Referring Image Segmentation | CVPR 2020 | [code] |
| PhraseCut | PhraseCut: Language-based Image Segmentation in the Wild | CVPR 2020 | [code] [project] |
| MCN | Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation | CVPR 2020 | [code] |
| - | Dual Convolutional LSTM Network for Referring Image Segmentation | TMM 2020 | |
| STEP | See-Through-Text Grouping for Referring Image Segmentation | ICCV 2019 | |
| lang2seg | Referring Expression Object Segmentation with Caption-Aware Consistency | BMVC 2019 | [code] |
| CMSA | Cross-Modal Self-Attention Network for Referring Image Segmentation | CVPR 2019 | [code] |
| KWA | Key-Word-Aware Network for Referring Expression Image Segmentation | ECCV 2018 | [code] |
| DMN | Dynamic Multimodal Instance Segmentation Guided by Natural Language Queries | ECCV 2018 | [code] |
| RRN | Referring Image Segmentation via Recurrent Refinement Networks | CVPR 2018 | [code] |
| MAttNet | MAttNet: Modular Attention Network for Referring Expression Comprehension | CVPR 2018 | [code] [demo] |
| RMI | Recurrent Multimodal Interaction for Referring Image Segmentation | ICCV 2017 | [code] |
| LSTM-CNN | Segmentation from Natural Language Expressions | ECCV 2016 | [code] [project] |