Awesome works on Unsupervised Object Localization

🌟 This is a curated list of recent works that perform unsupervised object localization.

📝 If you find a missing paper (either yours or someone else's), don't hesitate to create a pull request. \ (Last update: 16th of Oct. 2023)

📚 Many of these works are discussed in our survey paper, which covers methods that leverage self-supervised ViT features and do not use any manual annotation.

Unsupervised Object Localization in the Era of Self-Supervised ViTs: A Survey \ by Oriane Siméoni, Eloi Zablocki, Spyros Gidaris, Gilles Puy and Patrick Pérez \ [paper]

Table of Contents

- 🚀 Training-free object localization with ViT self-supervised features
  - Single object-discovery
  - Multi-object discovery
- 🏋 With training object localization using ViT self-supervised features
- Self-supervised features used for the task

🚀 Training-free object localization with ❄ ViT self-supervised features ❄

In this section we report methods that solely exploit self-supervised features without requiring a training step.

Single object-discovery

<< LOST >> \ Localizing Objects with Self-Supervised Transformers and no Labels, BMVC 2021 [paper] [code]

<< TokenCut >> \ Self-supervised Transformers for Unsupervised Object Discovery using Normalized Cut, CVPR 2022 [paper] [code] \ TokenCut: Segmenting Objects in Images and Videos with Self-supervised Transformer and Normalized Cut [paper] [code]

<< DSM >> \ Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization, CVPR 2022 [paper] [code]

Multi-object discovery

<< MOST >> \ MOST: Multiple Object localization with Self-supervised Transformers for object discovery, ICCV 2023 [paper] [code]

🏋 With training 🏋 object localization using ViT self-supervised features

We now report methods which integrate a training step.

Unsupervised saliency detection

<< SelfMask >> \ Unsupervised Salient Object Detection with Spectral Cluster Voting, CVPRW 2022 [paper] [code]

<< MOVE >> \ MOVE: Unsupervised Movable Object Segmentation and Detection, NeurIPS 2022 [paper] [code]

<< FOUND >> \ Unsupervised Object Localization: Observing the Background to Discover Objects, CVPR 2023 [paper] [code]

<< UCOS-DA >> \ Unsupervised camouflaged object segmentation as domain adaptation, ICCVW 2023 [paper] [code]

<< UOLwRPS >> \ Unsupervised object localization with representer point selection, ICCV 2023 [paper] [code]

<< SEMPART >> \ SEMPART: Self-supervised Multi-resolution Partitioning of Image Semantics, ICCV 2023 [paper]

<< PaintSeg >> \ PaintSeg: Training-free Segmentation via Painting, NeurIPS 2023 [paper] [code]

Class-agnostic multi-object detection/instance segmentation

<< FreeSOLO >> \ FreeSOLO: Learning to Segment Objects without Annotations, CVPR 2022 [paper] [code]

<< DeepCut >> \ DeepCut: Unsupervised Segmentation using Graph Neural Networks Clustering, arxiv 2022 [paper]

<< IMST >> \ K-means for unsupervised instance segmentation using a self-supervised transformer, arxiv 2022 [paper]

<< MaskDistill >> \ Discovering Object Masks with Transformers for Unsupervised Semantic Segmentation, arxiv 2022 [paper] [code]

<< UMOD >> \ Image segmentation-based unsupervised multiple objects discovery, WACV 2023 [paper]

<< CutLER >> \ Cut and Learn for Unsupervised Image & Video Object Detection and Instance Segmentation, CVPR 2023 [paper] [code] \ VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation, arxiv 2023 [paper] [code]

<< Exemplar-FreeSOLO >> \ Exemplar-FreeSOLO: Enhancing Unsupervised Instance Segmentation with Exemplars, CVPR 2023 [paper]

Improving self-supervised features

<< WSCUOD >> \ Weakly-supervised Contrastive Learning for Unsupervised Object Discovery, arxiv 2023 [paper] [code]

<< Box-based refinement >> \ Box-based Refinement for Weakly Supervised and Unsupervised Localization Tasks, ICCV 2023 [paper] [code]

Self-supervised features used for the task

<< DINO >> \ Emerging Properties in Self-Supervised Vision Transformers, ICCV 2021 [paper] [code]

<< MOCOv2 >> \ Improved Baselines with Momentum Contrastive Learning, arxiv 2020 [paper]

<< SimSiam >> \ Exploring Simple Siamese Representation Learning, CVPR 2021 [paper] [code]

<< BYOL >> \ Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning, NeurIPS 2020 [paper] [code]

<< SwAV >> \ Unsupervised Learning of Visual Features by Contrasting Cluster Assignments, NeurIPS 2020 [paper] [code]

<< DenseCL >> \ Dense Contrastive Learning for Self-Supervised Visual Pre-Training, CVPR 2021 [paper] [code]

<< MOCOv3 >> \ An Empirical Study of Training Self-Supervised Vision Transformers, ICCV 2021 [paper] [code]

<< MAE >> \ Masked Autoencoders Are Scalable Vision Learners, CVPR 2022 [paper] [code]

<< Stable Diffusion >> \ High-Resolution Image Synthesis with Latent Diffusion Models, CVPR 2022 [paper] [code]

<< DINOv2 >> \ DINOv2: Learning Robust Visual Features without Supervision, arxiv 2023 [paper] [code] \ Vision Transformers Need Registers, arxiv 2023 [paper]

If you find our survey useful for your research, please consider citing:

@article{simeoni2024survey,
  author    = {Sim{\'e}oni, Oriane and Zablocki, {\'E}loi and Gidaris, Spyros and Puy, Gilles and P{\'e}rez, Patrick},
  title     = {Unsupervised Object Localization in the Era of Self-Supervised ViTs: A Survey},
  journal   = {IJCV},
  year      = {2024}
}

