Wuziyi616 / SlotDiffusion

Code release for NeurIPS 2023 paper SlotDiffusion: Object-centric Learning with Diffusion Models
https://slotdiffusion.github.io/
MIT License

SlotDiffusion

SlotDiffusion: Unsupervised Object-Centric Learning with Diffusion Models
Ziyi Wu, Jingyu Hu, Wuyue Lu, Igor Gilitschenski, Animesh Garg
NeurIPS'23 | GitHub | arXiv | Project page

Unsupervised Video Object Segmentation

[Figure: segmentation results on MOVi-D and MOVi-E. Columns: Video, GT, Ours.]

Slot-based Image Editing

Introduction

This is the official PyTorch implementation of the paper SlotDiffusion: Unsupervised Object-Centric Learning with Diffusion Models. The code contains:

Update

Installation

Please refer to install.md for step-by-step guidance on how to install the packages.

Experiments

This codebase is tailored to Slurm GPU clusters with a preemption mechanism. The configs mainly target A40 GPUs with 40GB memory (though many experiments do not require that much memory). Please modify the code accordingly if you are using other hardware settings.
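For readers adapting the code to a different scheduler, the following is a minimal sketch of what preemption-safe training generally involves: notice the termination signal, write a checkpoint atomically, and resume from it on the next launch. It is not this repository's actual mechanism. `PreemptionGuard`, `save_state`, `load_state`, and `run` are hypothetical names, the JSON file stands in for a real model checkpoint, and the use of `SIGTERM` assumes the scheduler sends that signal before killing the job, which is typical for Slurm but depends on the cluster configuration.

```python
import json
import os
import signal


class PreemptionGuard:
    """Hypothetical helper: remembers that a termination signal arrived."""

    def __init__(self):
        self.requested = False

    def install(self, signum=signal.SIGTERM):
        # Must be called from the main thread of the training process.
        signal.signal(signum, self._handle)

    def _handle(self, signum, frame):
        # Only set a flag here; the training loop does the actual saving.
        self.requested = True


def save_state(path, state):
    # Write to a temporary file and rename it, so a kill in the middle of
    # writing never leaves a truncated checkpoint behind.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def load_state(path):
    # Resume from the checkpoint if one exists, otherwise start fresh.
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0}


def run(path, total_steps, guard, on_step=None):
    state = load_state(path)
    while state["step"] < total_steps:
        state["step"] += 1  # stand-in for one real training step
        if on_step is not None:
            on_step(state["step"])
        if guard.requested:
            break  # stop early; the job is about to be preempted
    save_state(path, state)
    return state["step"]
```

In a real job, `guard.install()` would be called once at startup and the same command would be resubmitted after preemption, so that `run` picks up from the saved step instead of starting over.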

Dataset Preparation

Please refer to data.md for dataset downloading and pre-processing.

Reproduce Results

Please see benchmark.md for detailed instructions on how to reproduce our results in the paper.

Possible Issues

See the troubleshooting section of nerv for potential issues.

Please open an issue if you encounter any errors running the code.

Citation

Please cite our paper if you find it useful in your research:

@article{wu2023slotdiffusion,
  title={SlotDiffusion: Object-Centric Generative Modeling with Diffusion Models},
  author={Wu, Ziyi and Hu, Jingyu and Lu, Wuyue and Gilitschenski, Igor and Garg, Animesh},
  journal={NeurIPS},
  year={2023}
}

Acknowledgement

We thank the authors of Slot Attention, slot_attention.pytorch, SAVi, SLATE, STEVE, Latent Diffusion Models, DPM-Solver, DINOSAUR, MaskContrast and SlotFormer for open-sourcing their wonderful work.

License

SlotDiffusion is released under the MIT License. See the LICENSE file for more details.

Contact

If you have any questions about the code, please contact Ziyi Wu (dazitu616@gmail.com).