prismformore / DiffusionMTL

Code of our CVPR2024 paper - DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data

:fire: [CVPR2024] DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data

Paper     Website    

:scroll: Introduction

This repository contains the code and models for DiffusionMTL, our multi-task scene understanding model trained with partially annotated data.

Please check the CVPR 2024 paper for more details.

Overview of the proposed DiffusionMTL for multi-task scene understanding.


:triangular_flag_on_post: Updates

:grinning: Train your model!

1. Build recommended environment

We inherit the environment of TaskPrompter. The following steps are known to work:

conda create -n mtl python=3.7
conda activate mtl
pip install tqdm Pillow easydict pyyaml imageio scikit-image tensorboard termcolor matplotlib
pip install opencv-python== setuptools==59.5.0

# Example of installing pytorch-1.10.0 
pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 torchaudio==0.10.0 -f
pip install timm==0.5.4 einops==0.4.1
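
To confirm that the installed packages match the pins above, a small check like the following may help. This is a minimal sketch and not part of the repository: `base_version` and `check_environment` are hypothetical helper names, and opencv-python is left out because its pin is not given above. On Python 3.7 the sketch assumes the `importlib_metadata` backport is installed.

```python
# Sketch: compare installed package versions with the pins from the install commands.
try:
    from importlib import metadata  # Python 3.8+
except ImportError:
    # Python 3.7 (as in the conda environment above): assumes the backport
    # was installed with `pip install importlib-metadata`.
    import importlib_metadata as metadata

# Pins copied from the pip commands above.
PINNED = {
    "torch": "1.10.0",
    "torchvision": "0.11.0",
    "torchaudio": "0.10.0",
    "timm": "0.5.4",
    "einops": "0.4.1",
    "setuptools": "59.5.0",
}


def base_version(version):
    """Drop a local build tag so that '1.10.0+cu111' compares equal to '1.10.0'."""
    return version.split("+", 1)[0]


def check_environment(pinned=PINNED):
    """Return {package: (expected, found)} for each missing or mismatched package."""
    problems = {}
    for name, expected in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found is None or base_version(found) != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in check_environment().items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty result from `check_environment()` means every pinned package is installed at the expected version.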

2. Get data

PASCAL-Context and NYUD-v2

We use the same data (PASCAL-Context and NYUD-v2) as ATRC. You can download the data from: PASCALContext.tar.gz, NYUDv2.tar.gz

Then extract the datasets:

tar xfvz NYUDv2.tar.gz
tar xfvz PASCALContext.tar.gz

Put both datasets in the same directory and set the db_root variable in configs/ to that directory.
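
Before training, it can be worth checking that both datasets sit under the directory you set as db_root. The sketch below assumes the archives unpack into folders named PASCALContext and NYUDv2 (inferred from the archive names, not confirmed here). `missing_datasets` is a hypothetical helper, not repository code.

```python
from pathlib import Path

# Assumption: each archive unpacks into a folder named after the archive.
EXPECTED_DATASETS = ("PASCALContext", "NYUDv2")


def missing_datasets(db_root, expected=EXPECTED_DATASETS):
    """Return the expected dataset folders that are not present under db_root."""
    root = Path(db_root).expanduser()
    return [name for name in expected if not (root / name).is_dir()]
```

For example, `missing_datasets("/path/to/data")` returns an empty list when both folders are present, and the names of the missing ones otherwise.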

3. Train the model

The config files are defined in ./configs.

Edge evaluation code:

Before you start training, edit the .sh files to match your configuration. We use DDP (PyTorch DistributedDataParallel) for multi-GPU training by default, so you may need to read the related documentation before setting the number of GPUs.
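
As background for choosing the number of GPUs: under DDP, one process runs per GPU, and the standard PyTorch launcher (`torchrun`) passes each process its position through the `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` environment variables. The sketch below only illustrates that convention. It assumes the .sh files use the standard launcher, which is not confirmed here, and `ddp_context` and `effective_batch_size` are hypothetical helpers rather than repository code.

```python
import os


def ddp_context(env=None):
    """Read the launcher-provided DDP variables, defaulting to a single process."""
    env = os.environ if env is None else env
    return {
        "rank": int(env.get("RANK", 0)),              # global process index
        "local_rank": int(env.get("LOCAL_RANK", 0)),  # GPU index on this machine
        "world_size": int(env.get("WORLD_SIZE", 1)),  # total number of processes
    }


def effective_batch_size(per_gpu_batch_size, world_size):
    """Each DDP process loads its own batch, so the total scales with world size."""
    return per_gpu_batch_size * world_size
```

With 4 GPUs and a per-GPU batch size of 2, the effective batch size is 8. Whether the batch size in the config files is per GPU or global is not stated here, so check the training code before changing the GPU count.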



:partying_face: Pre-trained models

To help the community reproduce our SoTA results, we re-trained our best-performing models with the training code in this repository and provide the weights for researchers.

Download pre-trained models

| Version | Dataset | Download | Segmentation (mIoU) | Human parsing (mIoU) | Saliency (maxF) | Normals (mErr) | Boundary (odsF) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DiffusionMTL (Feature Diffusion) | PASCAL-Context (one-label) | onedrive | 57.16 | 59.28 | 78.00 | 16.17 | 64.60 |

:hugs: Cite

Please consider :star2: starring our project and sharing it with your community if you find this repository helpful!


  title={DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data},
  author={Ye, Hanrong and Xu, Dan},

:blush: Contact

Please contact Hanrong Ye if you have any questions.