
DMT — Official PyTorch implementation

A Diffusion Model Translator for Efficient Image-to-Image Translation (TPAMI 2024)
Mengfei Xia, Yu Zhou, Ran Yi, Yong-Jin Liu, Wenping Wang

[Paper]

Abstract: Applying diffusion models to image-to-image translation (I2I) has recently received increasing attention due to its practical applications. Previous attempts inject information from the source image into each denoising step for an iterative refinement, thus resulting in a time-consuming implementation. We propose an efficient method that equips a diffusion model with a lightweight translator, dubbed a Diffusion Model Translator (DMT), to accomplish I2I. Specifically, we first offer theoretical justification that in employing the pioneering DDPM work for the I2I task, it is both feasible and sufficient to transfer the distribution from one domain to another only at some intermediate step. We further observe that the translation performance highly depends on the chosen timestep for domain transfer, and therefore propose a practical strategy to automatically select an appropriate timestep for a given task. We evaluate our approach on a range of I2I applications, including image stylization, image colorization, segmentation to image, and sketch to image, to validate its efficacy and general utility. The comparisons show that our DMT surpasses existing methods in both quality and efficiency.
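For intuition, the pipeline boils down to three stages: forward-diffuse the source image to an intermediate timestep, let the lightweight translator move the noisy latent from the source domain to the target domain, and let a pretrained target-domain DDPM finish the remaining reverse steps. The sketch below is only a minimal illustration of this idea with a standard DDPM noise schedule; the helper names (forward_diffuse, ddpm_denoise_from, translate_image) and the eps-predictor interface are hypothetical, not the repository's actual API.

    import torch

    # Standard DDPM noise schedule (assumed linear betas over T = 1000 steps).
    T = 1000
    betas = torch.linspace(1e-4, 0.02, T)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    def forward_diffuse(x0, t):
        """Sample q(x_t | x_0) in closed form."""
        noise = torch.randn_like(x0)
        return alpha_bars[t].sqrt() * x0 + (1.0 - alpha_bars[t]).sqrt() * noise

    @torch.no_grad()
    def ddpm_denoise_from(eps_model, x_t, t_start):
        """Run the reverse DDPM chain from timestep t_start down to 0."""
        x = x_t
        for t in range(t_start, -1, -1):
            eps = eps_model(x, torch.tensor([t]))
            mean = (x - betas[t] / (1.0 - alpha_bars[t]).sqrt() * eps) / alphas[t].sqrt()
            noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
            x = mean + betas[t].sqrt() * noise
        return x

    def translate_image(x_source, translator, target_eps_model, timestep_s, timestep_t):
        """DMT inference: diffuse, translate at the intermediate step, then denoise."""
        x_s = forward_diffuse(x_source, timestep_s)   # noisy source latent at timestep_s
        x_t = translator(x_s)                         # lightweight domain transfer
        return ddpm_denoise_from(target_eps_model, x_t, timestep_t)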

Installation

This repository is built on top of TSIT, where you can find more detailed installation instructions. We change the PyTorch version to support diffusion models, and summarize the necessary steps below to facilitate reproduction.

  1. Environment: CUDA version == 11.1.

  2. Install package requirements with conda:

    conda create -n dmt python=3.8  # create virtual environment with Python 3.8
    conda activate dmt
    pip install torch==1.9.1+cu111 torchvision==0.10.1+cu111 --extra-index-url https://download.pytorch.org/whl/cu111
    pip install -r requirements.txt -f https://download.pytorch.org/whl/cu111/torch_stable.html
    pip install protobuf==3.20
    pip install absl-py einops ftfy==6.1.1 
  3. Copy dmt_utils to the TSIT folder

    cp -r dmt_utils TSIT/

Inference and Training

For a quick start, we provide example test and training scripts for the colorization task using TSIT and ADM. The preset timesteps for DMT can be easily modified via the arguments --timestep_s and --timestep_t. Please check the scripts for more details.

Customization

Customizing DMT for other datasets or backbones is quite simple.
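As a rough illustration of what customization involves, the translator only needs paired data for the new task and a chosen pair of timesteps; the pretrained diffusion backbone stays untouched. The sketch below shows one way such a training loop could look. It is a conceptual sketch only: the paired data loader, the simple regression loss, and the shared-noise diffusion of both domains are assumptions made for illustration, not the repository's actual training objective (the provided code builds on TSIT; see the scripts for the real interfaces).

    import torch
    from torch import nn

    # Same assumed linear DDPM schedule as in the inference sketch above.
    T = 1000
    alpha_bars = torch.cumprod(1.0 - torch.linspace(1e-4, 0.02, T), dim=0)

    def train_translator(translator, loader, timestep_s, timestep_t,
                         epochs=10, lr=2e-4, device="cpu"):
        """loader yields paired (x_source, x_target) batches from the new dataset."""
        translator = translator.to(device)
        opt = torch.optim.Adam(translator.parameters(), lr=lr)
        for _ in range(epochs):
            for x_src, x_tgt in loader:
                x_src, x_tgt = x_src.to(device), x_tgt.to(device)
                # Diffuse both domains to their intermediate timesteps (shared noise
                # is an assumption here, made so the pair stays spatially aligned).
                noise = torch.randn_like(x_src)
                x_src_t = alpha_bars[timestep_s].sqrt() * x_src + (1 - alpha_bars[timestep_s]).sqrt() * noise
                x_tgt_t = alpha_bars[timestep_t].sqrt() * x_tgt + (1 - alpha_bars[timestep_t]).sqrt() * noise
                # Placeholder objective: regress the translated source latent onto the
                # noisy target latent (the repository itself uses a TSIT-based translator).
                loss = nn.functional.mse_loss(translator(x_src_t), x_tgt_t)
                opt.zero_grad()
                loss.backward()
                opt.step()
        return translator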

References

If you find the code useful for your research, please consider citing:

@article{xia2024dmt,
  title={A Diffusion Model Translator for Efficient Image-to-Image Translation},
  author={Xia, Mengfei and Zhou, Yu and Yi, Ran and Liu, Yong-Jin and Wang, Wenping},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
  year={2024},
}

LICENSE

The project is under the MIT License and is for research purposes ONLY.

Acknowledgements

We highly appreciate TSIT and ADM for their contributions to the community.