
[CVPR 2024] LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion.
Apache License 2.0

LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion

---

[**Pancheng Zhao**](https://www.zhaopancheng.top)<sup>1,2</sup> · [**Peng Xu**](https://www.pengxu.net/)<sup>3+</sup> · **Pengda Qin**<sup>4</sup> · [**Deng-Ping Fan**](https://dengpingfan.github.io/)<sup>2,1</sup> · [**Zhicheng Zhang**](https://zzcheng.top/)<sup>1,2</sup> · [**Guoli Jia**](https://exped1230.github.io/)<sup>1</sup> · [**Bowen Zhou**](http://web.ee.tsinghua.edu.cn/zhoubowen/zh_CN/index.htm)<sup>3</sup> · [**Jufeng Yang**](https://cv.nankai.edu.cn/)<sup>1,2</sup>

<sup>1</sup>VCIP & TMCC & DISSec, College of Computer Science, Nankai University
<sup>2</sup>Nankai International Advanced Research Institute (SHENZHEN · FUTIAN)
<sup>3</sup>Department of Electronic Engineering, Tsinghua University
<sup>4</sup>Alibaba Group
<sup>+</sup>Corresponding author

**CVPR 2024** · Paper PDF · Project Page

1. News

2. Get Started

1. Requirements

If you already have the ldm environment, you can skip this step.

A suitable conda environment named ldm can be created and activated with:

conda env create -f ldm/environment.yaml
conda activate ldm
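After activation, a quick sanity check that the environment's core packages resolved can save a failed run later. A minimal sketch; the package list below is an assumption based on a typical ldm environment, not taken from environment.yaml:

```python
import importlib.util

def missing_packages(names):
    """Return the subset of package names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Packages a typical ldm environment provides (assumed list).
required = ["torch", "torchvision", "pytorch_lightning", "omegaconf", "einops"]
print(missing_packages(required) or "environment looks complete")
```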

2. Download Datasets and Checkpoints.

Datasets:

We collected and organized the LAKERED dataset from existing datasets. The training set is drawn from COD10K and CAMO, and the test set comprises three subsets: Camouflaged Objects (CO), Salient Objects (SO), and General Objects (GO).
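Once the archive is unpacked, it can help to confirm the expected subsets are present. A minimal sketch, where the folder names (Train, Test/CO, Test/SO, Test/GO) are assumptions based on the subset description above, not verified against the archive:

```python
from pathlib import Path

def check_lakered_layout(root):
    """Report which expected LAKERED subdirectories exist under root."""
    root = Path(root)
    expected = ["Train", "Test/CO", "Test/SO", "Test/GO"]  # assumed folder names
    return {rel: (root / rel).is_dir() for rel in expected}

# Example: print any missing pieces of the layout.
for rel, ok in check_lakered_layout("./LAKERED").items():
    print(f"{rel}: {'ok' if ok else 'MISSING'}")
```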

Datasets GoogleDrive BaiduNetdisk(v245)
Results:

The results of this paper can be downloaded at the following link:

Results GoogleDrive BaiduNetdisk(berx)
Checkpoint:

The Pre-trained Latent-Diffusion-Inpainting Model

Pretrained Autoencoding Models Link
Pretrained LDM Link

Put them into the specified paths:

Pretrained Autoencoding Models: ldm/models/first_stage_models/vq-f4-noattn/model.ckpt
Pretrained LDM: ldm/models/ldm/inpainting_big/last.ckpt
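A small sketch to verify the checkpoints ended up where the configs expect them; the paths are taken from the list above, and the check is run from the repository root:

```python
from pathlib import Path

CKPTS = {
    "Pretrained Autoencoder": "ldm/models/first_stage_models/vq-f4-noattn/model.ckpt",
    "Pretrained LDM": "ldm/models/ldm/inpainting_big/last.ckpt",
}

def report_checkpoints(ckpts, root="."):
    """Return {name: True/False} for whether each checkpoint file exists."""
    return {name: (Path(root) / rel).is_file() for name, rel in ckpts.items()}

for name, ok in report_checkpoints(CKPTS).items():
    print(f"{name}: {'found' if ok else 'missing'}")
```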

The Pre-trained LAKERED Model

LAKERED GoogleDrive BaiduNetdisk(dzi8)

Put it into the specified path:

LAKERED: ckpt/LAKERED.ckpt

3. Quick Demo:

You can quickly experience the model with the following commands:

sh demo.sh

4. Train

4.1 Combine the codebook with Pretrained LDM
python combine.py
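This step merges the pretrained codebook (autoencoder) weights into the inpainting LDM checkpoint. The actual key names live in combine.py and the configs; as a rough illustration of the merge logic only, plain dicts stand in for torch state dicts and the `first_stage_model.` prefix is a hypothetical example:

```python
def combine_state_dicts(ldm_sd, codebook_sd, prefix="first_stage_model."):
    """Copy codebook/autoencoder weights into the LDM state dict under a prefix.

    ldm_sd, codebook_sd: mappings from parameter name to tensor (dicts here).
    Keys already present in ldm_sd are overwritten by the codebook weights.
    """
    merged = dict(ldm_sd)
    for k, v in codebook_sd.items():
        merged[prefix + k] = v
    return merged

# Toy example with scalars standing in for tensors:
ldm = {"model.diffusion.w": 1.0, "first_stage_model.encoder.w": 0.0}
codebook = {"encoder.w": 2.0, "quantize.embedding": 3.0}
print(combine_state_dicts(ldm, codebook))
```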
4.2 Start Train

You can edit the `config_LAKERED.yaml` file to modify settings.

sh train.sh

Note: the solution to the KeyError: 'global_step'

Quick fix: resume training with `--resume`, pointing at the checkpoint saved when the run terminated on the error (logs/checkpoints/last.ckpt).
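The error occurs when the checkpoint being loaded has no 'global_step' entry. As an alternative to resuming, you could patch the checkpoint before loading it; a minimal sketch of the idea, using a plain dict in place of a real torch.load result:

```python
def ensure_global_step(checkpoint, default=0):
    """Insert a 'global_step' entry if the checkpoint dict lacks one."""
    checkpoint.setdefault("global_step", default)
    return checkpoint

# Toy checkpoint dict missing the key (stands in for torch.load(...)):
ckpt = {"state_dict": {"w": 1.0}, "epoch": 3}
print(ensure_global_step(ckpt)["global_step"])  # → 0
```

In the real setting you would torch.load the .ckpt file, apply the fix, and torch.save it back before training.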

You can also skip step 4.1 and download LAKERED_init.ckpt to start training.

5. Test

Generate camouflage images with foreground objects in the test set:

sh test.sh

Note that this will take a long time; alternatively, you can download the results linked above.

6. Eval

Use torch-fidelity to calculate FID and KID:

pip install torch-fidelity

You need to specify the result root and the data root, then eval it by running:

sh eval.sh

For the “RuntimeError: stack expects each tensor to be equal size”

This is due to inconsistent image sizes.

Debug by following these steps:

(1) Locate `datasets.py` in the installed torch-fidelity package:

anaconda3/envs/envs-name/lib/python3.8/site-packages/torch_fidelity/datasets.py

(2) Import torchvision.transforms:

import torchvision.transforms as TF

(3) Revise line 24:

self.transforms = TF.Compose([TF.Resize((299,299)),TransformPILtoRGBTensor()]) if transforms is None else transforms

Or you can manually modify the size of the images to be the same.
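To find the offending images before resizing, you can scan a result folder and group files by their dimensions. A stdlib-only sketch that reads width/height directly from each PNG's IHDR header; it assumes the results are PNG files (for other formats, use an image library instead):

```python
import struct
from pathlib import Path

def png_size(path):
    """Read (width, height) from a PNG file's IHDR chunk."""
    with open(path, "rb") as f:
        header = f.read(24)  # 8-byte signature + IHDR length/type + width/height
    if header[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError(f"not a PNG: {path}")
    width, height = struct.unpack(">II", header[16:24])
    return width, height

def sizes_in(folder):
    """Map each distinct (width, height) to the files that have it."""
    groups = {}
    for p in Path(folder).glob("*.png"):
        groups.setdefault(png_size(p), []).append(p.name)
    return groups

# A folder whose mapping has more than one key will trigger the stack error.
```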

Contact

If you have any questions, please feel free to contact me:

zhaopancheng@mail.nankai.edu.cn

pc.zhao99@gmail.com

Citation

If you find this project useful, please consider citing:

@inproceedings{zhao2024camouflaged,
      author = {Zhao, Pancheng and Xu, Peng and Qin, Pengda and Fan, Deng-Ping and Zhang, Zhicheng and Jia, Guoli and Zhou, Bowen and Yang, Jufeng},
      title = {LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion},
      booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
      year = {2024},
}

Acknowledgements

This code borrows heavily from latent-diffusion-inpainting; thanks to nickyisadog for the contribution.