AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

Diffusion clip #2485

Open bbecausereasonss opened 2 years ago

bbecausereasonss commented 2 years ago

Have you guys seen this?

This repo includes the official PyTorch implementation of DiffusionCLIP: Text-Guided Diffusion Models for Robust Image Manipulation. DiffusionCLIP resolves the critical issues in zero-shot manipulation with the following contributions.

We revealed that diffusion models are well suited for image manipulation thanks to their nearly perfect inversion capability, which is an important advantage over GAN-based models and hadn't been analyzed in depth before our detailed comparison. Our novel sampling strategies for fine-tuning can preserve perfect reconstruction at increased speed. In terms of empirical results, our method enables accurate in- and out-of-domain manipulation, minimizes unintended changes, and significantly outperforms SOTA baselines. Our method takes another step towards general application by manipulating images from a widely varying ImageNet dataset. Finally, our zero-shot translation between unseen domains and multi-attribute transfer can effectively reduce manual intervention.

The training process is illustrated in the following figure. Once the diffusion model is fine-tuned, any image from the pretrained domain can be manipulated into an image corresponding to the target text without re-training:

https://github.com/gwang-kim/DiffusionCLIP
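
For anyone wondering what the "nearly perfect inversion" claim refers to: as far as I understand it, this is deterministic DDIM-style inversion, where an image is stepped forward to a noisy latent and then stepped back with the same update rule. Below is a minimal pure-Python sketch of that round trip. It is not code from the DiffusionCLIP repo. The noise predictor `toy_eps_model`, the schedule values in `abar`, and all function names are assumptions made up for illustration, and a real model would use a trained network in place of the toy function.

```python
import math
import random


def toy_eps_model(x, t):
    # Hypothetical stand-in for a trained noise-prediction network.
    # It is smooth in x and ignores the timestep t.
    return [0.1 * math.tanh(v) for v in x]


def ddim_step(x, abar_from, abar_to, eps):
    # Deterministic DDIM update (eta = 0) between two noise levels,
    # where abar is the cumulative product of (1 - beta).
    out = []
    for v, e in zip(x, eps):
        x0_pred = (v - math.sqrt(1.0 - abar_from) * e) / math.sqrt(abar_from)
        out.append(math.sqrt(abar_to) * x0_pred + math.sqrt(1.0 - abar_to) * e)
    return out


def invert(x0, abar, eps_model):
    # Walk from the clean end of the schedule towards the noisy end.
    x = list(x0)
    for i in range(len(abar) - 1):
        x = ddim_step(x, abar[i], abar[i + 1], eps_model(x, i))
    return x


def reconstruct(latent, abar, eps_model):
    # Walk back from the noisy end to the clean end with the same rule.
    x = list(latent)
    for i in range(len(abar) - 1, 0, -1):
        x = ddim_step(x, abar[i], abar[i - 1], eps_model(x, i))
    return x


# Assumed toy schedule: 101 noise levels, abar decreasing from 0.99 to 0.5.
abar = [0.99 - 0.49 * i / 100 for i in range(101)]

random.seed(0)
x0 = [random.uniform(-1.0, 1.0) for _ in range(16)]  # stand-in for image pixels

latent = invert(x0, abar, toy_eps_model)
x0_rec = reconstruct(latent, abar, toy_eps_model)

max_err = max(abs(a - b) for a, b in zip(x0_rec, x0))
print(f"max reconstruction error: {max_err:.2e}")
```

The round trip is only approximately exact, because the inversion evaluates the noise prediction at the current point rather than the next one. The error shrinks as the steps get finer and the predictor gets smoother. With a GAN there is no comparable closed-form way back to the latent, which seems to be the advantage the README is pointing at.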

jpollard-cs commented 1 year ago

sorry, I know this isn't a productive comment, but the way you started this issue reminded me of Timmy from South Park

jpollard-cs commented 1 year ago

but seriously - a more helpful answer.. have you tried this? https://huggingface.co/gwang-kim/DiffusionCLIP-LSUN_Bedroom

I'll give it a shot and report back

jpollard-cs commented 1 year ago

nope not compatible - perhaps that was obvious, but I'm new 'round here