[Feature Request]: "NULL-text Inversion for Editing Real Images using Guided Diffusion Models" - Yet another, probably better, img2img variant

Ehplodor commented 1 year ago

Is there an existing issue for this?

[X] I have searched the existing issues and checked the recent builds/commits

What would your feature do ?

It is VERY similar to the current "img2img alternative Euler script".

It proposes two innovations (shown in quotes): original image -> "pivotal DDIM inversion" -> "null-text optimization" -> prompt2prompt editing
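To make the first stage concrete, here is a minimal toy sketch of deterministic DDIM inversion and its reverse pass in plain Python. Everything in it is an assumption for illustration: the linear `make_alphas` schedule, the list-of-floats latent and the `toy_eps` noise predictor stand in for the real scheduler, latents and UNet, and the null-text optimization stage is not shown.

```python
import math

def make_alphas(num_steps, a_start=0.9999, a_end=0.01):
    """Hypothetical cumulative-alpha schedule, from almost clean (index 0) to noisy."""
    return [a_start + (a_end - a_start) * i / num_steps for i in range(num_steps + 1)]

def ddim_move(x, eps, a_from, a_to):
    """Deterministic DDIM update between two noise levels (works in either direction)."""
    scale = math.sqrt(a_to / a_from)
    shift = math.sqrt(1.0 - a_to) - scale * math.sqrt(1.0 - a_from)
    return [scale * xi + shift * ei for xi, ei in zip(x, eps)]

def ddim_invert(x0, eps_model, alphas):
    """Walk from the image latent towards noise; return the whole trajectory."""
    traj = [list(x0)]
    for t in range(len(alphas) - 1):
        x = traj[-1]
        eps = eps_model(x, t)  # noise predicted at the current, less noisy latent
        traj.append(ddim_move(x, eps, alphas[t], alphas[t + 1]))
    return traj

def ddim_sample(x_noisy, eps_model, alphas):
    """Walk back from the noisy latent to an image latent."""
    x = list(x_noisy)
    for t in range(len(alphas) - 1, 0, -1):
        eps = eps_model(x, t)
        x = ddim_move(x, eps, alphas[t], alphas[t - 1])
    return x

if __name__ == "__main__":
    alphas = make_alphas(50)
    toy_eps = lambda x, t: [0.1 * math.tanh(v) for v in x]  # stand-in for the UNet
    original = [1.0, -0.5, 0.25]
    traj = ddim_invert(original, toy_eps, alphas)
    rec = ddim_sample(traj[-1], toy_eps, alphas)
    print("max reconstruction error:", max(abs(a - b) for a, b in zip(rec, original)))
```

When the noise predictor depends on the latent, the round trip is only approximate. As described in the paper, the stored trajectory is the "pivot" that the later null-text optimization tries to stay close to while sampling with guidance.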

The results shown in the paper look very good. I don't know whether the results of the "unofficial implementation" by @thepowerfuldeez are as good as those presented in the paper.

As with the img2img alt script, once the image is inverted there is no need to invert it again when only the edited prompt changes (all other things being equal). The initial inversion might take longer (the authors report 1 min for inversion, then 10 s for each subsequent generation).
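The "invert once, edit many times" idea could be sketched as a small cache keyed on whatever affects the inversion. This is only an illustration: `InversionCache`, `fake_invert` and the choice of key (image bytes, source prompt, step count) are hypothetical names and assumptions, not part of the webui or of either implementation linked here.

```python
import hashlib

class InversionCache:
    """Reuse an inversion result while the image and inversion settings stay the same."""

    def __init__(self, invert_fn):
        self._invert_fn = invert_fn  # the expensive inversion routine (hypothetical)
        self._store = {}
        self.misses = 0              # how many times the expensive routine actually ran

    def get(self, image_bytes, source_prompt, steps):
        # The edited prompt is deliberately NOT part of the key:
        # changing it should not trigger a new inversion.
        key = (hashlib.sha256(image_bytes).hexdigest(), source_prompt, steps)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._invert_fn(image_bytes, source_prompt, steps)
        return self._store[key]

def fake_invert(image_bytes, source_prompt, steps):
    """Placeholder for the slow inversion; returns a dummy result."""
    return {"num_bytes": len(image_bytes), "prompt": source_prompt, "steps": steps}

if __name__ == "__main__":
    cache = InversionCache(fake_invert)
    image = b"raw image bytes"
    for edited_prompt in ["a cat on a bike", "a dog on a bike"]:
        inverted = cache.get(image, "a cat on a bike", 50)
        # ... generate with `inverted` and `edited_prompt` here ...
    print("inversions run:", cache.misses)
```

Anything that changes the inversion itself (source prompt, sampler, step count) would need to be part of the key; which settings those are in practice depends on the implementation.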

Proposed workflow

Go to img2img
Select an image to edit
Select the img2img alt script, adapted to offer various workflows (e.g. different inverted samplers: Euler, DDIM, etc.)
Tick the "pivotal inversion" box
Tick the "null text optimization" box
Set the other options already present in the alt script
Enter the original and edited prompts / negative prompts
Generate

Additional information

Unofficial implementation: https://github.com/thepowerfuldeez/null-text-inversion
Corresponding research article: https://arxiv.org/abs/2211.09794

loboere commented 1 year ago

Official code here: https://github.com/google/prompt-to-prompt

Ehplodor commented 1 year ago

For the sake of completeness, and because there seem to be advances on the unofficial side:
Fork of thepowerfuldeez's unofficial implementation: https://github.com/webaverse/null-text-inversion
Fork of the fork: https://github.com/JFKraasch/null-text-inversion

Alchete commented 1 year ago

Are you guys actively working on this? I tried a quick&dirty port of Google's code to A1111 as an extension but am ultimately getting the error:

TypeError: register_attention_control.<locals>.ca_forward.<locals>.forward() got an unexpected keyword argument 'encoder_hidden_states'

inside the NullInversion class:

def get_noise_pred_single(self, latents, t, context):
    noise_pred = self.model.unet(latents, t, encoder_hidden_states=context)["sample"]
    return noise_pred

Any ideas?
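One plausible cause, which is an assumption and not confirmed anywhere in this thread: the replacement `forward` installed by `register_attention_control` was written against an older keyword name (`context`), while the caller now passes `encoder_hidden_states`. The self-contained sketch below reproduces that kind of mismatch with a dummy class. `DummyAttention`, `patch_old_style`, `patch_compatible` and `call_like_new_unet` are hypothetical names, not diffusers or prompt-to-prompt code.

```python
class DummyAttention:
    """Stand-in for an attention module (not the real diffusers class)."""

    def forward(self, hidden_states, encoder_hidden_states=None, attention_mask=None):
        return ("original", encoder_hidden_states)

def patch_old_style(module):
    """Install a hook that only knows the older keyword names."""
    def forward(x, context=None, mask=None):
        return ("patched", context)
    module.forward = forward

def patch_compatible(module):
    """Install a hook that accepts both the older and the newer keyword names."""
    def forward(x, context=None, mask=None,
                encoder_hidden_states=None, attention_mask=None, **kwargs):
        if encoder_hidden_states is not None:
            context = encoder_hidden_states
        if attention_mask is not None:
            mask = attention_mask
        return ("patched", context)
    module.forward = forward

def call_like_new_unet(module, latents, text_emb):
    """Mimic a caller that passes the text embedding by its newer keyword."""
    return module.forward(latents, encoder_hidden_states=text_emb)

if __name__ == "__main__":
    old = DummyAttention()
    patch_old_style(old)
    try:
        call_like_new_unet(old, "latents", "text embedding")
    except TypeError as err:
        print("old-style hook failed:", err)

    new = DummyAttention()
    patch_compatible(new)
    print(call_like_new_unet(new, "latents", "text embedding"))
```

If this is what is happening, making the patched `forward` accept the newer keyword names (or pinning the library version the original code was written for) would be the things to try first.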

Ehplodor commented 1 year ago

Relevant discussion with more links : https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/7314

AUTOMATIC1111 / stable-diffusion-webui