Open Ehplodor opened 1 year ago
Official code here: https://github.com/google/prompt-to-prompt
For the sake of completeness, and because there seem to be advances on the unofficial side:
fork of @thepowerfuldeez's unofficial implementation: https://github.com/webaverse/null-text-inversion
fork of the fork: https://github.com/JFKraasch/null-text-inversion
Are you guys actively working on this? I tried a quick-and-dirty port of Google's code to A1111 as an extension, but I am ultimately getting this error:
TypeError: register_attention_control.
def get_noise_pred_single(self, latents, t, context):
    # single UNet forward pass: predict the noise residual for this timestep
    noise_pred = self.model.unet(latents, t, encoder_hidden_states=context)["sample"]
    return noise_pred
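For reference, here is a minimal, self-contained sketch of how a helper like the one above is typically driven during DDIM inversion (latents are pushed from clean toward noisy along the schedule). `StubUNet` and the alpha-bar schedule below are hypothetical stand-ins for illustration, not the real model or the extension's code:

```python
import numpy as np

class StubUNet:
    """Hypothetical stand-in for the diffusion UNet noise predictor."""
    def __call__(self, latents, t, context):
        # toy deterministic predictor; the real model conditions on t and context
        return 0.0 * latents

def get_noise_pred_single(model, latents, t, context):
    # mirrors the helper above, without the class wrapper
    return model(latents, t, context)

def ddim_inversion(model, latents, context, alpha_bars):
    """Run the deterministic DDIM update backwards, collecting the trajectory."""
    trajectory = [latents]
    for i in range(len(alpha_bars) - 1):
        a_t, a_next = alpha_bars[i], alpha_bars[i + 1]
        eps = get_noise_pred_single(model, latents, i, context)
        # predict x0, then re-noise it to the next (noisier) level
        x0 = (latents - np.sqrt(1.0 - a_t) * eps) / np.sqrt(a_t)
        latents = np.sqrt(a_next) * x0 + np.sqrt(1.0 - a_next) * eps
        trajectory.append(latents)
    return trajectory
```

In the real extension the model would be `self.model.unet` and `context` the prompt embeddings; the collected trajectory is what the subsequent null-text optimization is anchored to.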
Any ideas?
Relevant discussion with more links : https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/7314
Is there an existing issue for this?
What would your feature do?
It is VERY similar to the current "img2img alternative Euler" script.
It proposes two innovations (in quotes below): original image -> "pivotal DDIM inversion" -> "null-text optimization" -> prompt-to-prompt editing.
The results are obviously very good (see the paper). I don't know whether the results of the "unofficial implementation" by @thepowerfuldeez are as good as those presented in the paper.
Just as with the img2img alt script, once the image is inverted there is no need to invert it again when modifying the edit prompt (all other things being equal). The initial inversion is longer (the authors report about 1 minute for inversion, then about 10 seconds for each subsequent generation).
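The "invert once, edit many times" property follows from the deterministic DDIM update being algebraically invertible when the same noise prediction is reused at both noise levels. A minimal numpy sketch of that single transition (an illustration, not the authors' code):

```python
import numpy as np

def ddim_step(x_t, eps, alpha_bar_from, alpha_bar_to):
    """One deterministic DDIM transition between two noise levels.

    With alpha_bar_to > alpha_bar_from this denoises; with
    alpha_bar_to < alpha_bar_from it runs the same update in reverse
    (inversion, i.e. adding noise along the trajectory).
    """
    # predict the clean latent implied by the current noise estimate
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_from) * eps) / np.sqrt(alpha_bar_from)
    # re-noise the predicted clean latent to the target level
    return np.sqrt(alpha_bar_to) * x0_pred + np.sqrt(1.0 - alpha_bar_to) * eps
```

Because inverting from one alpha-bar level to another and stepping back with the same `eps` recovers the input exactly, the inverted trajectory can be cached and reused for each new edit prompt; only the cheap editing pass has to be rerun.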
Proposed workflow
Additional information
Unofficial implementation: https://github.com/thepowerfuldeez/null-text-inversion
Corresponding research article: https://arxiv.org/abs/2211.09794