google / prompt-to-prompt

Apache License 2.0

Bad results? #52

Closed haoli-zbdbc closed 8 months ago

haoli-zbdbc commented 1 year ago

Compared to Imagic, for example. This is the Null-text method with prompts = ["A dog", "A sitting dog"]: [image] This is Imagic: [image] The complete inference code is as follows:

image_path = "./imgs/dog2.png"
prompt = "A dog"
# offsets=(0,0,200,0)
(image_gt, image_enc), x_t, uncond_embeddings = null_inversion.invert(image_path, prompt, verbose=True)

print("Modify or remove offsets according to your image!")
prompts = [prompt]
controller = AttentionStore()
image_inv, x_t = run_and_display(prompts, controller, run_baseline=False, latent=x_t, uncond_embeddings=uncond_embeddings, verbose=False)
print("showing from left to right: the ground truth image, the vq-autoencoder reconstruction, the null-text inverted image")
ptp_utils.view_images([image_gt, image_enc, image_inv[0]])
show_cross_attention(controller, 16, ["up", "down"])
prompts = ["A dog", "A sitting dog"]

cross_replace_steps = {'default_': .8, }
self_replace_steps = .7
blend_word = (('dog',), ('dog',))  # words used to build the local-edit blend mask (source, target)
eq_params = {"words": ("sitting", ), "values": (5,)}  

controller = make_controller(prompts, False, cross_replace_steps, self_replace_steps, blend_word, eq_params)
images, _ = run_and_display(prompts, controller, run_baseline=False, latent=x_t, uncond_embeddings=uncond_embeddings)

I tried many parameter settings but couldn't edit this dog. Is this a limitation of the current method?

YasminZhang commented 10 months ago

I think so! Prompt-to-Prompt requires the cross-attention maps of the two prompts to be similar, which means the shape and the position of a subject/object should remain almost unchanged. P2P works well for appearance changes, not for pose or structure changes like "sitting".
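Concretely, for the first `cross_replace_steps` fraction of diffusion steps the edited branch reuses the source prompt's cross-attention maps, so the spatial layout is inherited from the source image and a pose change like "sitting" has little room to emerge. A toy sketch of that scheduling (hypothetical function names, not the repo's actual API):

```python
import numpy as np

def inject_attention(attn_src, attn_tgt, step, num_steps, replace_frac=0.8):
    """Toy sketch of Prompt-to-Prompt cross-attention injection.

    For the first `replace_frac` of diffusion steps, the edited branch
    reuses the source prompt's attention maps, which pins the subject's
    shape and position to the source image; only the remaining late
    steps see the edited prompt's own attention.
    """
    if step < replace_frac * num_steps:
        return attn_src  # layout copied from source -> shape/position frozen
    return attn_tgt      # late steps: edited prompt's attention takes over

# With replace_frac=0.8 and 50 steps, 40 of the 50 steps use the
# source maps, so coarse structure is already fixed before the edit
# prompt can influence layout.
src = np.ones((2, 2))
tgt = np.zeros((2, 2))
frozen = sum(
    np.array_equal(inject_attention(src, tgt, s, 50), src) for s in range(50)
)
```

This is why lowering `cross_replace_steps` / `self_replace_steps` gives the edit more freedom but also drifts further from the original image.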