TencentARC / MasaCtrl

[ICCV 2023] Consistent Image Synthesis and Editing
https://ljzycmd.github.io/projects/MasaCtrl/
Apache License 2.0

Image Structure Preservation #20

Open Ibtisam-Mohammad opened 1 year ago

Ibtisam-Mohammad commented 1 year ago

Hello, I really like this work and the control it gives over the image; I had been looking for something similar. I was able to reproduce the results in the paper, but when I applied the same procedure to an image of a person using T2I-Adapter, the output did not match the input image well. I used the command below. Is there anything that should be changed (S, L, etc.) to get better structure preservation?

python masactrl_w_adapter.py --which_cond sketch --cond_path_src man_real.png --cond_path man_sketch.png --cond_inp_type image --prompt_src "Photo of a man" --prompt "Photo of a fat man" --sd_ckpt models/sd-v1-4.ckpt --resize_short_edge 512 --cond_tau 1.0 --cond_weight 1.0 --n_samples 1 --adapter_ckpt models/t2iadapter_sketch_sd14v1.pth

Input (man_real.png): image

Input target sketch (man_sketch.png): image

Output: image

ljzycmd commented 1 year ago

Sorry for the late reply! If you want to edit a real image, first invert the image into a noise map with DDIM inversion, then apply MasaCtrl to synthesize the desired image.
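For readers following along, here is a minimal sketch of what DDIM inversion followed by deterministic DDIM sampling looks like. It is not the repository's implementation: the schedule, the toy noise predictor `toy_eps`, and all function names are assumptions made for illustration, and a real pipeline would call the Stable Diffusion UNet on image latents instead.

```python
import numpy as np

def ddim_move(x, eps, a_from, a_to):
    """Deterministic DDIM update (eta = 0) between two cumulative-alpha levels."""
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps

def ddim_invert(x0, eps_model, a):
    """Walk a clean latent up the schedule: a[0] = 1 (clean) ... a[-1] (noisiest)."""
    x = x0
    for t in range(len(a) - 1):
        # Inversion reuses the noise predicted at the current, less noisy latent.
        x = ddim_move(x, eps_model(x, t), a[t], a[t + 1])
    return x

def ddim_sample(x_T, eps_model, a):
    """Walk a noisy latent back down the same schedule."""
    x = x_T
    for t in reversed(range(len(a) - 1)):
        x = ddim_move(x, eps_model(x, t), a[t + 1], a[t])
    return x

# Hypothetical 50-step schedule; index t labels the transition between levels t and t+1.
a = np.concatenate([[1.0], np.cumprod(1.0 - np.linspace(1e-4, 2e-2, 50))])

# Toy stand-in for the UNet noise prediction (assumption, not a trained model).
toy_eps = lambda x, t: np.tanh(x)

rng = np.random.default_rng(0)
x0 = rng.standard_normal(8)           # stands in for the encoded real image
x_T = ddim_invert(x0, toy_eps, a)     # "noise map" of the real image
x_rec = ddim_sample(x_T, toy_eps, a)  # reconstruction from that noise map
print("max reconstruction error:", np.abs(x_rec - x0).max())
```

The round trip is exact only when the predicted noise does not depend on the latent. With a real network the reconstruction is approximate, which is one reason edited real images can drift from the source.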

I will upload the code for real image editing with T2I-Adapter.

Ibtisam-Mohammad commented 1 year ago

Thank you for your reply and the great work; I am looking forward to it. In the meantime, I will try what you suggested.

kunalgoyal9 commented 1 year ago

@ljzycmd Kudos for the great work! Is there any estimate for the above feature? :)

ljzycmd commented 1 year ago

Hi @ALL, the code of real image editing with T2I-Adapter has been uploaded.

Example results: black_shirt_example (image), man2 (image)

The usage is slightly cumbersome since it requires modifying the T2I-Adapter code. To make it easier to use, I will submit a pull request with the modified code (DDIM inversion) to the original T2I-Adapter repo.

Ibtisam-Mohammad commented 1 year ago

Thank you for uploading it. I had also worked on this based on your instructions. Although it worked, it produced high-contrast images and introduced some unintended changes to the images. I hope your code will resolve that.

ljzycmd commented 1 year ago

Hi @Ibtisam-Mohammad, I have added some more editing examples and hope they help.

Ibtisam-Mohammad commented 1 year ago

Hello @ljzycmd, I tried the cat example from the paper with the recommended settings, but it does not give the expected results. Could you guide me on this? Thank you. The following settings were used:

python masactrl_w_adapter.py --src_img_path PATH_TO_REAL_IMAGE_OF_CAT --cond_path PATH_TO_TARGET_CANNY_IMAGE_OF_CAT --cond_inp_type image --prompt_src "" --prompt "A realistic photo of a sitting cat, camera view, masterpiece, best quality" --sd_ckpt models/sd-v1-4.ckpt --resize_short_edge 512 --cond_tau 1.0 --cond_weight 1.0 --n_samples 1 --which_cond canny --adapter_ckpt models/t2iadapter_canny_sd14v1.pth

REAL_IMAGE_OF_CAT: cat_source

TARGET_CANNY_IMAGE_OF_CAT: cat_canny

RESULT: image

I am adding a few more test examples in the following Drive folder: https://drive.google.com/drive/folders/1nyIi8C7ePyufiImz2ax0q0al7TqT8PUR?usp=sharing

ljzycmd commented 1 year ago

Hi @Ibtisam-Mohammad, this is caused by the DDIM inversion. To alleviate this problem, you can utilize the intermediate latent during DDIM inversion for content querying. You can refer to the code: https://github.com/TencentARC/MasaCtrl/blob/7f7b72dd12d4a3ba8c02800e31a69d0ccb5a4473/masactrl/diffuser_utils.py#L170
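As a rough illustration of the idea (not the code at the link above), the sketch below stores every latent visited during DDIM inversion and, while denoising, optionally resets the source branch to the stored latent for the current step. In MasaCtrl the point of doing so would be that the source features queried by the target branch come from the faithful inversion trajectory; this toy example has no attention and only shows the latent bookkeeping. The schedule, `toy_eps`, and all function names are hypothetical.

```python
import numpy as np

def ddim_move(x, eps, a_from, a_to):
    """Deterministic DDIM update (eta = 0) between two cumulative-alpha levels."""
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps

def invert_keep_latents(x0, eps_model, a):
    """Return [x_0, x_1, ..., x_T]: every latent visited during inversion."""
    latents = [x0]
    for t in range(len(a) - 1):
        x = latents[-1]
        latents.append(ddim_move(x, eps_model(x, t), a[t], a[t + 1]))
    return latents

def denoise_source(latents, eps_model, a, use_stored):
    """Denoise the source branch, optionally snapping it to the stored latent each step."""
    x = latents[-1]
    for t in reversed(range(len(a) - 1)):
        if use_stored:
            # Replace the free-running source latent with the one saved during inversion.
            x = latents[t + 1]
        x = ddim_move(x, eps_model(x, t), a[t + 1], a[t])
    return x

# Hypothetical schedule and toy noise predictor (assumptions, not a trained model).
a = np.concatenate([[1.0], np.cumprod(1.0 - np.linspace(1e-4, 2e-2, 50))])
toy_eps = lambda x, t: np.tanh(x)

rng = np.random.default_rng(0)
x0 = rng.standard_normal(8)
latents = invert_keep_latents(x0, toy_eps, a)

err_free = np.abs(denoise_source(latents, toy_eps, a, use_stored=False) - x0).max()
err_stored = np.abs(denoise_source(latents, toy_eps, a, use_stored=True) - x0).max()
print("free-running source error:", err_free)
print("stored-latent source error:", err_stored)
```

With stored latents, the source branch can never be more than one denoising step away from the inversion trajectory, whereas the free-running branch can accumulate error over all steps.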

Hi @Ibtisam-Mohammad, here are the editing results obtained by querying the intermediate latents: cat_sitting (image)

Ibtisam-Mohammad commented 1 year ago

Thank you for the quick reply. I shall look into it.