Open Ibtisam-Mohammad opened 1 year ago
Sorry for the late reply! If you want to edit the real image, you may invert the image into the noise map with DDIM inversion, then apply MasaCtrl to synthesize the desired image.
I will upload the code for real image editing with T2I-Adapter.
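To make the "invert, then synthesize" suggestion above concrete, here is a minimal NumPy sketch of deterministic DDIM inversion followed by DDIM sampling. It is a toy under stated assumptions, not the MasaCtrl or T2I-Adapter code: `make_alphas`, `ddim_step`, `ddim_invert`, `ddim_sample`, and `toy_eps` are hypothetical names, the noise schedule is a generic linear one, and `toy_eps` stands in for the real noise-prediction UNet.

```python
import numpy as np


def make_alphas(num_steps=50, beta_start=1e-4, beta_end=2e-2, train_steps=1000):
    """Cumulative alphas at num_steps + 1 levels, ordered clean -> noisy (assumed linear schedule)."""
    betas = np.linspace(beta_start, beta_end, train_steps)
    alphas_cumprod = np.cumprod(1.0 - betas)
    idx = np.linspace(0, train_steps - 1, num_steps + 1).round().astype(int)
    return alphas_cumprod[idx]


def ddim_step(x, eps, a_from, a_to):
    """Deterministic DDIM update that moves a latent from noise level a_from to a_to."""
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps


def ddim_invert(x0, eps_fn, alphas):
    """Walk from the clean latent towards noise, keeping every intermediate latent."""
    latents = [x0]
    x = x0
    for i in range(len(alphas) - 1):
        eps = eps_fn(x, i)  # noise predicted at the current (less noisy) level
        x = ddim_step(x, eps, alphas[i], alphas[i + 1])
        latents.append(x)
    return latents  # latents[0] is the clean latent, latents[-1] is the noise map


def ddim_sample(x_T, eps_fn, alphas):
    """Walk from the noise map back to a clean latent."""
    x = x_T
    for i in range(len(alphas) - 1, 0, -1):
        eps = eps_fn(x, i)
        x = ddim_step(x, eps, alphas[i], alphas[i - 1])
    return x


def toy_eps(x, i):
    """Hypothetical stand-in for the UNet; it depends on x, so inversion is only approximate."""
    return np.tanh(x)


rng = np.random.default_rng(0)
x0 = rng.normal(size=(4, 4))
alphas = make_alphas(num_steps=50)

inverted = ddim_invert(x0, toy_eps, alphas)
reconstruction = ddim_sample(inverted[-1], toy_eps, alphas)
print("max reconstruction error:", np.abs(reconstruction - x0).max())
```

Because the predicted noise is evaluated at slightly different latents on the way up and on the way down, the round trip is generally not exact when the predictor depends on `x`. That mismatch is one reason a reconstruction of a real image can drift from the original.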
Thank you for your reply and the great work; I am looking forward to it. Meanwhile, I shall try what you specified.
@ljzycmd Kudos for the great work! Is there any estimate for the above feature? :)
Hi @ALL, the code of real image editing with T2I-Adapter has been uploaded.
The usage is slightly cumbersome since it requires modifying the T2I-Adapter code. To make it easier to use, I will submit a pull request with the modified code (DDIM inversion) to the original T2I-Adapter repo.
Thank you for uploading it. I had also worked on it based on your instructions; although it was working, it produced high-contrast images and introduced some modifications to the images. I hope your code resolves this.
Hi @Ibtisam-Mohammad, I have added some more editing examples, and I hope they help.
Hello @ljzycmd, I tried the cat example from the paper with the recommended settings, but it does not give the expected results. Could you guide me in this regard? Thank you. The following settings were used:
python masactrl_w_adapter.py --src_img_path PATH_TO_REAL_IMAGE_OF_CAT --cond_path PATH_TO_TARGET_CANNY_IMAGE_OF_CAT --cond_inp_type image --prompt_src "" --prompt "A realistic photo of a sitting cat, camera view, masterpiece, best quality" --sd_ckpt models/sd-v1-4.ckpt --resize_short_edge 512 --cond_tau 1.0 --cond_weight 1.0 --n_samples 1 --which_cond canny --adapter_ckpt models/t2iadapter_canny_sd14v1.pth
REAL_IMAGE_OF_CAT:
TARGET_CANNY_IMAGE_OF_CAT:
RESULT:
I am adding a few more test examples in the following drive: https://drive.google.com/drive/folders/1nyIi8C7ePyufiImz2ax0q0al7TqT8PUR?usp=sharing
Hi @Ibtisam-Mohammad, this is caused by the DDIM inversion. To alleviate this problem, you can utilize the intermediate latent during DDIM inversion for content querying. You can refer to the code: https://github.com/TencentARC/MasaCtrl/blob/7f7b72dd12d4a3ba8c02800e31a69d0ccb5a4473/masactrl/diffuser_utils.py#L170
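One way to picture the suggestion above is sketched below in NumPy. It assumes the latents saved during DDIM inversion are indexed from clean (`ref_latents[0]`) to noisy (`ref_latents[-1]`), and that at every denoising step the source branch is reset to the saved latent for that noise level, so the features the target branch queries come from the exact inversion trajectory rather than a drifting reconstruction. `sample_with_reference` and `toy_eps` are hypothetical names, and the toy predictor replaces the real UNet with mutual self-attention; see the linked file for the actual implementation.

```python
import numpy as np


def ddim_step(x, eps, a_from, a_to):
    """Deterministic DDIM update that moves a latent from noise level a_from to a_to."""
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps


def sample_with_reference(x_T_target, ref_latents, eps_fn, alphas):
    """Denoise a [source, target] pair while pinning the source to saved inversion latents.

    ref_latents[i] is the source latent saved at noise level alphas[i]
    (index 0 = clean image latent, last index = noise map).
    """
    x_src = ref_latents[-1]
    x_tgt = x_T_target
    for i in range(len(alphas) - 1, 0, -1):
        batch = np.stack([x_src, x_tgt])  # source first, target second
        eps = eps_fn(batch, i)            # in the real model, attention would mix the two branches here
        stepped = ddim_step(batch, eps, alphas[i], alphas[i - 1])
        x_tgt = stepped[1]
        # Discard the denoised source and use the saved latent instead,
        # so inversion error cannot accumulate in the source branch.
        x_src = ref_latents[i - 1]
    return x_src, x_tgt


def toy_eps(batch, i):
    """Hypothetical stand-in for the noise-prediction network."""
    return np.tanh(batch)


rng = np.random.default_rng(0)
alphas = np.linspace(0.999, 0.01, 11)                       # assumed schedule, clean -> noisy
ref_latents = [rng.normal(size=(4, 4)) for _ in alphas]     # placeholders for saved inversion latents
src_out, tgt_out = sample_with_reference(ref_latents[-1].copy(), ref_latents, toy_eps, alphas)
print("source branch ends on the saved clean latent:", np.array_equal(src_out, ref_latents[0]))
```

The design choice here is that the source branch never depends on its own denoised output, which is why the final source latent matches the saved clean latent by construction.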
Hi @Ibtisam-Mohammad, here are the editing results obtained by querying the intermediate latents:
Thank you for the quick reply. I shall look into it.
Hello, I really like this work and the control it gives over the image; I was looking for something similar. I was able to reproduce the results in the paper, but when I used the same procedure on an image of a person with T2I-Adapter, the results did not match the input image well. I used the following command. Is there anything that should be changed (S, L, etc.) to get better structure preservation?
python masactrl_w_adapter.py --which_cond sketch --cond_path_src man_real.png --cond_path man_sketch.png --cond_inp_type image --prompt_src "Photo of a man" --prompt "Photo of a fat man" --sd_ckpt models/sd-v1-4.ckpt --resize_short_edge 512 --cond_tau 1.0 --cond_weight 1.0 --n_samples 1 --adapter_ckpt models/t2iadapter_sketch_sd14v1.pth
Input (man_real.png):
Output:
![image](https://github.com/TencentARC/MasaCtrl/assets/63063432/eed0e026-34e4-4d5d-bee0-896274008bc8)