LonglongaaaGo / EXE-GAN

Facial image inpainting is a task of filling visually realistic and semantically meaningful contents for missing or masked pixels in a face image. This paper presents EXE-GAN, a novel diverse and interactive facial inpainting framework, which can not only preserve the high-quality visual effect of the whole image but also complete the face image with exemplar-like facial attributes.
MIT License

exemplar_style_mixing.py generates only the images in the provided examples #6

Closed dinusha94 closed 1 month ago

dinusha94 commented 4 months ago

Hi, I am excited about your model and tried to use it with my own images for style mixing. I have a ground-truth image and an exemplar image, both face-aligned and 256x256, plus a 256x256 mask. When I ran the following command in Colab after all the installation steps, I got the same results shown in the README, not results for my images.

The command I ran is as follows:

%cd /content/EXE-GAN
!python exemplar_style_mixing.py \
  --psp_checkpoint_path /content/psp_ffhq_encode.pt \
  --ckpt /content/EXE_GAN_model.pt \
  --masked_dir /content/EXE-GAN/imgs/exe_guided_recovery/mask \
  --gt_dir /content/EXE-GAN/imgs/exe_guided_recovery/target \
  --exemplar_dir /content/EXE-GAN/imgs/exe_guided_recovery/exemplar \
  --sample_times 2 \
  --eval_dir /content/EXE-GAN/imgs/exe_guided_recovery/mixing_out

Any help on this would be highly appreciated.

LonglongaaaGo commented 4 months ago

Hi @dinusha94, thank you so much for your attention to our project. It sounds like you may have passed the wrong image paths for the target or exemplar folders.

  1. To solve this, you can create your own "target", "exemplar", and "mask" folders yourself and pass those three folder paths. For example, create target1, exemplar1, and mask1 folders, put your images into the corresponding folders, and update the command to use the new paths (see the sketch after this list).
  2. Please also check the output folder.
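
For illustration, a minimal Python sketch of that setup, assuming the Colab paths and folder names used elsewhere in this thread; the source image filenames are placeholders:

import os
import shutil
import subprocess

# Placeholder source files; replace with your own ground-truth, exemplar, and mask images.
my_inputs = {
    "target1": "/content/my_gt.png",
    "exemplar1": "/content/my_exemplar.png",
    "mask1": "/content/my_mask.png",
}

# Create one folder per input type and copy the corresponding image into it.
for folder, src in my_inputs.items():
    os.makedirs(f"/content/EXE-GAN/{folder}", exist_ok=True)
    shutil.copy(src, f"/content/EXE-GAN/{folder}/")

# Run the script against the new folders (checkpoint paths as in the command above).
subprocess.run([
    "python", "exemplar_style_mixing.py",
    "--psp_checkpoint_path", "/content/psp_ffhq_encode.pt",
    "--ckpt", "/content/EXE_GAN_model.pt",
    "--masked_dir", "/content/EXE-GAN/mask1",
    "--gt_dir", "/content/EXE-GAN/target1",
    "--exemplar_dir", "/content/EXE-GAN/exemplar1",
    "--sample_times", "2",
    "--eval_dir", "/content/EXE-GAN/out",
], cwd="/content/EXE-GAN", check=True)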

Let me know if it works or not. Thanks!

dinusha94 commented 4 months ago

Thanks for the response.

I did as you mentioned, but I still got the same results. The following is my Python command and its output.

Command

python exemplar_style_mixing.py --psp_checkpoint_path /content/psp_ffhq_encode.pt --ckpt /content/EXE_GAN_model.pt --masked_dir /content/EXE-GAN/mask1 --gt_dir /content/EXE-GAN/target1 --exemplar_dir /content/EXE-GAN/exemplar1 --sample_times 2 --eval_dir /content/EXE-GAN/out

Terminal output

/content/EXE-GAN
from .fused_act import FusedLeakyReLU, fused_leaky_relu
from .upfirdn2d import upfirdn2d
model name: exe_gan !!!!!!!!!!!!!!!
start_latent :4
n_psp_latent :10
Loading ResNet ArcFace
Loading pSp from checkpoint: /content/psp_ffhq_encode.pt
load models: /content/EXE_GAN_model.pt
  0% 0/1 [00:00<?, ?it/s]/content/EXE-GAN/op/conv2d_gradfix.py:88: UserWarning: conv2d_gradfix not supported on PyTorch 2.2.2+cu118. Falling back to torch.nn.functional.conv2d().
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py:1347: UserWarning: dropout2d: Received a 2-D input to dropout2d, which is deprecated and will result in an error in a future release. To retain the behavior and silence this warning, please use dropout instead. Note that dropout2d exists to provide channel-wise dropout on inputs with 2 spatial dimensions, a channel dimension, and an optional batch dimension (i.e. 3D or 4D inputs).
  warnings.warn(warn_msg)
100% 1/1 [00:02<00:00,  2.85s/it]

LonglongaaaGo commented 4 months ago

Hi @dinusha94

I found the problem: I had hard-coded the three paths. Could you download exemplar_style_mixing.py again and try once more? Thanks!

dinusha94 commented 4 months ago

Hi @LonglongaaaGo

Thanks a lot, it works now. I have two questions though.

I am trying to use a single exemplar image, generated with OpenAI DALL·E, together with a ground-truth image.

  1. Regarding face alignment before style mixing: is it a must to do face alignment for every image, or only for badly aligned faces?
  2. Can we use only one exemplar image?

LonglongaaaGo commented 4 months ago

Hi @dinusha94, thanks for the good questions. First, face alignment is needed: the training data (FFHQ) consist of aligned images, and the capacity of the GAN-based model is somewhat limited, so poorly aligned faces will cause the generated face to collapse. Second, yes, you can use only one exemplar image for inpainting/editing. Could you please download exemplar_style_mixing.py again? Then activate the code at line 212 and comment out the code at line 213:

completed_img, _, infer_imgs, _ = generator.get_inherent_stoc(gt_img, mask_01, infer_imgs=exe_img_1)

After running, you would get diverse completed images guided by only one exemplar image.
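
For reference, a rough sketch of how that call could be driven with a single exemplar. The image loading, normalization, and mask convention below are assumptions for illustration (and generator stands for the model already built inside exemplar_style_mixing.py), not the repository's exact code:

import torch
from PIL import Image
from torchvision import transforms

# Assumed preprocessing: 256x256 RGB images normalized to [-1, 1].
to_tensor = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]),
])

def load_rgb(path):
    return to_tensor(Image.open(path).convert("RGB")).unsqueeze(0).cuda()

gt_img = load_rgb("target1/my_gt.png")             # ground-truth image
exe_img_1 = load_rgb("exemplar1/my_exemplar.png")  # the single exemplar

# Assumed mask convention: single channel, 1 marks the region to be filled.
mask_01 = transforms.ToTensor()(Image.open("mask1/my_mask.png").convert("L"))
mask_01 = (mask_01 > 0.5).float().unsqueeze(0).cuda()

# Repeated calls with the same exemplar yield diverse completions,
# since the generator re-samples its stochastic latents each time.
with torch.no_grad():
    for i in range(4):
        completed_img, _, infer_imgs, _ = generator.get_inherent_stoc(
            gt_img, mask_01, infer_imgs=exe_img_1)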

Thanks!

LonglongaaaGo commented 4 months ago

Hi @dinusha94, oh, I realized that you can directly use guided_recovery.py. Please take a look at the README for more details. Thanks!

dinusha94 commented 4 months ago

Hi @LonglongaaaGo

Thanks for all the help. I would like to know your thoughts on the results I got with the following images (I ran the guided_recovery.py script).

My final goal was to change the hairstyle and hair color of the ground-truth image using the exemplar image, but the results I got are a bit far from what I expected. I would like to know whether I could improve them by training with a new dataset.

LonglongaaaGo commented 4 months ago

Hi @dinusha94, yes, it is an interesting idea, but I think the current pre-trained model cannot achieve your goal, for several reasons.

First, our model is an inpainting method, trained to seamlessly fill the masked regions using exemplar facial attributes. It therefore has to respect the context around the masked regions while reasoning about which attributes are appropriate, so the generated hair color is constrained by the surrounding hair colors. You can try masking out all of the hair and filling it again.

Second, to guide the hair structure, a sketch-based model is needed. Following our paper, you can add one channel of image sketch to guide the structure of the hair (this requires training a new model).

Third, to manipulate the color, you may also need to include a color channel to control color generation (again, a new model). Recently, our team has been working on this; one paper on multimodal guided image editing is under review, so if you are not in a hurry, we may release a novel multimodal facial editing method in a couple of months.

Thanks!

dinusha94 commented 4 months ago

That is great to hear. In the meantime, I can try to train a model with my own dataset.

Does adding a channel mean adding it to the mask of the training data? (i.e., a 3-channel mask where the white area is replaced with the desired color, plus an extra channel in the mask for the sketch; is that correct?) Is this what is described in Section 5.3 of your paper?

Is there any Python script in the GitHub repo that can be used with the sketch, since it is shown in the README?

LonglongaaaGo commented 4 months ago

Hi @dinusha94 ,

Right now, there are four input channels (3 for the input RGB image and 1 for the input mask). If we add an extra sketch (1 channel) and colors (3 channels), there will be 8 channels.
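
A rough sketch of what that expanded input could look like; the tensor names, channel ordering, and mask convention here are assumptions for illustration, not the repository's actual code:

import torch

batch, size = 1, 256

# Current input: masked RGB image (3 channels) + binary mask (1 channel) = 4 channels.
masked_img = torch.randn(batch, 3, size, size)                 # RGB image with holes
mask_01 = torch.randint(0, 2, (batch, 1, size, size)).float()  # 1 marks the region to fill (assumed)

# Hypothetical extra conditions for a retrained model:
sketch = torch.randn(batch, 1, size, size)     # edge/sketch map guiding hair structure
color_map = torch.randn(batch, 3, size, size)  # color strokes guiding hair color

x = torch.cat([masked_img, mask_01, sketch, color_map], dim=1)
print(x.shape)  # torch.Size([1, 8, 256, 256]); the generator's first conv layer would need 8 input channels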

You may be interested in this project. See if it can help: https://github.com/eladrich/pixel2style2pixel