zyainfal / One-Shot-Face-Swapping-on-Megapixels

One Shot Face Swapping on Megapixels.

Failed to face swap with celeb_hq? #47

Open empty2enrich opened 1 year ago

empty2enrich commented 1 year ago

What caused this? [image attached]

zyainfal commented 1 year ago

It seems that the third and fourth faces are copied from the second face. There are two possible reasons I can think of: (1) the faces are not processed by the model into latent codes, or (2) the mask is not processed correctly, so the generated swapped face cannot be pasted onto the second face.

For both cases, you can check the intermediate results in the inference code (debugging hints are marked with # comments): https://github.com/zyainfal/One-Shot-Face-Swapping-on-Megapixels/blob/1af90db74aeede0bb2f51175c8af957b09e38d6b/inference/inference.py#L119

    def run(self, src_idx, tgt_idx, refine=True):
        src_face_rgb, tgt_face_rgb, tgt_mask = self.read_pair(src_idx, tgt_idx)
        source, target = self.preprocess(src_face_rgb, tgt_face_rgb)
        swapped_face = self.swap(source, target)
        swapped_face = self.postprocess(swapped_face, tgt_face_rgb, tgt_mask) # comment out this line to see if the swapped face is generated

        # flip RGB to BGR for cv2.imwrite, then stack source | target | swap
        result = np.hstack((src_face_rgb[:,:,::-1], tgt_face_rgb[:,:,::-1], swapped_face))

        if refine:
            swapped_tensor, _ = self.preprocess(swapped_face[:,:,::-1], swapped_face)
            refined_face = self.refine(swapped_tensor)
            refined_face = self.postprocess(refined_face, tgt_face_rgb, tgt_mask)
            result = np.hstack((result, refined_face)) # append the refined face
        cv2.imwrite("{}.jpg".format(self.swap_type), result)

    def swap(self, source, target):
        with torch.no_grad():
            ts = torch.cat([target, source], dim=0).cuda()
            lats, struct = self.encoder(ts)

            idd_lats = lats[1:]                  # identity latents (source)
            att_lats = lats[0].unsqueeze_(0)     # attribute latents (target)
            att_struct = struct[0].unsqueeze_(0) # structure code (target)

            swapped_lats = self.swapper(idd_lats, att_lats) # use "swapped_lats = att_lats" to see if the face can be reconstructed
            fake_swap, _ = self.generator(att_struct, [swapped_lats, None], randomize_noise=False)

            # min-max denormalize the generator output to [0, 255]
            fake_swap_max = torch.max(fake_swap)
            fake_swap_min = torch.min(fake_swap)
            denormed_fake_swap = (fake_swap[0] - fake_swap_min) / (fake_swap_max - fake_swap_min) * 255.0
            fake_swap_numpy = denormed_fake_swap.permute((1, 2, 0)).cpu().numpy()
        return fake_swap_numpy

where swapped_lats is the latent code of the swapped face and tgt_mask is the mask used to paste the swapped face back onto the target.
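
If it helps, here is a minimal sketch of what such mask-based pasting typically looks like (an illustration only, not the actual postprocess of this repo; the resize and Gaussian-blur steps are assumptions):

    import cv2
    import numpy as np

    def paste_back(swapped_face, tgt_face_rgb, tgt_mask):
        # Hypothetical helper: composite the generated face onto the target
        # frame using the face-region mask.
        h, w = tgt_face_rgb.shape[:2]
        swapped = cv2.resize(swapped_face, (w, h)).astype(np.float32)
        mask = (tgt_mask > 0).astype(np.float32)    # binary face region, HxW
        mask = cv2.GaussianBlur(mask, (15, 15), 0)  # soften the paste seam
        mask = mask[..., None]                      # HxWx1 for broadcasting
        out = mask * swapped + (1.0 - mask) * tgt_face_rgb.astype(np.float32)
        return out.astype(np.uint8)

If the swapped face looks fine before postprocessing but wrong after, the mask (or its alignment with the target crop) is the first thing to check.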

empty2enrich commented 1 year ago

Thanks for your answer.

  1. I read the inference.py code and generated the latent codes, but I found that in FaceTransferModule.py the scale value is large. Is there a problem with the weight? scale = torch.sigmoid(self.weight).expand(N, -1, -1) (see the sanity check after this list)
    def forward(self, idd, att):
        if self.type == "ftm":
            att_low = att[:, :self.swap_indice]
            idd_high = idd[:, self.swap_indice:]
            att_high = att[:, self.swap_indice:]

            N = idd.size(0)
            idds = []
            atts = []
            for i in range(self.num_latents):
                new_idd, new_att = self.blocks[i](idd_high[:, i], att_high[:, i])
                idds.append(new_idd)
                atts.append(new_att)
            idds = torch.cat(idds, 1)
            atts = torch.cat(atts, 1)
            scale = torch.sigmoid(self.weight).expand(N, -1, -1)
            print(f'scale: {scale}')
            latents = scale * idds + (1-scale) * atts

            return torch.cat([att_low, latents], 1)
  2. I checked the mask and it is correct (the mask is generated using https://github.com/zllrunning/face-parsing.PyTorch).
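
As a sanity check on point 1, one could temporarily replace the learned scale with a fixed even blend inside forward (a hypothetical tweak, assuming self.weight is the blending parameter printed above):

    # Hypothetical debugging tweak inside forward(): force an even blend
    # to test whether the large learned scale causes the artifact.
    scale = torch.full_like(self.weight, 0.5).expand(N, -1, -1)
    latents = scale * idds + (1 - scale) * atts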
empty2enrich commented 1 year ago

Could you give me a contact?

Communicating through issues is somewhat inefficient.

zyainfal commented 1 year ago

(1) It's fine that the scale is large, as it is learnt to highlight idd instead of att (you can change the scale to generate more samples that keep more att if you want, but we set it as a constant in this repo). (2) When you use a mask generated by https://github.com/zllrunning/face-parsing.PyTorch, please note that the mask used in our model is not colored. The mask should be label-encoded with values [0, 1, 2, ..., 18]; we recommend using the official masks provided by CelebAMask-HQ. (3) If you have more questions, you may give me your e-mail so that I can send you my contact (I don't want my contact to be public ;-) )
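
For reference, a quick way to verify a mask file is in the expected format (a minimal sketch; the file path is a placeholder):

    import cv2
    import numpy as np

    mask = cv2.imread("tgt_mask.png", cv2.IMREAD_UNCHANGED)  # placeholder path
    # A label-encoded mask is single-channel; a colored visualization is HxWx3.
    assert mask.ndim == 2, "mask should be single-channel labels, not colored"
    print(np.unique(mask))  # expect a subset of {0, 1, ..., 18}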

empty2enrich commented 1 year ago


my email: 2056374813@qq.com

empty2enrich commented 1 year ago


The masks that I use are [0, 1, 2, ..., 18] encoded.

zyainfal commented 1 year ago

Sent.