tencent-ailab / IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.
Apache License 2.0

Small things to note when using FaceIDplus #430

Open Sooplex opened 3 weeks ago

Sooplex commented 3 weeks ago

The output image of FaceAnalysis is a BGR image, but the code in IP-Adapter-FaceID and the Hugging Face docs both use it as the input for the CLIP image encoder, which should receive an RGB image. This may lead to problems such as unexpected blue hair.
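
The channel mismatch described above can be illustrated with a minimal sketch that assumes only NumPy. The helper name `bgr_to_rgb` is hypothetical and is not part of IP-Adapter or insightface; it reverses the last axis of an H x W x 3 array, which is the conversion the face image would need before it reaches the CLIP image encoder.

```python
import numpy as np

def bgr_to_rgb(image_bgr: np.ndarray) -> np.ndarray:
    """Return a copy of an H x W x 3 image with the channel order reversed."""
    if image_bgr.ndim != 3 or image_bgr.shape[2] != 3:
        raise ValueError("expected an H x W x 3 array")
    # Reversing the last axis swaps the first and third channels (B <-> R).
    return np.ascontiguousarray(image_bgr[:, :, ::-1])

# A 2 x 2 pure-red image stored in BGR order: red sits in the last channel.
red_bgr = np.zeros((2, 2, 3), dtype=np.uint8)
red_bgr[:, :, 2] = 255

# After conversion, red sits in the first channel, as an RGB consumer expects.
red_rgb = bgr_to_rgb(red_bgr)
```

If `red_bgr` were handed to an RGB consumer without conversion, it would be read as pure blue, which is consistent with the blue-hair symptom. When OpenCV is already imported, `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` performs the same swap.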

alexblattner commented 1 week ago

@Sooplex you are correct. This works in general though:

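import cv2
import numpy as np
import torch
from insightface.utils import face_align

# Assumed context: `image` is an RGB image (e.g. PIL) and `IF` is an insightface FaceAnalysis instance.
ref_images_embeds = []
ip_adapter_images = []
# Convert RGB to BGR before face detection.
innerip = cv2.cvtColor(np.array(image), cv2.COLOR_RGB2BGR)
faces = IF.get(innerip)

# The aligned 224x224 crop is taken from the BGR array, so it is still BGR at this point.
ip_adapter_images.append(face_align.norm_crop(innerip, landmark=faces[0].kps, image_size=224))
innerip = torch.from_numpy(faces[0].normed_embedding)
ref_images_embeds.append(innerip.unsqueeze(0))
ref_images_embeds = torch.stack(ref_images_embeds, dim=0).unsqueeze(0)
# A zero embedding serves as the negative counterpart and is concatenated in front.
neg_ref_images_embeds = torch.zeros_like(ref_images_embeds)
id_embeds = torch.cat([neg_ref_images_embeds, ref_images_embeds]).to(dtype=torch.float16, device="cuda")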
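
The tensor shapes produced by the last few lines can be traced with a small sketch. It mirrors the `unsqueeze`, `stack`, `zeros_like`, and `cat` calls with NumPy equivalents so it runs without a GPU or a face model. The embedding length of 512 and the random stand-in values are assumptions for illustration, not output from insightface.

```python
import numpy as np

EMBED_DIM = 512  # assumed length of the face embedding

# Stand-in for faces[0].normed_embedding (random values, illustration only).
rng = np.random.default_rng(0)
embedding = rng.standard_normal(EMBED_DIM).astype(np.float32)

# unsqueeze(0) on the embedding: (512,) -> (1, 512)
ref_images_embeds = [embedding[np.newaxis, :]]

# stack(dim=0) over one reference image: (1, 1, 512), then unsqueeze(0): (1, 1, 1, 512)
stacked = np.stack(ref_images_embeds, axis=0)[np.newaxis, ...]

# zeros_like gives the negative embedding; cat along the first axis: (2, 1, 1, 512)
negative = np.zeros_like(stacked)
id_embeds = np.concatenate([negative, stacked], axis=0)
```

The first entry along the leading axis is the all-zero negative embedding and the second is the reference embedding, matching the order in the `torch.cat` call.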