tencent-ailab / IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.
Apache License 2.0
4.48k stars 293 forks

Question about training IP-Adapter-FaceID-PlusV2 #328

Closed vim-hjk closed 4 weeks ago

vim-hjk commented 3 months ago

https://github.com/tencent-ailab/IP-Adapter/wiki/IP%E2%80%90Adapter%E2%80%90Face

According to the link above, it is important for training to crop the image at different ratios around the face, such as full body or upper body. However, the example code uses only a crop aligned tightly around the face (InsightFace norm_crop).

My question is: the ID embeddings are obtained by applying the InsightFace norm_crop method to face images, so do you feed face images cropped at various ratios in the actual IP-Adapter training?

masaisai111 commented 2 months ago

Where did you read about this training, and why is it not visible on the author's GitHub?

xiaohu2015 commented 2 months ago

Hi, the ID embeddings are precalculated offline (using the InsightFace norm_crop method on face images).
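To illustrate the distinction discussed in this thread, here is a minimal sketch of how a training image crop could be varied around a face while the ID embedding is computed separately and offline from the norm_crop-aligned face. This is not the authors' training code: the function names `expand_face_box` and `sample_training_crop` are hypothetical, the scale values are assumptions, and the InsightFace embedding step is not shown.

```python
import random


def expand_face_box(box, image_size, scale):
    """Grow a face box (x1, y1, x2, y2) about its center by `scale`,
    clamping the result to the image bounds (width, height)."""
    x1, y1, x2, y2 = box
    width, height = image_size
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * scale / 2.0
    half_h = (y2 - y1) * scale / 2.0
    left = max(0, int(round(cx - half_w)))
    top = max(0, int(round(cy - half_h)))
    right = min(width, int(round(cx + half_w)))
    bottom = min(height, int(round(cy + half_h)))
    return left, top, right, bottom


def sample_training_crop(box, image_size, scales=(1.0, 2.0, 4.0), rng=random):
    """Pick one crop ratio at random (assumed values: tight face, upper body,
    wider view) and return it with the matching crop box."""
    scale = rng.choice(scales)
    return scale, expand_face_box(box, image_size, scale)


if __name__ == "__main__":
    face_box = (40, 40, 60, 60)  # hypothetical detector output
    print(sample_training_crop(face_box, (100, 100)))
```

Under this assumption, the returned box would be used to crop the training image (for example with Pillow's `Image.crop`), while the precalculated ID embedding for the same identity stays unchanged regardless of the sampled ratio.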

vim-hjk commented 4 weeks ago

@masaisai111 You can find it on the Wiki tab of this repository.

vim-hjk commented 4 weeks ago

Thank you for your answer