What parameters need to be adjusted for training IP-Adapter-FaceID-PlusV2 compared to IP-Adapter-FaceID-Plus?

tencent-ailab / IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt.

Apache License 2.0

What parameters need to be adjusted for training IP-Adapter-FaceID-PlusV2 compared to IP-Adapter-FaceID-Plus? #437

Open SpadgerBoy opened 4 weeks ago

SpadgerBoy commented 4 weeks ago

I have finished modifying the IP-Adapter-FaceID-Plus training code, and I would now like to train IP-Adapter-FaceID-PlusV2. Note that V2 has an additional step, 'out = x + scale * out', compared to V1. What adjustments do I need to make when training V2?
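For reference, the extra step mentioned above can be written as a small PyTorch helper. This is only a sketch: the function name `apply_shortcut` and the `shortcut` flag are hypothetical, and it assumes `x` and `out` are tensors of the same shape (for example, the projected face-ID tokens and the output of the image-feature branch). It is not taken from the repository's training code.

```python
import torch


def apply_shortcut(
    x: torch.Tensor,
    out: torch.Tensor,
    scale: float = 1.0,
    shortcut: bool = True,
) -> torch.Tensor:
    """Hypothetical helper illustrating the V2-style step `out = x + scale * out`.

    Assumption: `x` and `out` have identical shapes, e.g. (batch, tokens, dim).
    """
    if shortcut:
        # V2-style: add the scaled branch output back onto x.
        return x + scale * out
    # V1-style: use the branch output unchanged.
    return out
```

With `scale=0.0` the result reduces to `x` alone, and with `shortcut=False` the helper returns `out` unchanged, which matches the V1 behaviour described in the question.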

SpadgerBoy commented 4 weeks ago

I have also noticed this wiki page: https://github.com/tencent-ailab/IP-Adapter/wiki/IP%E2%80%90Adapter%E2%80%90Face

It states: "At the same time, during the training process, we dropped 50% of the CLIP embedding. Specifically, we used drop path training strategy." How should this be implemented in the training code?
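One possible reading of the quoted sentence is a per-sample drop of the CLIP-dependent branch inside the 'out = x + scale * out' step, with a drop probability of 0.5 during training. The sketch below illustrates that reading only. The function name `drop_path_shortcut` and its parameters are hypothetical, whether the authors rescale the kept samples is not stated in the wiki, and this version does not rescale.

```python
import torch


def drop_path_shortcut(
    x: torch.Tensor,
    out: torch.Tensor,
    drop_prob: float = 0.5,
    scale: float = 1.0,
    training: bool = True,
) -> torch.Tensor:
    """Hypothetical drop-path variant of `out = x + scale * out`.

    Assumption: `out` is the CLIP-dependent branch, and it is zeroed for a
    whole sample with probability `drop_prob` during training.
    """
    if not training or drop_prob == 0.0:
        # Inference, or dropping disabled: always keep the branch.
        return x + scale * out

    # One random draw per sample, broadcast over all remaining dimensions.
    mask_shape = (x.shape[0],) + (1,) * (x.dim() - 1)
    keep = (torch.rand(mask_shape, device=x.device) >= drop_prob).to(x.dtype)

    # Dropped samples fall back to x alone; kept samples get the full sum.
    return x + scale * keep * out
```

In a training loop this would be called with `training=True`, so that on average half of the samples in a batch see only `x`. At inference time `training=False` keeps the branch for every sample.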