foivospar / Arc2Face

[ECCV 2024🔥] Arc2Face: A Foundation Model of Human Faces
https://arc2face.github.io/
MIT License

Ablation study on training with the unet frozen #12

Open trideeprath opened 2 months ago

trideeprath commented 2 months ago

Has there been any ablation study to understand the impact of freezing the unet and training only the clip encoder?

My understanding is that if the unet is kept fixed, this technique could be applied to other merged or LoRA-based unets.

foivospar commented 2 months ago

Hi, the core module that needs tuning is the clip encoder, as it needs to be adapted to ID-embeddings. However, to achieve the best results, we fine-tuned both the unet and the clip encoder. I assume that some pre-trained LoRAs may still be compatible with our unet (as is the LCM-LoRA, for example), but, in general, styled LoRAs that include both unet and clip weights are less likely to work, as our encoder is significantly altered from the original.

trideeprath commented 2 months ago

Is it possible to release a clip encoder trained with the unet frozen, so that we can understand the difference and compare with approaches like ip-adapter that keep the unet frozen?
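
For anyone who wants to try this configuration themselves, here is a minimal PyTorch sketch of the frozen-unet setup discussed above. It is not the Arc2Face training code: `unet` and `text_encoder` are tiny hypothetical stand-in modules, and the MSE loss and random tensors are placeholders for the real diffusion objective and ID-embeddings. It only illustrates the mechanics of freezing one module while gradients still flow through it to the trainable encoder.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-ins: in a real run these would be the pre-trained
# diffusion unet and the clip encoder; tiny layers keep the sketch runnable.
unet = nn.Sequential(nn.Linear(8, 8), nn.SiLU(), nn.Linear(8, 8))
text_encoder = nn.Sequential(nn.Linear(8, 8), nn.GELU(), nn.Linear(8, 8))

# Freeze the unet: no gradients are stored for its weights.
unet.requires_grad_(False)
unet.eval()

# Only the encoder is trainable.
text_encoder.requires_grad_(True)
text_encoder.train()

# Hand only the trainable (encoder) parameters to the optimizer.
trainable_params = [p for p in text_encoder.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable_params, lr=1e-3)

# Snapshot the weights so we can check what actually changed.
unet_before = [p.detach().clone() for p in unet.parameters()]
encoder_before = [p.detach().clone() for p in text_encoder.parameters()]

# One toy training step (placeholder data and loss, assumed for illustration).
id_embedding = torch.randn(4, 8)  # stands in for a batch of ID-embeddings
target = torch.randn(4, 8)        # stands in for the training target
loss = nn.functional.mse_loss(unet(text_encoder(id_embedding)), target)

optimizer.zero_grad()
loss.backward()  # gradients pass through the frozen unet to the encoder
optimizer.step()

unet_unchanged = all(
    torch.equal(a, b) for a, b in zip(unet_before, unet.parameters())
)
encoder_changed = any(
    not torch.equal(a, b) for a, b in zip(encoder_before, text_encoder.parameters())
)
print("unet unchanged:", unet_unchanged)
print("encoder changed:", encoder_changed)
```

Because the unet weights never move in this setup, an encoder trained this way would in principle sit on top of the same base unet that merged or LoRA-based variants start from, which is the property the question is about. Whether quality holds up is exactly what the requested ablation would need to show.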