It seems the Face-ID adapter is initialized with random weights. Does that mean that, before any training, the model is NOT identical to the base T2I model and could generate artifacts? Have you tried zero-initializing the vision-control branch so that it does not affect T2I generation at the start of training (similar to the zero convolutions in ControlNet)? Thanks!
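To make the question concrete, here is a toy sketch in plain Python of what I mean. It is not the repository's code: `adapted_output`, `W_face`, and the vector sizes are hypothetical stand-ins, and the real adapter adds its signal through extra cross-attention layers rather than a single matrix. The only point is that a zero-initialized final projection leaves the T2I output unchanged at step 0, while a randomly initialized one perturbs it.

```python
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def random_matrix(rows, cols, rng):
    """Small Gaussian init, standing in for a default random initialization."""
    return [[rng.gauss(0.0, 0.02) for _ in range(cols)] for _ in range(rows)]


def zero_matrix(rows, cols):
    """Zero init, analogous to ControlNet's zero convolutions."""
    return [[0.0] * cols for _ in range(rows)]


def adapted_output(text_out, face_feat, W_face, scale=1.0):
    """Hypothetical adapter branch: add a projected face feature to the
    frozen T2I output (a stand-in for the extra cross-attention path)."""
    face_out = matvec(W_face, face_feat)
    return [t + scale * f for t, f in zip(text_out, face_out)]


rng = random.Random(0)
dim = 4
text_out = [rng.gauss(0.0, 1.0) for _ in range(dim)]   # frozen T2I branch output
face_feat = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # face embedding feature

out_random = adapted_output(text_out, face_feat, random_matrix(dim, dim, rng))
out_zero = adapted_output(text_out, face_feat, zero_matrix(dim, dim))

print("zero init matches T2I output:  ", out_zero == text_out)
print("random init matches T2I output:", out_random == text_out)
```

Note that in this toy the gradient of the output with respect to `W_face` depends on `face_feat`, not on `W_face` itself, so a zero-initialized final projection can still receive a training signal. That only holds if the zeros are limited to the last projection; zeroing every layer of the branch would also zero the gradients reaching the earlier layers.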