Why is the encoder initialized with IR_SE50?

omertov / encoder4editing

Official implementation of "Designing an Encoder for StyleGAN Image Manipulation" (SIGGRAPH 2021) https://arxiv.org/abs/2102.02766

MIT License

Why is the encoder initialized with IR_SE50? #58

Closed by hmdolatabadi 3 years ago

hmdolatabadi commented 3 years ago

Hi,

Thanks for your great work. I have a question about encoder initialization: why do you initialize it with IR_SE50? If I want to modify the encoder architecture, do I have to pre-train something similar to IR_SE50, or can I just start from a random initialization?

https://github.com/omertov/encoder4editing/blob/1f27dfc64b1f5567aea5b67018804e7daf961ef6/models/psp.py#L50

Thanks in advance for your help. I appreciate it a lot.

Sincerely.

omertov commented 3 years ago

Hi @hmdolatabadi! We followed pSp on this one. The motivation is that a pretrained face recognition network extracts meaningful features to use as input for predicting the style codes, especially in the faces (FFHQ) domain. This reasoning is less intuitive for non-face domains, but we have not looked into the issue further.

I would suggest trying a random initialization, which I believe will work, although I also believe that a pretrained network will converge faster.

Best, Omer
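
For the modified-architecture case raised in the question, one possible middle ground between full pretraining and a fully random start is to reuse only the pretrained weights whose names and shapes still match the new encoder. This is not something the authors describe; it is a minimal sketch under assumptions. The helper `compatible_weights`, the `fake` stand-in, and all layer names and shapes below are hypothetical, and the helper works on any mapping whose values expose a `.shape` attribute (for example a PyTorch state dict).

```python
from types import SimpleNamespace


def compatible_weights(pretrained, target):
    """Return the entries of `pretrained` whose name and shape also occur in `target`.

    Both arguments map parameter names to objects with a `.shape` attribute.
    """
    kept = {}
    for name, value in pretrained.items():
        if name in target and tuple(value.shape) == tuple(target[name].shape):
            kept[name] = value
    return kept


def fake(*shape):
    # Toy stand-in for a tensor; only the shape matters for this sketch.
    return SimpleNamespace(shape=shape)


# Hypothetical checkpoint contents (names and shapes are made up).
pretrained = {
    "input_layer.0.weight": fake(64, 3, 3, 3),
    "body.0.weight": fake(64, 64, 3, 3),
    "output_layer.weight": fake(512, 25088),
}

# Hypothetical parameters of a modified encoder.
modified_encoder = {
    "input_layer.0.weight": fake(64, 3, 3, 3),  # unchanged layer
    "body.0.weight": fake(128, 64, 3, 3),       # widened layer, shape differs
    "styles.0.weight": fake(512, 512),          # new layer, absent from checkpoint
}

reusable = compatible_weights(pretrained, modified_encoder)
print(sorted(reusable))
```

With PyTorch, the target mapping would be `encoder.state_dict()`, and the filtered result could be passed to `encoder.load_state_dict(reusable, strict=False)`. Parameters that are not covered keep their default random initialization. The shape filter matters because `strict=False` tolerates missing or unexpected keys but still raises an error on shape mismatches.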