torridgristle opened this issue 3 years ago
I am not sure I understand the suggestion, as the algorithm only modifies the latent code, and does not change the network weights or layers. I think what you suggest is to feed the original latent code to some layers, and the "optimized" latent code to other layers.
Yeah, that sounds right. Either feeding some layers the original, unoptimized latent code, or lerping the optimized code back toward the original to some extent in some layers. The network weights and layers don't need to be changed at all, just the latent code fed to the network.
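Roughly what I have in mind, as a minimal sketch (assuming a `[1, 18, 512]` W+ latent as for FFHQ 1024; `w_orig`, `w_opt`, and the generator call are placeholders, not the repo's actual names):

```python
import torch

# Placeholder latents: the original and the optimized W+ code,
# shape [1, 18, 512] for an FFHQ 1024 StyleGAN2 model.
w_orig = torch.randn(1, 18, 512)
w_opt = w_orig + 0.1 * torch.randn(1, 18, 512)

# Per-layer blend factor: 1.0 keeps the optimized code, 0.0 reverts
# to the original. Here the top 6 layers are reverted entirely.
alpha = torch.ones(1, 18, 1)
alpha[:, 12:, :] = 0.0

# Latent actually fed to the synthesis network.
w_mixed = alpha * w_opt + (1.0 - alpha) * w_orig
# image = G.synthesis(w_mixed)  # exact generator call depends on the repo
```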
If I understand correctly, you suggest allowing control over which specific style codes are optimized. I think it can be implemented in two ways:

1) As @woctezuma suggested, the optimization can be applied only to a subset of style codes.
2) A different lambda can be applied to each style code separately.

I think both are good ideas, and actually, I planned to try them anyway :) I hope to add them soon. A rough sketch of what I mean is below.
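Sketching option 2 under the assumption that the latent is a W+ tensor of shape `[1, 18, 512]`; `lambdas`, `weighted_l2`, and `clip_loss` are placeholder names here, not the current code:

```python
import torch

w_orig = torch.randn(1, 18, 512)          # starting W+ latent
w = w_orig.clone().requires_grad_(True)   # latent being optimized

# Option 2: a separate L2 weight for each of the 18 style codes.
lambdas = torch.full((1, 18, 1), 0.008)
lambdas[:, 12:, :] = 0.1                  # penalize the top layers more

def weighted_l2(w, w_orig, lambdas):
    """Per-layer weighted L2 distance from the starting latent."""
    return (lambdas * (w - w_orig) ** 2).sum()

# Option 1 is the limiting case: masking a layer's gradient (or giving
# it an effectively infinite lambda) restricts the optimization to the
# remaining subset of style codes.
# total_loss = clip_loss + weighted_l2(w, w_orig, lambdas)
```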
An issue that I'm encountering quite a lot with free_generation is wacky colors. For FFHQ 1024, the top 6 of the 18 layers are basically just global color manipulation, judging from my experience with similar 1024px portrait StyleGAN2 models, not FFHQ exactly but whatever modification of it Artbreeder uses, with more digital paintings and such in it.
If the topmost layers (or perhaps a user-set list of layers) could be frozen, or given their own l2_lambda to limit how far they can diverge from the starting position (or how quickly they can diverge, rather than total distance), perhaps the colors would appear more natural and pleasing. It would also force StyleCLIP to make changes to the facial content itself rather than wasting steps on things like white-balance adjustments, which may be completely unwanted.
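One way I could imagine freezing them, purely as an illustrative sketch (the gradient mask and the layer split are my assumptions, not StyleCLIP's actual code):

```python
import torch

w_orig = torch.randn(1, 18, 512)
w = w_orig.clone().requires_grad_(True)

# Zero the gradient on the top 6 (color-controlling) layers so they
# never move from the starting position.
grad_mask = torch.ones(1, 18, 1)
grad_mask[:, 12:, :] = 0.0
w.register_hook(lambda g: g * grad_mask)

optimizer = torch.optim.Adam([w], lr=0.1)
# In each step:
#   loss = clip_loss(G.synthesis(w), text) + l2_lambda * ((w - w_orig) ** 2).sum()
#   optimizer.zero_grad(); loss.backward(); optimizer.step()
# Layers 12-17 receive zero gradient, so they stay frozen throughout.
```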