Closed: rafaelbou closed this issue 3 years ago
Hi @rafaelbou ,
I'm happy to hear you've enjoyed our work!
What you came across is interesting, as we didn't encounter anything like this during our training.
I see that you've used your own StyleGAN at a resolution of 256x256, but since you say the outputs of your StyleGAN are sharp, let's assume it's not because of the lower-resolution StyleGAN.
Other than that, I can see a few potential causes for the effect you're seeing:
- Setting `start_from_latent_avg` to `False`. We found that starting from the average latent code results in better initialization and better convergence, so try setting this flag to `True` and see if it helps.
- Setting `id_lambda` to `0`. If you're working on face images, we found that this loss is very significant for getting high-quality inversion results.

If you wish, feel free to send an example of the artifacts you're seeing; maybe this can help us better understand the problem.
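To illustrate the first point, here is a minimal sketch (toy shapes, with NumPy standing in for PyTorch) of what `start_from_latent_avg` does in pSp's forward pass: the encoder's predicted codes are treated as offsets from the average latent code.

```python
import numpy as np

# Toy shapes: batch of 2, 14 style layers (a 256x256 StyleGAN2), 512-dim codes.
batch, n_styles, dim = 2, 14, 512
codes = np.zeros((batch, n_styles, dim))   # stand-in for the encoder output
latent_avg = np.ones((n_styles, dim))      # stand-in for the average W+ code

# With start_from_latent_avg=True, pSp offsets the predicted codes by the
# average latent (np.tile here mirrors torch's latent_avg.repeat(batch, 1, 1)):
codes = codes + np.tile(latent_avg, (batch, 1, 1))
print(codes.shape)  # (2, 14, 512)
```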
Thanks for your answer.
As for the resolution of StyleGAN2, you are right, my wording was not clear enough. The images produced by pSp come out blurry only relative to the output of StyleGAN2 itself. This shows up as an absence of high-frequency detail.
**Setting `start_from_latent_avg` to `True`**
When I set `start_from_latent_avg` to `True`, I get the error:
```
File "/pixel2style2pixel/models/psp.py", line 77, in forward
    codes = codes + self.latent_avg.repeat(codes.shape[0], 1, 1)
AttributeError: 'NoneType' object has no attribute 'repeat'
```
The source of the error is in the function `__load_latent_avg`:
```python
def __load_latent_avg(self, ckpt, repeat=None):
    if 'latent_avg' in ckpt:
        self.latent_avg = ckpt['latent_avg'].to(self.opts.device)
        if repeat is not None:
            self.latent_avg = self.latent_avg.repeat(repeat, 1)
    else:
        self.latent_avg = None
```
The ckpt I use for the StyleGAN2 generator doesn't include a `latent_avg` parameter, so the function sets `self.latent_avg = None`.
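If a nonzero average latent is still wanted despite the checkpoint lacking `latent_avg`, one workaround (a sketch, not the repo's code) is to estimate it yourself: map many random z vectors through the generator's mapping network and average the results. The `mapping_fn` below is a hypothetical stand-in for that mapping network; in practice you would pass the real one and feed it torch tensors.

```python
import numpy as np

def estimate_latent_avg(mapping_fn, n_samples=2000, z_dim=512, seed=0):
    """Estimate the average W latent by averaging the mapped codes of
    many random z samples drawn from a standard normal."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_samples, z_dim))
    return mapping_fn(z).mean(axis=0)

# Stand-in mapping network (the real one is a multi-layer MLP); identity here,
# so the estimate should be close to the zero vector.
latent_avg = estimate_latent_avg(lambda z: z, n_samples=2000)
print(latent_avg.shape)  # (512,)
```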
**ID loss**
I'm working in the fingerprint domain, so this loss function does not apply to my case. I will try to follow your motivation in the paper and devise an analogous function for my domain.
**Output examples**
pSp output images (left: original input, right: pSp output, 115K steps):
Stylegan2 output:
Thanks again, Rafael.
Interesting! Thanks for the clarifications. Indeed, the error with the average latent code occurs because it's not saved in your StyleGAN generator checkpoint. However, I don't think the average latent code is very meaningful in your domain. It seems that pSp is able to capture most of the details of the input image, but struggles when it comes to preserving the fine details.
> I will try to track your motivation in the paper and produce such a function for my domain.
I believe that incorporating a loss function that explicitly handles the preservation of fine details will provide the most improvement in your case.
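One simple option along these lines (a sketch, not something from the repo) is an image-gradient L1 term, which directly penalizes lost high-frequency content by comparing the spatial derivatives of the output and the target:

```python
import numpy as np

def gradient_loss(pred, target):
    """L1 loss on horizontal and vertical image gradients: a simple
    high-frequency term that penalizes missing fine detail."""
    dx_p, dx_t = np.diff(pred, axis=-1), np.diff(target, axis=-1)
    dy_p, dy_t = np.diff(pred, axis=-2), np.diff(target, axis=-2)
    return np.abs(dx_p - dx_t).mean() + np.abs(dy_p - dy_t).mean()

# A blurry prediction loses high-frequency content, so this loss is large:
target = np.indices((8, 8)).sum(0) % 2.0   # checkerboard: all high frequency
blurry = np.full((8, 8), target.mean())    # constant image: no detail
print(gradient_loss(blurry, target))       # 2.0
```

In a pSp-style training loop this would be added to the total loss with its own weight, analogous to `id_lambda` for the identity term.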
Do you have any idea why the checkerboard effect appears? Especially given that the StyleGAN2 output shows no such phenomenon.
Good question. I'm not sure what exactly could be causing this, but if I had to guess, it is because the coarse feature maps output by our network are of size 16x16. These feature maps are then passed to the map2style block, which down-samples each to a vector of size 512.
I am not sure exactly why this would cause the checkerboard effect, since we did not see anything like it in any of our experiments, but I hope that a loss that addresses the fine details can also address this.
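For reference, the down-sampling path described above can be sketched purely in terms of shapes. This just counts the stride-2 halvings from the 16x16 coarse map down to 1x1 before flattening to a 512-dim style vector; the exact layer structure is an assumption, not the repo's code verbatim.

```python
# A 16x16 coarse feature map is reduced to 1x1 spatial size by repeated
# stride-2 convolutions, then flattened into a 512-dim style vector.
spatial, steps = 16, 0
while spatial > 1:
    spatial //= 2   # each stride-2 convolution halves the resolution
    steps += 1
print(steps)  # 4 halvings: 16 -> 8 -> 4 -> 2 -> 1
```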
Hi @yuval-alaluf, thanks for sharing (and maintaining) your great work!
After training the model on my own dataset, I came across a situation where the output of the model has unwanted characteristics:
For clarification: after training the StyleGAN2 model (on which this training is based), its results come out sharp and without the checkerboard effect described above.
Have you experienced any of these things during training? Do you have any idea about the factors that could affect the output in this way? Maybe playing with the weights of the loss function?
**More information**
Training opts:
Training metrics:
Test metrics: