njordsir closed this issue 4 years ago
You can try extracting VGG features from a fixed input image using both stylegan-encoder and your own PyTorch version, to check whether the two give the same output.
Also, does the loss descend normally during the optimization procedure?
Original:
Learnt and generated with stylegan-encoder:
Learnt and generated with code above:
The loss does decrease, but it plateaus early. The comparison above uses the SGD optimizer with a learning rate of 1. Other optimizers and learning rates give similar or worse results.
Maybe this has something to do with differences between the optimizer implementations in PyTorch and TensorFlow/Keras, and it is just a matter of finding the right hyperparameters, but I have had no luck so far.
The loss values in the top and bottom figures are clearly different. Can you test whether the TensorFlow and PyTorch versions of the VGG model give the same response to the same image? I suggest taking this test as the first step of debugging.
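A minimal sketch of that first debugging step, assuming the features from each framework have already been exported as NumPy arrays for the same input image. The function name `compare_features` and the layout assumption (TensorFlow/Keras features in NHWC, PyTorch features in NCHW) are illustrative and not part of either codebase. If the two VGG models do not load the same weights, only rough agreement can be expected.

```python
import numpy as np


def compare_features(feat_nhwc, feat_nchw):
    """Compare a TF/Keras-style NHWC feature map with a PyTorch-style NCHW one.

    Both arguments are assumed to be array-likes produced from the SAME image.
    """
    a = np.asarray(feat_nhwc, dtype=np.float64)
    # Bring the PyTorch-style tensor into NHWC so the two line up element-wise.
    b = np.transpose(np.asarray(feat_nchw, dtype=np.float64), (0, 2, 3, 1))
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")

    diff = np.abs(a - b)
    norm_product = np.linalg.norm(a.ravel()) * np.linalg.norm(b.ravel())
    cosine = float(a.ravel() @ b.ravel() / norm_product) if norm_product > 0 else float("nan")
    return {
        "max_abs_diff": float(diff.max()),
        "mean_abs_diff": float(diff.mean()),
        "cosine": cosine,
    }
```

A large `max_abs_diff` together with a low `cosine` would point at the feature extractor (weights, layer cut, or preprocessing) rather than at the optimizer.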
We will support inversion in a future version soon. Closing this issue for now.
Hi @ShenYujun - is there any indication as to when the inversion function will be made public? We await it with anticipation!
@Voyz Yes, the code will be public for sure. For now, we still have some work in submission, but a more powerful GAN-related toolkit is coming soon!!
@ShenYujun That's absolutely wonderful news, thanks! Out of interest, would you be able to give an approximate release date?
@Voyz We may release the code in March. Thanks for your interest and patience.
@ShenYujun Thank you, appreciate the reply. We truly admire your work, massive kudos for what you've achieved so far! Looking forward to seeing more!
I am trying to derive latent encodings for custom faces, as done in https://github.com/Puzer/stylegan-encoder.
Here are the details after porting the same to pytorch:
print(m_vgg)
As done by Puzer, I select the [conv->conv->pool->conv->conv->pool->conv->conv->conv] section of the vgg network for feature extraction.
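One detail worth checking in this selection is where the slice ends. The sketch below assumes the PyTorch model follows the common flat layout in which every convolution is followed by a separate ReLU module (as in torchvision's non-batch-norm VGG16), and it assumes the Keras reference takes the output of a conv layer with the activation built in. Under those assumptions, a slice that stops right after the last conv would return pre-ReLU features, while the Keras side returns post-ReLU features. The helper names `flat_layers` and `cut_index` are hypothetical.

```python
# Standard VGG16 convolutional configuration: numbers are conv output
# channels, "M" marks a 2x2 max-pool.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def flat_layers(cfg):
    """Expand the config into a flat list with one ReLU entry after each conv."""
    layers = []
    for v in cfg:
        if v == "M":
            layers.append("pool")
        else:
            layers.extend(["conv", "relu"])
    return layers


def cut_index(layers, n_convs, include_relu=True):
    """Return the slice end that keeps the first n_convs convolutions.

    With include_relu=True the ReLU that follows the last kept conv is kept too.
    """
    seen = 0
    for i, name in enumerate(layers):
        if name == "conv":
            seen += 1
            if seen == n_convs:
                return i + 2 if include_relu else i + 1
    raise ValueError("config has fewer convolutions than requested")


layers = flat_layers(VGG16_CFG)
print(cut_index(layers, 7))                      # 16: ends after the 7th conv's ReLU
print(cut_index(layers, 7, include_relu=False))  # 15: ends on the raw conv output
```

With this layout, the seven-conv section described above corresponds to the first 16 modules if the final ReLU is meant to be included, and to the first 15 if it is not.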
Pre-computing the features for the reference image:
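Input preprocessing is another thing worth checking at this step, since the two ecosystems use different conventions. The sketch below is an illustration, not the code from this post. It assumes the Keras side uses the VGG16 `preprocess_input` convention (RGB to BGR, ImageNet mean subtraction, no scaling) and the PyTorch side uses torchvision-style normalization (scale to [0, 1], then per-channel mean and standard deviation). Neither assumption is confirmed by the post, and both function names are hypothetical.

```python
import numpy as np

# Per-channel ImageNet means used by the Caffe-style convention (BGR order).
CAFFE_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)
# Per-channel statistics used by the torchvision convention (RGB order).
TV_MEAN_RGB = np.array([0.485, 0.456, 0.406], dtype=np.float32)
TV_STD_RGB = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess_caffe_style(img_rgb_uint8):
    """HWC uint8 RGB image -> HWC float32 BGR image, mean-subtracted, unscaled."""
    x = img_rgb_uint8.astype(np.float32)[..., ::-1]  # RGB -> BGR
    return x - CAFFE_MEAN_BGR


def preprocess_torchvision_style(img_rgb_uint8):
    """HWC uint8 RGB image -> CHW float32 RGB image, scaled and normalized."""
    x = img_rgb_uint8.astype(np.float32) / 255.0
    x = (x - TV_MEAN_RGB) / TV_STD_RGB
    return np.transpose(x, (2, 0, 1))  # HWC -> CHW
```

Each set of pretrained weights expects the convention it was trained with, so feeding a model the other convention's input can degrade the features even when the architecture matches.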
Optimization:
The latent encoding and the images generated from it are of poor quality. The results are nowhere near as crisp as Puzer's.
What I have tried:
What could be wrong:
Any help with the above would be much appreciated.