Closed: GLivshits closed this issue 3 years ago
To get a better idea of your configuration, I have several questions:
1) Images from GAN. Command: `python scripts/train_restyle_psp.py --encoder_type 'BackboneEncoder' --input_nc 2 --output_size 256 --learning_rate 0.0002 --batch_size 6 --lpips_lambda 0.8 --id_lambda 0.8 --l2_lambda 4 --w_norm_lambda 0.001`
2) I've trained a lucidrains StyleGAN2 version and am using it now. I adapted its code to your ReStyle code; it generates the same images as before the rework.
3) I see an improvement only for maybe the first 500 iterations; after that the loss just fluctuates at the same high level.
4) For the grayscale adaptation, I added the encoder from https://arxiv.org/pdf/2104.07661.pdf, added a discriminator loss with a `discr_lambda` weight, and made some modifications to the visualization (the input handling is sketched below).
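For reference, here is a minimal sketch of what the grayscale input handling above amounts to, assuming ReStyle's convention of concatenating the previous reconstruction to the input along the channel axis (so 1 + 1 channels gives `--input_nc 2`); the function and argument names are hypothetical:

```python
import torch

def build_encoder_input(x, y_hat, avg_image):
    # x: (B, 1, H, W) grayscale input; y_hat: previous reconstruction or None.
    if y_hat is None:
        # On the first refinement step, the average training image stands in
        # for the previous reconstruction, following the ReStyle scheme.
        y_hat = avg_image.unsqueeze(0).repeat(x.shape[0], 1, 1, 1)
    # Concatenate along channels: (B, 2, H, W), matching --input_nc 2.
    return torch.cat([x, y_hat], dim=1)
```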
So a couple of things:

First, you're currently using the `ir_se50` model as the encoder backbone. Basically, in all of our other domains we use a ResNet34 backbone pre-trained on ImageNet, which I believe is what you want in your case (a rough sketch follows).
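A minimal sketch of that swap, assuming torchvision's ImageNet-pretrained `resnet34`; the class name and the 2-channel stem replacement are illustrative, not necessarily the repo's exact encoder implementation:

```python
import torch.nn as nn
from torchvision.models import resnet34

class ResNet34Backbone(nn.Module):
    def __init__(self, input_nc=2):
        super().__init__()
        net = resnet34(pretrained=True)  # ImageNet weights
        # Replace the stem so it accepts input_nc channels instead of 3
        # (the new conv is randomly initialized; the rest stays pretrained).
        net.conv1 = nn.Conv2d(input_nc, 64, kernel_size=7, stride=2,
                              padding=3, bias=False)
        # Keep everything up to global pooling as a feature extractor.
        self.body = nn.Sequential(*list(net.children())[:-2])

    def forward(self, x):
        return self.body(x)  # (B, 512, H/32, W/32) feature maps
```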
Second, you're using the `id_loss`, but this is designed specifically for faces. You should instead use the MoCo-based loss (sketched below). More details on setting up the proper parameters can be found here: https://github.com/yuval-alaluf/restyle-encoder#additional-notes
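For context, a hedged sketch of what a MoCo-based similarity loss looks like: compare the generated and target images in the feature space of a self-supervised ResNet-50 via cosine similarity. The checkpoint path and the key-prefix handling are assumptions about standard MoCo-v2 checkpoints, not the repo's verbatim implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50

class MocoSimilarityLoss(nn.Module):
    def __init__(self, ckpt_path="pretrained/moco_v2.pth"):  # hypothetical path
        super().__init__()
        backbone = resnet50()
        state = torch.load(ckpt_path, map_location="cpu")
        # MoCo checkpoints typically store the encoder under 'state_dict'
        # with a 'module.encoder_q.' prefix; strip it before loading.
        state = {k.replace("module.encoder_q.", ""): v
                 for k, v in state.get("state_dict", state).items()}
        backbone.load_state_dict(state, strict=False)
        backbone.fc = nn.Identity()
        self.backbone = backbone.eval()
        for p in self.backbone.parameters():
            p.requires_grad = False

    def forward(self, y_hat, y):
        # ResNet expects 3 channels; repeat grayscale inputs.
        if y_hat.shape[1] == 1:
            y_hat, y = y_hat.repeat(1, 3, 1, 1), y.repeat(1, 3, 1, 1)
        f_hat = F.normalize(self.backbone(y_hat), dim=1)
        f = F.normalize(self.backbone(y), dim=1)
        return (1.0 - (f_hat * f).sum(dim=1)).mean()  # 1 - cosine similarity
```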
Third, there's the `w_norm_loss`. You should set its `lambda` value to 0 (a revised command reflecting all three points is sketched below).

Hope this helps. I do believe that you should be able to get pretty good results on this domain. The challenge is capturing the finer details in an accurate reconstruction, but you should be able to get close.
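Putting the three suggestions together, the training command would look roughly like the following; `ResNetBackboneEncoder` and the `--moco_lambda` flag and value are my reading of the repo's options (see the additional-notes link above), so double-check the exact names against the training script:

```
python scripts/train_restyle_psp.py \
  --encoder_type ResNetBackboneEncoder \
  --input_nc 2 --output_size 256 \
  --learning_rate 0.0002 --batch_size 6 \
  --lpips_lambda 0.8 --l2_lambda 4 \
  --id_lambda 0 --moco_lambda 0.5 \
  --w_norm_lambda 0
```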
So, I've found out that regularizing the W vector is very important, because under many additive refinements its norm just blows up (a sketch of the regularizer is below). I'm using a network for fingerprint recognition instead of the face network. But the thing is that the pattern of the generated images does not change much with more iterations; the network just tries to modify the overall shape in order to minimize the L2 loss.
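A minimal sketch of the regularizer being discussed, assuming the generator's mean latent is available (the name `latent_avg` is illustrative): penalizing the distance from the average W keeps the norm from blowing up as the additive refinement steps accumulate.

```python
import torch
import torch.nn as nn

class WNormLoss(nn.Module):
    def __init__(self, latent_avg):
        super().__init__()
        # latent_avg: (512,) mean W of the generator, e.g. from w_avg tracking.
        self.register_buffer("latent_avg", latent_avg)

    def forward(self, latents):
        # latents: (B, n_styles, 512). Mean squared distance from the average
        # W; weighted by --w_norm_lambda in the total objective.
        return ((latents - self.latent_avg) ** 2).sum(dim=(1, 2)).mean()
```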
Hello again. I'm trying your code (except that I've chosen the lucidrains GAN) to invert fingerprints (a toy project, my first working GAN, publicly available data). The GAN works nicely, but with your code I only get the shape of the fingerprint right, whereas the pattern is completely unnatural. The maximum number of iterations I've tried is 25k. I use single-channel images (with slight modifications to your code). Example attached. What can you suggest to improve the quality? I use L2, the discriminator from the GAN, and LPIPS, combined roughly as sketched below.
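For completeness, a rough sketch of that combined objective (L2 + LPIPS + a non-saturating adversarial term from the GAN's discriminator); the lambda values are placeholders, and `lpips` refers to the pip-installable perceptual-loss package:

```python
import torch.nn.functional as F
import lpips

lpips_fn = lpips.LPIPS(net="alex")  # perceptual loss backbone

def total_loss(y_hat, y, d_logits_fake,
               l2_lambda=4.0, lpips_lambda=0.8, adv_lambda=0.1):
    # Assumes single-channel images; LPIPS expects 3 channels, so repeat.
    y_hat3, y3 = y_hat.repeat(1, 3, 1, 1), y.repeat(1, 3, 1, 1)
    loss = l2_lambda * F.mse_loss(y_hat, y)
    loss = loss + lpips_lambda * lpips_fn(y_hat3, y3).mean()
    # Non-saturating generator loss on the discriminator's logits.
    loss = loss + adv_lambda * F.softplus(-d_logits_fake).mean()
    return loss
```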