Beyond Face Rotation: Global and Local Perception GAN for Photorealistic and Identity Preserving Frontal View Synthesis

Hirokazu-Narui commented 7 years ago

https://arxiv.org/pdf/1704.04086.pdf

msrks commented 7 years ago

Reference: http://createwith.ai/paper/20170418/596

msrks commented 7 years ago

In all our experiments, we empirically set the $\alpha = 10^{-3}$, $\lambda_1 = 0.3$, $\lambda_2 = 10^{-3}$, $\lambda_3 = 3 \times 10^{-3}$ and $\lambda_4 = 10^{-4}$.

Tuning these hyperparameters must have been really hard...
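
To make the role of these five weights concrete, here is a minimal sketch of how they might combine the individual loss terms into one generator objective. The mapping of each weight to a term (λ1 to symmetry, λ2 to adversarial, λ3 to identity preserving, λ4 to total variation, α to the cross-entropy term) is an assumption based on my reading of the paper's notation and is not stated in this thread; `total_generator_loss` is a hypothetical helper name.

```python
# Weights quoted from the paper in the comment above.
ALPHA = 1e-3     # assumed: weight on the cross-entropy (classification) term
LAMBDA_1 = 0.3   # assumed: symmetry loss
LAMBDA_2 = 1e-3  # assumed: adversarial loss
LAMBDA_3 = 3e-3  # assumed: identity-preserving loss
LAMBDA_4 = 1e-4  # assumed: total variation regularization


def total_generator_loss(pixel, symmetry, adversarial, identity,
                         total_variation, cross_entropy):
    """Weighted sum of already-computed scalar loss terms (hypothetical helper)."""
    synthesis = (pixel
                 + LAMBDA_1 * symmetry
                 + LAMBDA_2 * adversarial
                 + LAMBDA_3 * identity
                 + LAMBDA_4 * total_variation)
    return synthesis + ALPHA * cross_entropy
```

With weights spanning several orders of magnitude, the pixel term dominates unless the other terms are numerically much larger, which is probably part of why the tuning was painful.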

msrks commented 7 years ago

In the global pathway, the bottleneck layer, which is the output of $G_{\theta^g_E}$, is usually used for classification task [38] with the cross-entropy loss $L_{\text{cross-entropy}}$.

So the encoder of the global network is treated as a classifier, and it classifies the profile face? Is that the right reading?

msrks commented 7 years ago

I don't quite understand how the outputs of the local pathway (the facial parts) are merged together.

msrks commented 7 years ago

The tables at the end are very helpful for understanding the network structures of the local pathway and the global pathway.

dassima commented 7 years ago

Has anyone found any code for it?

Hirokazu-Narui commented 7 years ago

Sorry, we don't have any code for it. This repository is just a kind of paper reading group.

jeromerony commented 7 years ago

I've contacted the authors of this article: they were actually presenting it at ICCV last week and plan on releasing the code soon.

dassima commented 7 years ago

This is awesome! Thanks for letting us know.

dassima commented 6 years ago

Latope2-150, can you please tell me how you contacted the authors of the article? I have some questions and I hope they can help me.

jeromerony commented 6 years ago

Hi,

I contacted them in mid-October using the email address huangrui@cmu.edu. The student working on this project told me at first that he would release the code and pretrained model in early November, and later that it would be late November or early December.

I contacted him again in early December but sadly did not receive an answer.

However, I decided to code it myself and got a somewhat working model. I don't have access to the Multi-PIE database, and the database I used was much smaller, so I didn't obtain results as impressive as theirs.


dassima commented 6 years ago

Hi, I understand, and thank you for the e-mail. Congratulations on having some results. I have tried to implement the network myself, but I didn't manage to make it work properly, for several reasons. First, the databases I have are either small or don't contain pairs of images. Second, I don't really know how to implement the loss properly. Third, I don't have much background in neural networks beyond training some GANs on MNIST, but I was ambitious enough to start with this :)).


dassima commented 6 years ago

Would you be interested in giving me some advice or maybe sharing some of your work?

jeromerony commented 6 years ago

Hi,

I can definitely share some code and answer your questions if you have any. My code is written in PyTorch and depends on a particular folder organization.


dassima commented 6 years ago

Which database did you use? How did you implement the loss? I am using Keras. It seems difficult to implement the loss exactly as it is described in the paper.

jeromerony commented 6 years ago

So for the database, I created my own using FEI Face (http://fei.edu.br/~cet/facedatabase.html) and Color FERET (https://www.nist.gov/itl/iad/image-group/color-feret-database). This was a painful and inaccurate process, as I had to use a landmark detector to find the location of the face, the eyes, the nose and the mouth.

The loss was a tricky part as well, but I imagine it is more easily implemented in PyTorch:

- The L1 distance should be fairly easy, but it is computed for the whole synthesized frontal face and for the local patches as well.
- The symmetry loss is basically an L1 distance between the right half (mirrored about the vertical axis) and the left half of the synthesized image (respectively input and target, if I remember correctly).
- The adversarial loss is the same as in a regular GAN.
- The identity-preserving loss is the tricky part, as you must use a pre-trained Light-CNN model (https://arxiv.org/abs/1511.02683) or any other DNN that can "reliably" create a representation of the face. The part where I have doubts is that in the original article they say they compute this loss between the synthesized image and the original image (with pose), but using the frontal image as reference would be much more accurate imo. Maybe I got that part wrong.
- The total variation regularization is the difference between the image and the image shifted by one pixel to the left (or right), plus the difference between the image and the image shifted by one pixel down. This page has a great explanation as well as code in PyTorch: https://towardsdatascience.com/pytorch-implementation-of-perceptual-losses-for-real-time-style-transfer-8d608e2e9902
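
The pixel, symmetry and total variation terms described in the list above can be sketched in a few lines. This is a framework-agnostic NumPy sketch rather than the PyTorch code mentioned in this thread; the function names, the (height, width[, channels]) array layout and the use of a mean instead of a sum are all assumptions for illustration, not the paper's exact normalization.

```python
import numpy as np


def pixel_l1_loss(pred, target):
    """Mean absolute difference between two images of the same shape."""
    return np.mean(np.abs(pred - target))


def symmetry_loss(pred):
    """L1 distance between the left half and the mirrored right half.

    pred has shape (height, width) or (height, width, channels).
    For an odd width the middle column is ignored.
    """
    half = pred.shape[1] // 2
    left = pred[:, :half]
    right_mirrored = pred[:, pred.shape[1] - half:][:, ::-1]  # flip along width
    return np.mean(np.abs(left - right_mirrored))


def total_variation(pred):
    """Mean absolute difference between neighbouring pixels in both directions.

    Assumes the image is at least 2 x 2 pixels.
    """
    dx = np.abs(pred[:, 1:] - pred[:, :-1])  # image vs. image shifted by one column
    dy = np.abs(pred[1:, :] - pred[:-1, :])  # image vs. image shifted by one row
    return dx.mean() + dy.mean()
```

The same pixel loss can be applied to the full face and to each local patch (eyes, nose, mouth) by calling `pixel_l1_loss` on the corresponding crops. In PyTorch or Keras the slicing and absolute-difference arithmetic carries over almost unchanged to tensors.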

Now, this is the loss as I understood it, but it might not be exact, and some details seem to be missing in the paper, like the regularization coefficient for example.
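
For the identity-preserving term, a minimal sketch under stated assumptions could look like the following. `extract_features` is a hypothetical callable standing in for a pre-trained face network such as Light-CNN; `toy_features` is a made-up placeholder so that the example runs without any model. Whether `reference` should be the posed input or the frontal ground truth is exactly the open question raised above, so the sketch leaves that choice to the caller.

```python
import numpy as np


def identity_preserving_loss(extract_features, synthesized, reference):
    """Mean absolute difference between the feature arrays of two face images.

    extract_features: hypothetical callable mapping an image array to a
    feature array (in practice, activations of a pre-trained face network).
    """
    f_syn = np.asarray(extract_features(synthesized), dtype=float)
    f_ref = np.asarray(extract_features(reference), dtype=float)
    return np.mean(np.abs(f_syn - f_ref))


def toy_features(image):
    """Placeholder 'network': per-row mean intensity (NOT a real face model)."""
    return image.mean(axis=1)


synthesized = np.zeros((2, 3))
reference = np.ones((2, 3))
loss = identity_preserving_loss(toy_features, synthesized, reference)
```

In a real training loop the feature extractor would be kept frozen, so that gradients flow only into the generator.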

dassima commented 6 years ago

Thank you for all the information. I think I will try your database idea. Yesterday I also found LFW Fuel from MIT for Keras. The pairs of images are not perfect, because some of them do not show the same person, but the pairing is done automatically. https://github.com/dribnet/lfw_fuel

HRLTY commented 6 years ago

Hi all, please see the released code and testing images at https://github.com/HRLTY/TP-GAN. Thank you all for your interest and the inspiring discussion. If you have any questions, please reach me at huangrui@cmu.edu or huangruiwizard@gmail.com.
