baldFemale / beautyGAN-tf-Implement

Train problems #2

Open vincentwei0919 opened 5 years ago

vincentwei0919 commented 5 years ago

Hi, I really appreciate your excellent work! I tried training the model for my own use with all settings unchanged. When training finished, most of the original images could be recovered, but the fake images, especially the non-makeup to makeup ones, were never as good as those shown in the paper. I checked the hyperparameters in your code again and found that the weight of the face loss (part of the makeup loss) is not 0.1 but the same as the lip/eye loss weights. Could this be the reason? Or does spectral norm matter? What if I stopped using it? I've tried changing the weights for each loss, but it didn't help much. Is there a principle or range for these weights, or are they just experimental values? Waiting for your reply! Thank you again!

baldFemale commented 5 years ago

One big difference between the original paper and my code is that I simply use the face parsing tool in dlib, which cannot provide an accurate face mask and instead includes other parts such as hair or background. This explains part of the unnatural face results. The paper applies a more detailed face parsing algorithm.

Beyond this problem, the hyperparameters do matter. Because the original paper does not reveal many details, I can only tune the weights experimentally, based on my own normalization of each loss. Decreasing the weights of the face loss and lip loss while increasing the weight of the cycle consistency loss leads to relatively better performance.
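To make the weighting discussion concrete, here is a minimal sketch of combining the loss terms as a weighted sum. The function name, the loss values, and both weight sets are hypothetical and are not taken from the repository or the paper; only the direction of the change (lower face and lip weights, higher cycle weight) follows the comment above.

```python
def total_generator_loss(losses, weights):
    """Return the weighted sum of named loss terms.

    Every name in `losses` must have a matching entry in `weights`.
    """
    missing = set(losses) - set(weights)
    if missing:
        raise KeyError(f"no weight given for: {sorted(missing)}")
    return sum(weights[name] * value for name, value in losses.items())


# Hypothetical per-batch loss values, made up for illustration only.
losses = {"adversarial": 1.0, "cycle": 0.5, "face": 4.0, "lip": 4.0, "eye": 4.0}

# Hypothetical starting point: face, lip and eye weighted equally.
equal_weights = {"adversarial": 1.0, "cycle": 10.0, "face": 1.0, "lip": 1.0, "eye": 1.0}

# Hypothetical adjustment: lower face and lip weights, higher cycle weight.
tuned_weights = {"adversarial": 1.0, "cycle": 20.0, "face": 0.1, "lip": 0.5, "eye": 1.0}

print(total_generator_loss(losses, equal_weights))
print(total_generator_loss(losses, tuned_weights))
```

Keeping the weights in one dictionary makes it easy to log them with each run, which helps when the values can only be found by experiment.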

Theoretically, spectral norm stabilizes the training process, but I haven't trained the model without spectral normalization. Instance norm should also work.
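For readers unfamiliar with spectral normalization, the idea can be sketched in NumPy as follows: estimate the largest singular value of a weight matrix by power iteration, then divide the weights by it. This is an illustrative sketch, not the repository's implementation; the function name and the iteration count are assumptions. Training code typically runs a single iteration per step and keeps the vector `u` between steps.

```python
import numpy as np


def spectral_normalize(w, n_iters=50, eps=1e-12):
    """Divide `w` by an estimate of its largest singular value.

    The weight tensor is flattened to 2-D, keeping the last axis, and the
    singular value is estimated with power iteration.
    """
    w2d = w.reshape(-1, w.shape[-1])
    rng = np.random.default_rng(0)
    u = rng.standard_normal(w2d.shape[0])
    for _ in range(n_iters):
        v = w2d.T @ u
        v /= np.linalg.norm(v) + eps
        u = w2d @ v
        u /= np.linalg.norm(u) + eps
    sigma = u @ w2d @ v  # estimated largest singular value
    return w / sigma


w = np.diag([3.0, 1.0])
w_sn = spectral_normalize(w)
# The largest singular value of the result should be close to 1.
print(np.linalg.svd(w_sn, compute_uv=False)[0])
```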

One more interesting finding: the non-makeup images shown in the original paper always have more appealing results than others.

vincentwei0919 commented 5 years ago

Thank you for your reply! I noticed that folders such as "smokey_after_rotate" contain many images. What are they for? I searched GitHub for face parsing algorithms afterwards but found nothing better. Do you have a demo?

By the way, I tried training with 2 GPUs. With save_training_images=True there was a small bug in the function "save_training_images". It seems I need to stack two images for the input, like this:

for i in range(0, 10):
    real_input_A = np.stack([self.A_input[i][0], self.A_input[i][0]], axis=0)  # added
    real_input_B = np.stack([self.B_input[i][0], self.B_input[i][0]], axis=0)  # added
    fake_A_temp, fake_B_temp, cyc_A_temp, cyc_B_temp = sess.run(
        [self.fake_A, self.fake_B, self.cyc_A, self.cyc_B],
        feed_dict={
            self.input_A_multigpu: real_input_A,
            self.input_B_multigpu: real_input_B,
        })

I'm not really sure about this, but it does work, so you may want to check this problem. Waiting for your reply! Thank you!

baldFemale commented 5 years ago

The Smokey_after_rotate and Japanese_after_rotate folders are my initial datasets.

I'm also stuck on the face parsing algorithm. I haven't tried other tools so far.

I think you're right about the bug in the multi-GPU case. Stacking gpu_num images should work. Thanks for your notes.
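A minimal sketch of the "stack gpu_num images" idea, generalized from the two-image fix quoted earlier in the thread. The helper name `tile_for_gpus` and the 256x256x3 image shape are assumptions for illustration; they do not come from the repository.

```python
import numpy as np


def tile_for_gpus(image, gpu_num):
    """Repeat one image along a new leading axis, giving gpu_num copies."""
    if gpu_num < 1:
        raise ValueError("gpu_num must be at least 1")
    return np.stack([image] * gpu_num, axis=0)


# Assumed image shape, for illustration only.
image = np.zeros((256, 256, 3), dtype=np.float32)
batch = tile_for_gpus(image, gpu_num=2)
print(batch.shape)  # (2, 256, 256, 3)
```

The resulting array could then be fed to the multi-GPU input placeholders in place of the hand-written two-element np.stack call, assuming those placeholders expect one image per GPU along the first axis.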

huynx8888 commented 5 years ago

I am not the author, but I saw this in the main.py file:

load_dir = "imgs.txt"

I don't know what imgs.txt is for. Could you explain what it means?