Hi, the facial details sometimes cannot be preserved due to the influence of the data distribution. There are two probable solutions:
Here is the testing result from my training stage; I'll try your advice!
Also, I've tested your pretrained model. When I give it an image and its corresponding mask as input, the model cannot recover the original image. Advice 1 might improve this, but since the missing details are almost everywhere, you cannot replace the artifacts completely. That is to say, the model cannot preserve the identity information well, so we cannot edit a real image. What would you advise for improving this? Or maybe I am using it the wrong way?
I think you need to check the label IDs of the parsing label. The label IDs used in training may be different from those used in testing. The problem may be in the data loader.
The picture I showed is the result on my testing set. My training set and testing set are split from the CelebAMask-HQ dataset, and were made in the following steps:

1. Run g_mask.py to make the one-channel mask, whose values are in the range [0, 18].
2. Make a 19-channel mask according to the values in the one-channel mask; the values of each channel are in the range [0, 1].
3. Split the dataset into a training set and a testing set. As a matter of fact, I have only 36 testing pictures, shown in the result picture above.

As you can see, the hair region is quite weird, and face details are missing. So the label IDs are probably not different, but I'll check my data loader again since it processes TF tensors. In addition, my LSGAN loss does not seem to converge, but that makes sense since the result is quite different from the real image. Have you met these problems in your training process? I'll try your second suggestion first on the hair region to see whether the result improves. Thanks!
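Step 2 above can be sketched as a simple one-hot encoding of the parsing label. This is a minimal NumPy illustration (the helper name `to_onehot` is hypothetical, not from the repo):

```python
import numpy as np

def to_onehot(parsing, num_classes=19):
    """Convert a one-channel CelebAMask-HQ parsing label (values 0-18)
    into a 19-channel binary mask, one channel per class."""
    h, w = parsing.shape
    onehot = np.zeros((h, w, num_classes), dtype=np.float32)
    # Advanced indexing: for each pixel (i, j), set channel parsing[i, j] to 1.
    onehot[np.arange(h)[:, None], np.arange(w)[None, :], parsing] = 1.0
    return onehot

label = np.array([[0, 1], [18, 2]], dtype=np.int64)
mask = to_onehot(label)
assert mask.shape == (2, 2, 19)
assert mask.sum() == 4  # exactly one active channel per pixel
```

In a TensorFlow pipeline the same step is usually `tf.one_hot(parsing, depth=19)`; the important part is that the integer label IDs are never corrupted before this encoding.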
In my experience, the blurry hair problem is caused by the process of normalizing (to the range [0, 1]) and denormalizing (to the range [0, 255]) the parsing label. The label IDs can change if the wrong method is used for denormalization. You can follow the method used in face_parsing.
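The failure mode described above can be illustrated with a small sketch (assumed [0, 1] normalization over 19 label IDs): casting a float back to int with `astype` truncates toward zero, so any ID that lands slightly below its integer value after the round trip (e.g. 12.999999) silently becomes the neighboring class. Rounding before the cast avoids this:

```python
import numpy as np

label = np.array([0, 7, 13, 18], dtype=np.float32)

# Normalize label IDs to [0, 1], as a data pipeline might.
norm = label / 18.0

# Risky: astype truncates toward zero, so float error that leaves a
# value just below the integer (e.g. 12.999999) shifts the label ID.
truncated = (norm * 18.0).astype(np.int64)

# Safer: round to the nearest integer before casting.
rounded = np.rint(norm * 18.0).astype(np.int64)
assert (rounded == label.astype(np.int64)).all()
```

The same caution applies when labels are stored as 8-bit images and divided by 255: always round, never truncate, when recovering the IDs.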
I've found the cause of the bad result: the GAN loss weight was too small to take effect. When I increased the weight of the GAN loss, the results became sharp and realistic.
You once suggested "Add local region loss (separate local region by masks in the training stage) on the eye part or the skin part." How should it be added? I doubt that a direct L1 loss between result * source_mask and target * target_mask would work; it may introduce an incorrect spatial mapping, since source_mask will change at inference time.
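One reading of the suggestion is that in the training stage the task is reconstruction, so the source and target masks coincide and a single region mask can gate both sides of the L1 loss, sidestepping the mapping concern raised above. A minimal NumPy sketch (the helper `local_region_l1` and the area normalization are assumptions, not the repo's implementation):

```python
import numpy as np

def local_region_l1(result, target, region_mask):
    """Hypothetical masked L1 loss: compare only pixels inside one
    semantic region (e.g. the eye channel of the parsing mask),
    normalized by region area so small regions are not under-weighted."""
    diff = np.abs(result - target) * region_mask
    return diff.sum() / np.maximum(region_mask.sum(), 1.0)

# Toy example: 4x4 "images", region covers the top-left 2x2 block.
result = np.zeros((4, 4), dtype=np.float32)
target = np.ones((4, 4), dtype=np.float32)
mask = np.zeros((4, 4), dtype=np.float32)
mask[:2, :2] = 1.0
loss = local_region_l1(result, target, mask)  # mean |0-1| over the 4 masked pixels
```

This term would simply be added to the total generator loss with its own weight, alongside the global reconstruction and GAN losses.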
Furthermore, I think it is inevitable that some ID information is lost in your algorithm. The probable cause, in my opinion, is the global average pooling in your Style Feature Transfer layer; it obviously causes a great loss of original information. Do you think so? Or do you have other methods to keep the ID of the original picture? (Keeping the identity is an essential requirement in face manipulation, in my opinion.)
I reproduced MaskGAN in TensorFlow and trained it, but the results are a little bit weird, especially the hair region and face details like iris color, freckles, etc. Did you meet these problems in your training stage? What do you suppose is wrong in my project?