iperov closed this issue 6 years ago
Possibly an easy win. Good idea.
I wonder if the quality of A and B are equally important when going from A->B. Is A_loss = 0.02, B_loss = 0.01 identical in quality (however you measure) to A_loss = 0.01, B_loss = 0.02?
I feel like B_loss is more important for A->B, but I have no evidence to support this guess.
Yeah, I think so too: B_loss is more important, because it affects the sharpness of the replaced face in the result, whereas A_loss only affects the correct mapping of face features.
Come on, this is a GAN.
Well, it's an idea for additional feedback during network training.
So there could be some optimal weighting, like training A only when self.loss_A >= K * self.loss_B, where K is a constant or maybe even a function of loss_B.
Without any further data, I would take a wild guess and set K=1.1 or 1.2 to ensure more time is spent on loss_B.
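That selection rule is simple enough to sketch. A minimal illustration of the idea (the function name and `K` default are hypothetical, not from anyone's actual code):

```python
def pick_side_to_train(loss_A, loss_B, K=1.2):
    """Decide which autoencoder to train this step.

    Train the A side only when its loss is at least K times loss_B,
    so the B side (which drives the sharpness of the swapped face)
    gets the larger share of the training steps.
    """
    return "A" if loss_A >= K * loss_B else "B"

# With K = 1.2, equal losses send the step to side B:
print(pick_side_to_train(0.02, 0.02))  # -> B
print(pick_side_to_train(0.03, 0.02))  # -> A  (0.03 >= 1.2 * 0.02)
```

With K = 1 this reduces to the plain `loss_A >= loss_B` comparison used in the fix below in the thread.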
But...
The only other caveat is if the training data sets have different quality. Maybe loss_B = 0.02 is the best you can get with a poor face B training set, while face A has great training data and can reach 0.01. For example, you have 10000 HD video frames of face A, and just 100 blurry selfies of face B. You wouldn't want to waste time overtraining B in that case. You might as well just max out face A training to get what you can. However, if the loss_B = 0.02 limit means the results look terrible even with loss_A = 0.001, there's no point bothering with any further training. So yeah, in that case, this is a good idea.
I got network overfitting with the code from the main post.
This fix works well:
```python
import time

self.loss_A = 9999.0
self.loss_B = 9999.0

def train_one_step(self, iter, viewer):
    if iter % 10 == 0:
        # Every 10th step, train both sides so both losses stay current.
        epoch, warped_A, target_A = next(self.images_A)
        epoch, warped_B, target_B = next(self.images_B)
        self.loss_A = self.model.autoencoder_A.train_on_batch(warped_A, target_A)
        self.loss_B = self.model.autoencoder_B.train_on_batch(warped_B, target_B)
    else:
        # Otherwise, train only the side that is currently lagging behind.
        if self.loss_A >= self.loss_B:
            epoch, warped_A, target_A = next(self.images_A)
            self.loss_A = self.model.autoencoder_A.train_on_batch(warped_A, target_A)
        else:
            epoch, warped_B, target_B = next(self.images_B)
            self.loss_B = self.model.autoencoder_B.train_on_batch(warped_B, target_B)

    print("[{0}] [#{1:05d}] loss_A: {2:.5f}, loss_B: {3:.5f}".format(
        time.strftime("%H:%M:%S"), iter, self.loss_A, self.loss_B),
        end='\r')

    if viewer is not None:
        epoch, warped_A, target_A = next(self.images_A)
        epoch, warped_B, target_B = next(self.images_B)
        viewer(self.show_sample(target_A[0:14], target_B[0:14]), "training")
```
Spent the night training, and B is now closer to A.
@iperov Is it necessary to balance loss_B against loss_A? Would you please explain more? Is there any example showing the overfitting? How about training time? Thanks.
@modelsex
For example, A has 600 similar photos from a video source, while B has 1500 varied photos from internet sources. A->A trains much faster than B->B; you can see that A->A becomes sharper than B->B in the preview. So why spend time training A->A once it is sharp enough, especially when our goal is the B->A conversion?
Overfitting: all predictions become black with red noise.
I usually stop training decoder_A, and I even set encoder.trainable = False once the loss goes below 0.03. But I'm not trying to achieve high-quality faceswaps, so this may not be good advice for video swaps. Still, I agree this saves time...
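The effect of that freezing trick can be shown without any deep-learning framework. In this toy training loop (everything here is a made-up stand-in for illustration, not the real encoder/decoder classes), a `trainable` flag on a module simply makes it skip its weight update, so only the unfrozen decoder keeps moving:

```python
class ToyModule:
    """Stand-in for an encoder or decoder: one scalar weight, gradient steps."""
    def __init__(self, weight):
        self.weight = weight
        self.trainable = True

    def step(self, grad, lr=0.1):
        # Mimics encoder.trainable = False: a frozen module ignores updates.
        if self.trainable:
            self.weight -= lr * grad

encoder = ToyModule(1.0)
decoder_A = ToyModule(1.0)

encoder.trainable = False  # freeze the shared encoder, per Clorr's suggestion
for _ in range(5):
    encoder.step(grad=0.5)
    decoder_A.step(grad=0.5)

print(encoder.weight)    # unchanged: 1.0
print(decoder_A.weight)  # moved: 0.75
```

Note that in Keras, setting `trainable = False` on a layer or model only takes effect after the model is compiled again, so the freeze must happen before (re)compiling.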
Can this be added as an option?
Can both be added, please: iperov's training strategy and Clorr's freezing of the chosen encoder? I tried to do it manually but I couldn't get either to work.
@iperov :)
This could also be useful if you want to swap out one of the data sets. For example, if you want to change dataset_A to a different set of pictures from a different video but keep dataset_B the same, you would want to concentrate on training the A side more than the B side when reusing the same model.
I think it's better, because the destination video has far fewer frames than the source celeb set, so training both wastes time, and the less-trained celeb side results in a blurry face.