grey-eye / talking-heads

Our implementation of "Few-Shot Adversarial Learning of Realistic Neural Talking Head Models" (Egor Zakharov et al.)
GNU General Public License v3.0

LossMCH not working #26

Open castelo-software opened 5 years ago

castelo-software commented 5 years ago

It seems that when LossMCH is turned on (FEED_FORWARD=False in config.py), the losses become NaN and the network produces black images.

I haven't had time to debug and find out why.
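One generic way to narrow down where the first NaN appears is PyTorch's built-in anomaly detection; the sketch below assumes a standard training loop and uses placeholder loss names, not this repo's actual variables:

```python
import torch

# Makes the backward pass raise an error at the operation that produced the
# first NaN/Inf gradient. It is slow, so enable it only while debugging.
torch.autograd.set_detect_anomaly(True)

# Inside the training loop (illustrative names only):
# loss = loss_adv + loss_cnt + loss_mch
# if not torch.isfinite(loss):
#     raise RuntimeError(f"non-finite loss at this step: {loss.item()}")
# loss.backward()
```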

usingcolor commented 5 years ago

Maybe it is caused by the reshape(-1) before calculating the L1 loss.

victormashkov19 commented 5 years ago

@cch98 Removing reshape(-1) has no effect.

castelo-software commented 5 years ago

It's not a problem with the reshaping. The core of the problem is that the standard deviation computed in the AdaIN layer involves a square root that sometimes receives an invalid input and therefore produces a NaN gradient.

As far as I can tell, the error is not directly in the Loss_MCH function; activating it sets off a chain reaction that leaves the embedding vector with dangerous values. I tried to fix it by adding ReLUs in the generator, but that hasn't done the trick.
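For context on why the square root is the culprit: the derivative of sqrt(x) is 1/(2*sqrt(x)), which is infinite at x = 0 and NaN for x < 0 (which can happen through floating-point error in the variance). The usual stabilization is a small epsilon inside the square root; below is a minimal AdaIN-style sketch with assumed (N, C, H, W) shapes, not the repo's actual layer:

```python
import torch

# The gradient of sqrt blows up at zero:
x = torch.tensor([0.0], requires_grad=True)
torch.sqrt(x).backward()
print(x.grad)  # tensor([inf])

def adain(content, style_mean, style_std, eps=1e-5):
    """AdaIN-style normalization sketch over (N, C, H, W) tensors.

    Putting `eps` inside the sqrt keeps the denominator strictly positive,
    so the gradient stays finite even for channels with (near-)zero variance.
    """
    mean = content.mean(dim=(2, 3), keepdim=True)
    var = content.var(dim=(2, 3), keepdim=True, unbiased=False)
    normalized = (content - mean) / torch.sqrt(var + eps)
    return normalized * style_std + style_mean
```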

afung-git commented 5 years ago

@MrCaracara,

I have been trying to troubleshoot this issue as well. I was able to trace it back to the values of the e_hat matrix (ln: 102) in run.py: they slowly shrink towards zero over the iterations and then become NaN, which causes the adversarial loss to take a NaN value. I am stuck at this point...
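A simple way to catch this drift early is to log the embedding's magnitude every few iterations and abort once it degenerates; the helper below is only a sketch based on the description above (e_hat and the call site are assumed, not taken from run.py):

```python
import torch

def check_embedding(e_hat, step, log_every=100):
    """Log ||e_hat|| periodically and fail fast once it collapses or goes NaN."""
    norm = e_hat.norm().item()
    if step % log_every == 0:
        print(f"step {step}: ||e_hat|| = {norm:.6f}")
    if not torch.isfinite(e_hat).all() or norm < 1e-8:
        raise RuntimeError(f"e_hat degenerated at step {step} (norm={norm})")
```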

danmarbeck commented 5 years ago

For all of you interested, the solution proposed in Issue #32 also works for me, at least for now the training works fine. However, I'm not sure why this seems to work, since Tensor.sqrt() can also yield NaN.

danmarbeck commented 5 years ago

> For all of you interested, the solution proposed in Issue #32 also works for me, at least for now the training works fine. However, I'm not sure why this seems to work, since Tensor.sqrt() can also yield NaN.

This does prevent the losses from getting too low, but now my generator is messed up. I ran the training for the last couple of days, and now all the generator does is produce plain-colored brown images. I did not observe the training, so I am not sure why this happened.

alexstaf commented 5 years ago

> For all of you interested, the solution proposed in Issue #32 also works for me, at least for now the training works fine. However, I'm not sure why this seems to work, since Tensor.sqrt() can also yield NaN.
>
> This does prevent the losses from getting too low, but now my generator is messed up. I ran the training for the last couple of days, and now all the generator does is produce plain-colored brown images. I did not observe the training, so I am not sure why this happened.

I faced the same behavior.

danmarbeck commented 5 years ago

> For all of you interested, the solution proposed in Issue #32 also works for me, at least for now the training works fine. However, I'm not sure why this seems to work, since Tensor.sqrt() can also yield NaN.

The suggested solution contains a small mistake: the 'self.eps' should be added after taking the square root. If you do it that way, the NaN losses occur as before, which makes sense, since it is then equivalent to using torch.std().
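To make the difference concrete, here is a small sketch of the two epsilon placements (illustrative only; the actual patch in Issue #32 may look different):

```python
import torch

def std_eps_inside(x, eps=1e-8):
    # sqrt(var + eps): the argument is always >= eps, so both the value and
    # the gradient stay finite, even for a constant input (variance == 0).
    return torch.sqrt(x.var(unbiased=False) + eps)

def std_eps_after(x, eps=1e-8):
    # sqrt(var) + eps: equivalent to torch.std(x) + eps; when the variance is
    # zero the gradient of sqrt is already infinite, so the NaNs return.
    return torch.sqrt(x.var(unbiased=False)) + eps

x1 = torch.zeros(4, requires_grad=True)
std_eps_inside(x1).backward()
print(x1.grad)  # tensor([0., 0., 0., 0.]) -- finite

x2 = torch.zeros(4, requires_grad=True)
std_eps_after(x2).backward()
print(x2.grad)  # tensor([nan, nan, nan, nan])
```

So the variant that deviates from torch.std() (epsilon inside the square root) is exactly the one that avoids the NaN gradients.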

alexstaf commented 5 years ago

It seems that the weight of the MCH loss is 8 times larger than in the original paper.

goodmangu commented 4 years ago

Has anyone finally been able to fix this issue completely?