Hi, authors. Thanks for your great work.
I have a question about the loss computation here: https://github.com/fudan-generative-vision/hallo/blob/83dd4ebf52baa27de737045773d4fc4163d7c820/scripts/train_stage2.py#L857
Why is `reduction="mean"` used when computing the loss in train stage 2? It seems `reduction` should be set to `"none"` instead of `"mean"`, following the setting of stage 1. Setting it to `"mean"` makes `mse_loss_weights` meaningless: the total loss is simply scaled by `sum(mse_loss_weights) / train_batch_size`, which is a number less than or equal to 1.
If the above is correct, this could lead to an unstable training process and failure to converge.
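A minimal sketch of the difference (hypothetical tensors and weights, not the repo's actual training loop): with `reduction="none"` each sample's loss is weighted individually before averaging, while with `reduction="mean"` the loss is already a scalar, so the per-sample weights collapse into a single global scale factor.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
train_batch_size = 4
# Hypothetical predictions, targets, and per-sample SNR-based weights.
model_pred = torch.randn(train_batch_size, 3, 8, 8)
target = torch.randn(train_batch_size, 3, 8, 8)
mse_loss_weights = torch.tensor([0.5, 1.0, 0.25, 0.75])

# Stage-1 style: keep per-element losses, reduce over the non-batch
# dimensions, weight each sample, then average over the batch.
loss_none = F.mse_loss(model_pred.float(), target.float(), reduction="none")
loss_none = loss_none.mean(dim=list(range(1, len(loss_none.shape))))
loss_none = (loss_none * mse_loss_weights).mean()

# reduction="mean" collapses to a scalar first; multiplying by the weight
# vector and averaging just rescales the whole loss by mean(mse_loss_weights)
# = sum(mse_loss_weights) / train_batch_size, so the weighting has no
# per-sample effect.
scalar_loss = F.mse_loss(model_pred.float(), target.float(), reduction="mean")
loss_mean = (scalar_loss * mse_loss_weights).mean()
```

Here `loss_mean` equals `scalar_loss * mse_loss_weights.mean()` exactly, regardless of which samples the weights were meant to emphasize.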