Closed: taozhiqi closed this 4 months ago
Thanks for your interest! This was my mistake: at the beginning I used the range [-1, 1]. I fixed it by changing the final activation to sigmoid and normalizing the VGG input from [0, 1] to [-1, 1]:
https://github.com/primepake/wav2lip_288x288/commit/84be153a25572e8cc76dbab2b19c2bd8f34c7adb
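In code, the change amounts to roughly the following (a minimal sketch only; the layer shapes and helper name are illustrative, not the exact code from the commit):

```python
import torch
import torch.nn as nn

# Sketch: the generator's final activation becomes Sigmoid, so the
# output lives in [0, 1] and matches the [0, 1] image labels.
# Channel counts here are illustrative, not the repo's actual values.
output_block = nn.Sequential(
    nn.Conv2d(32, 3, kernel_size=1, stride=1, padding=0),
    nn.Sigmoid(),
)

def to_vgg_range(x: torch.Tensor) -> torch.Tensor:
    """Rescale a [0, 1] image batch to [-1, 1] before the VGG-based loss."""
    return x * 2.0 - 1.0
```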
Hi, thanks for the work again. In the training code (the hq_wav2lip_sam_train.py file) I have some questions: (1) The image labels are in [0, 1], but the last op of the generator network is nn.Tanh, whose output is in [-1, 1], so the ranges do not match. Why not use nn.Sigmoid?
(2) The output of the generator is in [0, 1], but the input of LPIPS is required to be in [-1, 1] (see https://github.com/richzhang/PerceptualSimilarity and the code there). Should we rescale the generator output to [-1, 1] before the LPIPS call? (A sketch of what I mean follows this list.) (3) If I use the PatchGAN discriminator, it produces artifacts like the frame below. Should the discriminator input be in [-1, 1] or [0, 1]?
![frame_4 (10)](https://github.com/primepake/wav2lip_288x288/assets/8148510/2930e1c6-4ff4-45a2-bed2-a9afc9b0aede)
(4) The strangest thing I found: the generator output seems to reconstruct the lower half of the reference image rather than the label, even though the loss is l1loss(gt, g). Why is that?
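For (2), here is the rescaling I mean, assuming the pip-installable lpips package from that repo (its forward() also accepts normalize=True to handle [0, 1] inputs internally, if I read the README correctly):

```python
import lpips
import torch

loss_fn = lpips.LPIPS(net='vgg')  # perceptual metric from richzhang/PerceptualSimilarity

g = torch.rand(1, 3, 288, 288)   # stand-in for the generator output, in [0, 1]
gt = torch.rand(1, 3, 288, 288)  # stand-in for the ground-truth label, in [0, 1]

# Option A: rescale [0, 1] -> [-1, 1] by hand before the call.
d = loss_fn(g * 2 - 1, gt * 2 - 1)

# Option B: tell lpips the inputs are in [0, 1] and let it shift them.
d = loss_fn(g, gt, normalize=True)
```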
Looking forward to your reply, thanks!
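P.S. for (4): a hypothetical sanity check (g, gt, ref are illustrative names for same-range image batches, not variables from the repo). If the generator were really copying the reference frame, L1 against ref would come out smaller than L1 against gt:

```python
import torch.nn.functional as F

def which_target(g, gt, ref):
    """Compare the generator output against both candidates in L1."""
    l1_gt = F.l1_loss(g, gt).item()
    l1_ref = F.l1_loss(g, ref).item()
    print(f"L1(g, gt) = {l1_gt:.4f}   L1(g, ref) = {l1_ref:.4f}")
```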