primepake / wav2lip_288x288

A little question #105

Closed: angel-yi closed this issue 5 months ago

angel-yi commented 6 months ago

Hello, thank you very much for sharing this project. I pulled the latest code and trained it following your training steps, and I have a few questions:

1. The current image size is 384, not 288.
2. The syncnet and wav2lip models both have v2 versions, and I couldn't find any documentation explaining the differences between the two versions.
3. When I modified the model import in hq_wav2lip_sam_train.py, I found that it could not be trained.

The images I trained at 384 have many colored light spots along the edges and in the lip area, although the overall result is fairly high-resolution and realistic. [image]

kunyao2015 commented 6 months ago

same question

einsqing commented 6 months ago

How long have you been training? How effective is the teeth restoration? @angel-yi

angel-yi commented 6 months ago

> How long have you been training? How effective is the teeth restoration? @angel-yi

I trained for a few hours, and my dataset is not large; I stopped as soon as I saw the light-spot artifacts. The teeth look like the effect below. [image]

shahidmuneer commented 6 months ago

You need a somewhat larger dataset, and you should train the network until the fake and real losses converge; in my case, after about 30 thousand iterations I am not getting the spots shown above.

angel-yi commented 6 months ago

> You need a somewhat larger dataset, and you should train the network until the fake and real losses converge; in my case, after about 30 thousand iterations I am not getting the spots shown above.

Thank you, I am already increasing the size of the dataset and hope this will improve the situation.

ghost commented 6 months ago

You should use the FID metric to monitor which model is best, and I think you should continue training even when the loss has plateaued.
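For anyone unsure how to track FID between checkpoints, here is a minimal sketch using torchmetrics. This is my own assumption about how to do it, not something shipped with this repo; it simply compares batches of ground-truth and generated face crops and reports the checkpoint-level score:

```python
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

# FID over InceptionV3 pool features; expects uint8 images shaped (N, 3, H, W)
fid = FrechetInceptionDistance(feature=2048)

def update_fid(real_faces: torch.Tensor, fake_faces: torch.Tensor) -> None:
    # real_faces / fake_faces: float tensors in [0, 1], e.g. generator targets/outputs
    fid.update((real_faces * 255).to(torch.uint8), real=True)
    fid.update((fake_faces * 255).to(torch.uint8), real=False)

# after accumulating a few hundred samples for a checkpoint:
# score = fid.compute()  # lower is better; keep the checkpoint with the lowest FID
```

The idea is to pick the checkpoint with the lowest FID rather than stopping as soon as the reconstruction loss goes flat.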

lililuya commented 6 months ago

Hi, I am a little confused about the 288 and 384 sizes. train_syncnet_sam.py imports SyncNet_color_384, which comes from the syncnet.py module, but hq_wav2lip_sam_train.py imports SyncNet_color, which comes from the syncnetv2.py module. I compared the two networks and they are different. Am I missing something important? Looking forward to your reply, thanks!

ghost commented 6 months ago

Hi, you can check the syncnet; it should be 384. I'm too lazy to update everything, so all of the training should be done at 384.

angel-yi commented 6 months ago

> You should use the FID metric to monitor which model is best, and I think you should continue training even when the loss has plateaued.

Thank you, I am continuing to try to train.

huangxin168 commented 5 months ago

> Hi, I am a little confused about the 288 and 384 sizes. train_syncnet_sam.py imports SyncNet_color_384, which comes from the syncnet.py module, but hq_wav2lip_sam_train.py imports SyncNet_color, which comes from the syncnetv2.py module. I compared the two networks and they are different. Am I missing something important? Looking forward to your reply, thanks!

I am also confused by this:

  1. SyncNet_color_384 comes from the syncnet.py module.
  2. SyncNet_color comes from the syncnetv2.py module.

1 and 2 are different modules. So in hq_wav2lip_sam_train.py, should I change the line `from models import SyncNet_color as SyncNet` to `from models import SyncNet_color_384 as SyncNet`?
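For clarity, this is the one-line change being discussed (just a sketch of what huangxin168 describes and lililuya confirms below; the module paths and class names are exactly as quoted in this thread):

```python
# hq_wav2lip_sam_train.py

# before: the SyncNet_color expert imported from syncnetv2.py
# from models import SyncNet_color as SyncNet

# after: the 384 expert from syncnet.py, matching train_syncnet_sam.py,
# so wav2lip training uses the same SyncNet you trained as the expert
from models import SyncNet_color_384 as SyncNet
```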

1059692261 commented 5 months ago

> > You should use the FID metric to monitor which model is best, and I think you should continue training even when the loss has plateaued.
>
> Thank you, I am continuing to try to train.

Have you achieved any success in training? I succeeded in training the syncnet; its loss decreased to 0.25. However, in the wav2lip training the sync loss stays at a large value of around 4, and the sample images show that the model just copies the mouth region of the reference image into the prediction.

lililuya commented 5 months ago

@huangxin168 Yes, you're right!

kike-0304 commented 4 months ago

> > > You should use the FID metric to monitor which model is best, and I think you should continue training even when the loss has plateaued.
> >
> > Thank you, I am continuing to try to train.
>
> Have you achieved any success in training? I succeeded in training the syncnet; its loss decreased to 0.25. However, in the wav2lip training the sync loss stays at a large value of around 4, and the sample images show that the model just copies the mouth region of the reference image into the prediction.

I have the same problem. After training, the inferred image is the same as the original image; it seems the audio is not driving the mouth at all.
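For context on what that sync loss of ~4 is measuring: in the upstream Wav2Lip training code, the expert sync loss is a binary cross-entropy over the cosine similarity between the SyncNet audio and video embeddings of the generated lower-half faces. The sketch below follows the original Wav2Lip repository and is only my assumption about how this fork computes it; the get_sync_loss helper and the tensor layout are taken from upstream and not verified against this repo:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

logloss = nn.BCELoss()

def cosine_loss(audio_emb, video_emb, y):
    # Cosine similarity between audio and video embeddings, scored with BCE
    # against y = 1 ("in sync"). The embeddings are assumed non-negative
    # (ReLU outputs, as upstream), so the similarity stays in [0, 1].
    d = F.cosine_similarity(audio_emb, video_emb)
    return logloss(d.unsqueeze(1), y)

def get_sync_loss(syncnet, mel, g, device="cuda"):
    # g: generated face window shaped (B, 3, T, H, W); SyncNet only sees the
    # lower half of each frame, with the T frames stacked along channels.
    g = g[:, :, :, g.size(3) // 2:]
    g = torch.cat([g[:, :, i] for i in range(g.size(2))], dim=1)  # (B, 3*T, H//2, W)
    audio_emb, video_emb = syncnet(mel, g)
    y = torch.ones(g.size(0), 1, device=device)
    return cosine_loss(audio_emb, video_emb, y)
```

If this fork also follows the upstream schedule, the sync loss weight (syncnet_wt) starts at 0 and is only raised to a small value (around 0.03) once the evaluation sync loss drops below roughly 0.75; until that happens the generator has little incentive to move the lips, which would be consistent with the "mouth copied from the reference" behaviour described above.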