This is a failed training run. GANs are sometimes hard to train; you need to deep-dive into your dataset.
Any suggestions? I sincerely hope to get your reply.
Process your dataset carefully; that may be the problem.
Thanks. In fact, there is more than one person in my dataset.
It should be more than 60 minutes per person.
I modified some of the data preprocessing code. My training dataset looks like this:
Try playing with the learning rates (disc and model optimizers).
@NikitaKononov What should I do? What about the details? Thanks.
You should change the learning rate :))
I changed syncnet_lr to 1e-3 with hq_wav2lip_train.py or wloss_hq_wav2lip_train.py, and it still failed to train.
Lower it.
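For reference, "playing with the learning rates" here means changing the LRs handed to the two optimizers in the training script. A minimal, self-contained sketch, assuming Adam as in the original Wav2Lip code; model and disc below are stand-in modules rather than the repo's actual networks, and gen_lr/disc_lr are hypothetical values:

import torch.nn as nn
import torch.optim as optim

# Stand-ins so the sketch runs; the real generator and discriminator live in the repo.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1))
disc = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1))

gen_lr = 1e-4   # a common starting point (see the LR discussion further down)
disc_lr = 1e-5  # a lower disc LR can keep the discriminator from overpowering the generator

optimizer = optim.Adam([p for p in model.parameters() if p.requires_grad], lr=gen_lr)
disc_optimizer = optim.Adam([p for p in disc.parameters() if p.requires_grad], lr=disc_lr)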
It's not about your learning rate, it's about the dataset. You should have more than 10 people to train on.
Learning rate matters too. On a large dataset with the wrong LR, the model failed to train in my case.
If your dataset is good, it just takes more time to get a good result; if not, it will get stuck at some point.
GAN models are sensitive to the learning rate, especially in the WGAN case.
That's why we need a gradient penalty.
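For anyone unfamiliar, here is the standard WGAN-GP gradient penalty (Gulrajani et al.), shown only to illustrate the point above; this is a generic sketch, not code taken from wloss_hq_wav2lip_train.py:

import torch

def gradient_penalty(critic, real, fake, device="cpu"):
    # Random interpolation between real and fake batches (assumes 4D image tensors)
    alpha = torch.rand(real.size(0), 1, 1, 1, device=device)
    interp = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
    scores = critic(interp)
    # Gradients of the critic's scores w.r.t. the interpolated inputs
    grads = torch.autograd.grad(
        outputs=scores, inputs=interp,
        grad_outputs=torch.ones_like(scores),
        create_graph=True, retain_graph=True,
    )[0].view(real.size(0), -1)
    # Penalize deviation of the gradient norm from 1 (soft Lipschitz constraint)
    return ((grads.norm(2, dim=1) - 1.0) ** 2).mean()

# Toy usage with a hypothetical critic:
# critic = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 8 * 8, 1))
# gp = gradient_penalty(critic, torch.randn(4, 3, 8, 8), torch.randn(4, 3, 8, 8))

The penalty pushes the critic's gradient norm toward 1 along real-fake interpolations, which is what makes WGAN training less sensitive to the exact learning rate.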
I used the LRS2 dataset to train and also got Percep: 0.0 | Fake: 100.00000762939453, Real: 0.0 at the 3rd epoch. What should I do @primepake?
I don't recommend using LRS2 (it has poor resolution), but I don't think that would make training fail.
Which script are you using? If wloss_hq_wav2lip_train.py, don't even try, it won't work. If hq_wav2lip_train.py, try different learning rates.
@NikitaKononov Any recommended learning rates?
@aishoot It depends on your batch size: larger batch, higher LR. Try starting with 1e-4. I tried LRs from 1e-3 to 1e-7 (and used 1e-7 in the last training stage to prevent overfitting).
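One way to implement that staged LR drop is a plain PyTorch scheduler. A sketch, not the schedule actually used above; the network and the milestones are made up for illustration:

import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 10)  # stand-in network
optimizer = optim.Adam(model.parameters(), lr=1e-4)

# MultiStepLR multiplies the LR by gamma at each milestone, so the LR walks
# 1e-4 -> 1e-5 -> 1e-6 -> 1e-7 across the run. Pick milestones from your own loss curves.
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[50, 100, 150], gamma=0.1)

for epoch in range(200):
    # ... one epoch of training goes here ...
    optimizer.step()   # placeholder step so the optimizer/scheduler call order is valid
    scheduler.step()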
@NikitaKononov Thanks, I will give it a try.
Had the same problem, did you finally solve it?
My hq_wav2lip_train.py run always gives this result: Percep: 0.6985121191 | Fake: 0.6934193585, Real: 0.6940246867. Is that normal?
@primepake Hello, thanks for your nice work. I have recently run into some difficulties training on my own dataset (prepared following your data preparation suggestions) with your shared code. When I run
python hq_wav2lip_train.py
the training log is:

use_cuda: True
Load 2687 audio feats.
use_cuda: True, MULTI_GPU: True
total trainable params 48520755
total DISC trainable params 18210561
Load checkpoint from: checkpoint_syn/checkpoint_step000171000.pth
Starting Epoch: 0
Saved checkpoint: checkpoint_step000000001.pth
Saved checkpoint: disc_checkpoint_step000000001.pth
L1: 0.2313518226146698, Sync: 0.0, Percep: 0.711134135723114 | Fake: 0.6754781603813171, Real: 0.711134135723114
L1: 0.21765484660863876, Sync: 0.0, Percep: 0.709110289812088 | Fake: 0.677438884973526, Real: 0.709110289812088
L1: 0.2188651313384374, Sync: 0.0, Percep: 0.7070923844973246 | Fake: 0.6794042587280273, Real: 0.7070923844973246
L1: 0.2144552432000637, Sync: 0.0, Percep: 0.7049891352653503 | Fake: 0.6814645230770111, Real: 0.7049891352653503
L1: 0.21138261258602142, Sync: 0.0, Percep: 0.7029658198356629 | Fake: 0.6834565877914429, Real: 0.702965784072876
L1: 0.20817621052265167, Sync: 0.0, Percep: 0.7010621925195059 | Fake: 0.6853393216927847, Real: 0.7010620832443237
L1: 0.20210737415722438, Sync: 0.0, Percep: 0.6996434501239231 | Fake: 0.6867433360644749, Real: 0.6996431180409023
L1: 0.19812600128352642, Sync: 0.0, Percep: 0.6987411752343178 | Fake: 0.6876341179013252, Real: 0.6987397372722626
L1: 0.19437309437327915, Sync: 0.0, Percep: 0.6981025603082445 | Fake: 0.6882637408044603, Real: 0.6980936461024814
...
L1: 0.12470868316135908, Sync: 0.0, Percep: 0.703860961136065 | Fake: 0.7219482924593122, Real: 0.69152611988952
L1: 0.12432418142755826, Sync: 0.0, Percep: 0.7042167019098997 | Fake: 0.7212010175765803, Real: 0.6919726772157446
L1: 0.1240154696033173, Sync: 0.0, Percep: 0.7046547470633516 | Fake: 0.7203877433827243, Real: 0.6924839418497868
L1: 0.12360538116523198, Sync: 0.0, Percep: 0.7050436321569948 | Fake: 0.7196274266711303, Real: 0.6929315188026521
L1: 0.12324049579675751, Sync: 0.0, Percep: 0.7055533739051434 | Fake: 0.7187670722904832, Real: 0.6934557074776173
Evaluating for 300 steps
L1: 0.08894559927284718, Sync: 7.633765455087026, Percep: 0.8102987110614777 | Fake: 0.5884036968151728, Real: 0.7851572235425314
L1: 0.12352411426603795, Sync: 0.0, Percep: 0.7059701490402222 | Fake: 0.7179982627183199, Real: 0.6938216637895676
L1: 0.12316158214713087, Sync: 0.0, Percep: 0.7069740667201505 | Fake: 0.7167386501879975, Real: 0.6947587119988258
L1: 0.12274648358716685, Sync: 0.0, Percep: 0.7086389707584008 | Fake: 0.7149891035960001, Real: 0.6961143705702852
L1: 0.122441600682666, Sync: 0.0, Percep: 0.7101659479650478 | Fake: 0.7133496456499239, Real: 0.6971959297446922
L1: 0.12237166498716061, Sync: 0.0, Percep: 0.7136020838068082 | Fake: 0.7105454275957667, Real: 0.6988249019290939
L1: 0.12226668624650865, Sync: 0.0, Percep: 0.7165913383165995 | Fake: 0.7080283074861481, Real: 0.6995955426850179
...
L1: 0.10978432702055822, Sync: 0.0, Percep: 0.7658024462613563 | Fake: 0.8152413980393286, Real: 0.6154363644432882
L1: 0.10972586760557995, Sync: 0.0, Percep: 0.7692448822380323 | Fake: 0.8124342786812339, Real: 0.6124296340577328
L1: 0.10953026241862897, Sync: 0.0, Percep: 0.7743397405519289 | Fake: 0.8092221762695191, Real: 0.610396875484025
L1: 0.10939421248741639, Sync: 0.0, Percep: 0.7824273446941963 | Fake: 0.8055855450166116, Real: 0.6093728619629893
L1: 0.1091919630309757, Sync: 0.0, Percep: 0.7908333182386574 | Fake: 0.8019456527194126, Real: 0.6076706493660488
L1: 0.10912049508790679, Sync: 0.0, Percep: 0.7942769879668473 | Fake: 0.7993013052296446, Real: 0.6045908347347031
L1: 0.10908047136182737, Sync: 0.0, Percep: 0.7925851992034527 | Fake: 0.8665490227044159, Real: 0.6015381057315442
L1: 0.10930284824053846, Sync: 0.0, Percep: 0.7917964988368816 | Fake: 0.8661491764380034, Real: 0.6038172583345233
Evaluating for 300 steps
L1: 0.5731244529287021, Sync: 9.090377567211787, Percep: 1.654488068819046 | Fake: 0.21219109917680423, Real: 1.4065884272257487
L1: 0.10955245170742273, Sync: 0.0, Percep: 0.9697534098973847 | Fake: 0.8618184333870491, Real: 0.707599309967045
L1: 0.10959959211782437, Sync: 0.0, Percep: 0.9730155654561438 | Fake: 0.8586212793076295, Real: 0.7107085656626347
L1: 0.1096739404936238, Sync: 0.0, Percep: 0.9733813980183366 | Fake: 0.8565126432325659, Real: 0.7121740724053752
L1: 0.10970133425566951, Sync: 0.0, Percep: 0.9722086560893506 | Fake: 0.8555087234860062, Real: 0.7122941125653687
L1: 0.10975395110161866, Sync: 0.0, Percep: 0.9709689952921027 | Fake: 0.8545878388406093, Real: 0.7123311641033997
L1: 0.10966527403854742, Sync: 0.0, Percep: 0.9697017977636012 | Fake: 0.8537138932196765, Real: 0.7123305465982057
...
L1: 0.10242784435932453, Sync: 0.0, Percep: 0.8304814692491738 | Fake: 10.244699917241833, Real: 0.6199986526972722
L1: 0.10239110164847111, Sync: 0.0, Percep: 0.8279339800796978 | Fake: 10.520022923630663, Real: 0.618096816339305
L1: 0.10235720289591985, Sync: 0.0, Percep: 0.8254020718837354 | Fake: 10.793661997258704, Real: 0.6162066120079922
L1: 0.10230096286480747, Sync: 0.0, Percep: 0.8228856021523825 | Fake: 11.065632539949988, Real: 0.6143279333128459
L1: 0.10232478230738712, Sync: 0.0, Percep: 0.8203844301093661 | Fake: 11.335949766272329, Real: 0.6124606751568799
Starting Epoch: 1
L1: 0.11442705243825912, Sync: 0.0, Percep: 0.0 | Fake: 100.0, Real: 0.0
L1: 0.09390852972865105, Sync: 0.0, Percep: 0.0 | Fake: 100.0, Real: 0.0
L1: 0.09028899172941844, Sync: 0.0, Percep: 0.0 | Fake: 100.0, Real: 0.0
L1: 0.08729504607617855, Sync: 0.0, Percep: 0.0 | Fake: 100.0, Real: 0.0
L1: 0.0875200405716896, Sync: 0.0, Percep: 0.0 | Fake: 100.0, Real: 0.0
L1: 0.08466161414980888, Sync: 0.0, Percep: 0.0 | Fake: 100.0, Real: 0.0
L1: 0.0851562459553991, Sync: 0.0, Percep: 0.0 | Fake: 100.0, Real: 0.0
...
L1: 0.08385955898174599, Sync: 0.0, Percep: 0.0 | Fake: 100.0, Real: 0.0
L1: 0.08394778782830518, Sync: 0.0, Percep: 0.0 | Fake: 100.0, Real: 0.0
L1: 0.08393304471088492, Sync: 0.0, Percep: 0.0 | Fake: 100.0, Real: 0.0
L1: 0.0836903992508139, Sync: 0.0, Percep: 0.0 | Fake: 100.0, Real: 0.0
Evaluating for 300 steps
L1: 0.062434629226724304, Sync: 7.282701448599497, Percep: 0.0 | Fake: 100.0, Real: 0.0
L1: 0.08369095140779523, Sync: 0.0, Percep: 0.0 | Fake: 100.0, Real: 0.0
L1: 0.0834334861073229, Sync: 0.0, Percep: 0.0 | Fake: 100.0, Real: 0.0
L1: 0.08353378695167907, Sync: 0.0, Percep: 0.0 | Fake: 100.0, Real: 0.0
L1: 0.08348599619962074, Sync: 0.0, Percep: 0.0 | Fake: 100.0, Real: 0.0
...
Is this training process normal? If not, could you please give me some suggestions? I would sincerely appreciate your help.