AliaksandrSiarohin / first-order-model

This repository contains the source code for the paper First Order Motion Model for Image Animation
https://aliaksandrsiarohin.github.io/first-order-model-website/
MIT License
14.3k stars 3.18k forks

Need some help with training #344

Open Qiulin-W opened 3 years ago

Qiulin-W commented 3 years ago

Great work! I'm trying to train the model on the VoxCeleb2 dataset, and the generated frames are not as good as those produced by your pretrained model. During training, the generator GAN loss (gen_gan) keeps increasing monotonically while the discriminator GAN loss (disc_gan) keeps decreasing monotonically, although the perceptual loss is decreasing. Is it normal for the losses to behave like this? Could you please share the "log.txt" file of the pretrained model as a reference? My log file is as follows:

"""
00000000) perceptual - 144.01511; gen_gan - 0.43809; feature_matching - 2.58715; equivariance_value - 0.32446; equivariance_jacobian - 0.56901; disc_gan - 0.38356
00000001) perceptual - 132.22182; gen_gan - 0.50871; feature_matching - 2.58335; equivariance_value - 0.27795; equivariance_jacobian - 0.53301; disc_gan - 0.32559
00000002) perceptual - 128.32959; gen_gan - 0.45090; feature_matching - 1.96871; equivariance_value - 0.26547; equivariance_jacobian - 0.52600; disc_gan - 0.36615
00000003) perceptual - 126.16486; gen_gan - 0.50853; feature_matching - 2.15793; equivariance_value - 0.26450; equivariance_jacobian - 0.52474; disc_gan - 0.32417
00000004) perceptual - 124.39825; gen_gan - 0.52194; feature_matching - 2.22876; equivariance_value - 0.26078; equivariance_jacobian - 0.51910; disc_gan - 0.31644
00000005) perceptual - 123.16610; gen_gan - 0.52708; feature_matching - 2.26770; equivariance_value - 0.29057; equivariance_jacobian - 0.51785; disc_gan - 0.31409
00000006) perceptual - 122.10067; gen_gan - 0.53207; feature_matching - 2.31882; equivariance_value - 0.28342; equivariance_jacobian - 0.51394; disc_gan - 0.31077
00000007) perceptual - 121.15230; gen_gan - 0.53350; feature_matching - 2.34858; equivariance_value - 0.26545; equivariance_jacobian - 0.50420; disc_gan - 0.30995
00000008) perceptual - 120.54834; gen_gan - 0.54565; feature_matching - 2.40739; equivariance_value - 0.27044; equivariance_jacobian - 0.50579; disc_gan - 0.29916
00000009) perceptual - 119.89876; gen_gan - 0.56212; feature_matching - 2.49280; equivariance_value - 0.27071; equivariance_jacobian - 0.50382; disc_gan - 0.28790
00000010) perceptual - 119.30991; gen_gan - 0.50842; feature_matching - 2.25055; equivariance_value - 0.27950; equivariance_jacobian - 0.50425; disc_gan - 0.32586
00000011) perceptual - 118.59067; gen_gan - 0.56309; feature_matching - 2.48912; equivariance_value - 0.26821; equivariance_jacobian - 0.49907; disc_gan - 0.29396
00000012) perceptual - 118.25211; gen_gan - 0.57068; feature_matching - 2.51491; equivariance_value - 0.26555; equivariance_jacobian - 0.49522; disc_gan - 0.27774
00000013) perceptual - 117.69729; gen_gan - 0.56670; feature_matching - 2.52620; equivariance_value - 0.25799; equivariance_jacobian - 0.48955; disc_gan - 0.28719
00000014) perceptual - 117.22231; gen_gan - 0.58315; feature_matching - 2.58854; equivariance_value - 0.26786; equivariance_jacobian - 0.48785; disc_gan - 0.27408
00000015) perceptual - 116.64049; gen_gan - 0.58734; feature_matching - 2.59196; equivariance_value - 0.26109; equivariance_jacobian - 0.48693; disc_gan - 0.27264
00000016) perceptual - 116.37921; gen_gan - 0.59342; feature_matching - 2.61044; equivariance_value - 0.26371; equivariance_jacobian - 0.48323; disc_gan - 0.26800
00000017) perceptual - 116.50377; gen_gan - 0.60122; feature_matching - 2.62840; equivariance_value - 0.25327; equivariance_jacobian - 0.48092; disc_gan - 0.26449
00000018) perceptual - 115.46024; gen_gan - 0.51649; feature_matching - 2.33525; equivariance_value - 0.25032; equivariance_jacobian - 0.47674; disc_gan - 0.31814
00000019) perceptual - 115.39281; gen_gan - 0.59547; feature_matching - 2.60871; equivariance_value - 0.25836; equivariance_jacobian - 0.47748; disc_gan - 0.26436
00000020) perceptual - 114.97450; gen_gan - 0.59987; feature_matching - 2.63207; equivariance_value - 0.25992; equivariance_jacobian - 0.47654; disc_gan - 0.26337
00000021) perceptual - 114.82450; gen_gan - 0.54983; feature_matching - 2.38938; equivariance_value - 0.26447; equivariance_jacobian - 0.47658; disc_gan - 0.30630
00000022) perceptual - 114.58286; gen_gan - 0.60296; feature_matching - 2.59098; equivariance_value - 0.27131; equivariance_jacobian - 0.47551; disc_gan - 0.25760
00000023) perceptual - 114.18736; gen_gan - 0.60433; feature_matching - 2.62299; equivariance_value - 0.27514; equivariance_jacobian - 0.47204; disc_gan - 0.26020
00000024) perceptual - 113.73436; gen_gan - 0.60714; feature_matching - 2.62591; equivariance_value - 0.24795; equivariance_jacobian - 0.46776; disc_gan - 0.25939
00000025) perceptual - 113.88875; gen_gan - 0.61120; feature_matching - 2.63779; equivariance_value - 0.24223; equivariance_jacobian - 0.46537; disc_gan - 0.25620
00000026) perceptual - 115.47273; gen_gan - 0.61377; feature_matching - 2.66905; equivariance_value - 0.29483; equivariance_jacobian - 0.50396; disc_gan - 0.25030
00000027) perceptual - 114.38964; gen_gan - 0.60818; feature_matching - 2.62184; equivariance_value - 0.27410; equivariance_jacobian - 0.49143; disc_gan - 0.25718
00000028) perceptual - 114.15324; gen_gan - 0.61537; feature_matching - 2.63447; equivariance_value - 0.26522; equivariance_jacobian - 0.49028; disc_gan - 0.25101
00000029) perceptual - 113.38180; gen_gan - 0.54406; feature_matching - 2.35786; equivariance_value - 0.25409; equivariance_jacobian - 0.48403; disc_gan - 0.30375
00000030) perceptual - 113.74049; gen_gan - 0.62198; feature_matching - 2.62155; equivariance_value - 0.27206; equivariance_jacobian - 0.48722; disc_gan - 0.24737
00000031) perceptual - 113.15704; gen_gan - 0.61700; feature_matching - 2.58942; equivariance_value - 0.26470; equivariance_jacobian - 0.48003; disc_gan - 0.26475
00000032) perceptual - 112.79678; gen_gan - 0.58813; feature_matching - 2.45876; equivariance_value - 0.26249; equivariance_jacobian - 0.48224; disc_gan - 0.26244
00000033) perceptual - 112.54894; gen_gan - 0.61021; feature_matching - 2.60720; equivariance_value - 0.26647; equivariance_jacobian - 0.47912; disc_gan - 0.25152
00000034) perceptual - 112.35883; gen_gan - 0.58499; feature_matching - 2.48258; equivariance_value - 0.26278; equivariance_jacobian - 0.47817; disc_gan - 0.27273
00000035) perceptual - 112.30501; gen_gan - 0.62632; feature_matching - 2.63735; equivariance_value - 0.26403; equivariance_jacobian - 0.47741; disc_gan - 0.24229
00000036) perceptual - 112.06609; gen_gan - 0.62931; feature_matching - 2.64004; equivariance_value - 0.25877; equivariance_jacobian - 0.47176; disc_gan - 0.24207
00000037) perceptual - 111.63947; gen_gan - 0.63011; feature_matching - 2.63209; equivariance_value - 0.26837; equivariance_jacobian - 0.47456; disc_gan - 0.24309
00000038) perceptual - 111.91910; gen_gan - 0.63215; feature_matching - 2.62367; equivariance_value - 0.31219; equivariance_jacobian - 0.48208; disc_gan - 0.24152
00000039) perceptual - 111.45773; gen_gan - 0.63571; feature_matching - 2.63479; equivariance_value - 0.27914; equivariance_jacobian - 0.47553; disc_gan - 0.24013
00000040) perceptual - 111.47609; gen_gan - 0.63828; feature_matching - 2.63517; equivariance_value - 0.27251; equivariance_jacobian - 0.47602; disc_gan - 0.23776
00000041) perceptual - 111.21346; gen_gan - 0.64116; feature_matching - 2.64167; equivariance_value - 0.27332; equivariance_jacobian - 0.47271; disc_gan - 0.23639
00000042) perceptual - 110.84853; gen_gan - 0.64118; feature_matching - 2.63665; equivariance_value - 0.27198; equivariance_jacobian - 0.47045; disc_gan - 0.23622
00000043) perceptual - 110.70772; gen_gan - 0.64144; feature_matching - 2.62633; equivariance_value - 0.25620; equivariance_jacobian - 0.46695; disc_gan - 0.23654
00000044) perceptual - 110.71648; gen_gan - 0.64452; feature_matching - 2.64110; equivariance_value - 0.26906; equivariance_jacobian - 0.47035; disc_gan - 0.23408
00000045) perceptual - 110.67635; gen_gan - 0.64682; feature_matching - 2.64891; equivariance_value - 0.26187; equivariance_jacobian - 0.46944; disc_gan - 0.23299
00000046) perceptual - 110.49648; gen_gan - 0.64679; feature_matching - 2.64437; equivariance_value - 0.27280; equivariance_jacobian - 0.47060; disc_gan - 0.23373
00000047) perceptual - 110.20433; gen_gan - 0.64854; feature_matching - 2.65047; equivariance_value - 0.26823; equivariance_jacobian - 0.47093; disc_gan - 0.23151
00000048) perceptual - 110.11427; gen_gan - 0.64943; feature_matching - 2.63666; equivariance_value - 0.26756; equivariance_jacobian - 0.46780; disc_gan - 0.23334
00000049) perceptual - 109.96450; gen_gan - 0.64982; feature_matching - 2.65083; equivariance_value - 0.26899; equivariance_jacobian - 0.46534; disc_gan - 0.23012
00000050) perceptual - 112.91872; gen_gan - 0.59436; feature_matching - 2.40596; equivariance_value - 0.31774; equivariance_jacobian - 0.52214; disc_gan - 0.27601
00000051) perceptual - 110.99445; gen_gan - 0.65840; feature_matching - 2.66536; equivariance_value - 0.32202; equivariance_jacobian - 0.51237; disc_gan - 0.22198
00000052) perceptual - 109.99225; gen_gan - 0.65393; feature_matching - 2.64877; equivariance_value - 0.29682; equivariance_jacobian - 0.48540; disc_gan - 0.22773
00000053) perceptual - 109.59476; gen_gan - 0.65419; feature_matching - 2.65084; equivariance_value - 0.28380; equivariance_jacobian - 0.47519; disc_gan - 0.22757
00000054) perceptual - 109.55179; gen_gan - 0.65610; feature_matching - 2.65613; equivariance_value - 0.27699; equivariance_jacobian - 0.46700; disc_gan - 0.22757
00000055) perceptual - 109.01971; gen_gan - 0.44292; feature_matching - 1.76206; equivariance_value - 0.27986; equivariance_jacobian - 0.46824; disc_gan - 0.39758
00000056) perceptual - 108.88816; gen_gan - 0.57086; feature_matching - 2.31087; equivariance_value - 0.27048; equivariance_jacobian - 0.46502; disc_gan - 0.27204
00000057) perceptual - 109.07840; gen_gan - 0.66007; feature_matching - 2.65673; equivariance_value - 0.26150; equivariance_jacobian - 0.46097; disc_gan - 0.22055
00000058) perceptual - 108.86227; gen_gan - 0.65660; feature_matching - 2.65772; equivariance_value - 0.26590; equivariance_jacobian - 0.46256; disc_gan - 0.22332
00000059) perceptual - 108.81279; gen_gan - 0.65604; feature_matching - 2.65810; equivariance_value - 0.27779; equivariance_jacobian - 0.46569; disc_gan - 0.22660
00000060) perceptual - 108.91225; gen_gan - 0.65691; feature_matching - 2.66648; equivariance_value - 0.26651; equivariance_jacobian - 0.46113; disc_gan - 0.22473
00000061) perceptual - 108.47410; gen_gan - 0.61638; feature_matching - 2.46586; equivariance_value - 0.26966; equivariance_jacobian - 0.46378; disc_gan - 0.26120
00000062) perceptual - 108.63198; gen_gan - 0.66010; feature_matching - 2.66168; equivariance_value - 0.28372; equivariance_jacobian - 0.46553; disc_gan - 0.22112
00000063) perceptual - 108.40616; gen_gan - 0.65781; feature_matching - 2.66430; equivariance_value - 0.27218; equivariance_jacobian - 0.46157; disc_gan - 0.22466
00000064) perceptual - 108.37229; gen_gan - 0.66099; feature_matching - 2.66950; equivariance_value - 0.26734; equivariance_jacobian - 0.45852; disc_gan - 0.22744
00000065) perceptual - 108.22717; gen_gan - 0.64755; feature_matching - 2.60427; equivariance_value - 0.26792; equivariance_jacobian - 0.45941; disc_gan - 0.23552
00000066) perceptual - 108.50709; gen_gan - 0.66137; feature_matching - 2.66978; equivariance_value - 0.28559; equivariance_jacobian - 0.46901; disc_gan - 0.21915
00000067) perceptual - 108.22466; gen_gan - 0.66338; feature_matching - 2.67635; equivariance_value - 0.27765; equivariance_jacobian - 0.46143; disc_gan - 0.22133
00000068) perceptual - 108.10513; gen_gan - 0.66464; feature_matching - 2.68130; equivariance_value - 0.27633; equivariance_jacobian - 0.45784; disc_gan - 0.22050
00000069) perceptual - 107.92184; gen_gan - 0.66334; feature_matching - 2.67615; equivariance_value - 0.28541; equivariance_jacobian - 0.46111; disc_gan - 0.22144
00000070) perceptual - 107.70759; gen_gan - 0.66365; feature_matching - 2.67411; equivariance_value - 0.27495; equivariance_jacobian - 0.45446; disc_gan - 0.22138
00000071) perceptual - 107.97046; gen_gan - 0.66758; feature_matching - 2.68970; equivariance_value - 0.28353; equivariance_jacobian - 0.45722; disc_gan - 0.21849
00000072) perceptual - 107.71601; gen_gan - 0.66960; feature_matching - 2.69050; equivariance_value - 0.29166; equivariance_jacobian - 0.45885; disc_gan - 0.21786
00000073) perceptual - 107.55038; gen_gan - 0.67021; feature_matching - 2.69405; equivariance_value - 0.27940; equivariance_jacobian - 0.45521; disc_gan - 0.21613
00000074) perceptual - 107.59147; gen_gan - 0.67082; feature_matching - 2.69565; equivariance_value - 0.28626; equivariance_jacobian - 0.45418; disc_gan - 0.21690
00000075) perceptual - 107.49586; gen_gan - 0.67299; feature_matching - 2.69488; equivariance_value - 0.28648; equivariance_jacobian - 0.45647; disc_gan - 0.21653
00000076) perceptual - 107.34782; gen_gan - 0.67315; feature_matching - 2.70130; equivariance_value - 0.29859; equivariance_jacobian - 0.45487; disc_gan - 0.21471
00000077) perceptual - 107.38080; gen_gan - 0.67404; feature_matching - 2.69913; equivariance_value - 0.32222; equivariance_jacobian - 0.46480; disc_gan - 0.21534
00000078) perceptual - 107.28460; gen_gan - 0.67600; feature_matching - 2.70855; equivariance_value - 0.31403; equivariance_jacobian - 0.46049; disc_gan - 0.21282
00000079) perceptual - 107.00259; gen_gan - 0.67600; feature_matching - 2.70842; equivariance_value - 0.29015; equivariance_jacobian - 0.45173; disc_gan - 0.21465
00000080) perceptual - 107.00052; gen_gan - 0.67678; feature_matching - 2.71506; equivariance_value - 0.28980; equivariance_jacobian - 0.45338; disc_gan - 0.21142
00000081) perceptual - 106.79164; gen_gan - 0.64375; feature_matching - 2.54156; equivariance_value - 0.28551; equivariance_jacobian - 0.45208; disc_gan - 0.24309
00000082) perceptual - 106.92809; gen_gan - 0.67779; feature_matching - 2.71245; equivariance_value - 0.29116; equivariance_jacobian - 0.45202; disc_gan - 0.20908
00000083) perceptual - 106.97665; gen_gan - 0.67852; feature_matching - 2.71942; equivariance_value - 0.29957; equivariance_jacobian - 0.45394; disc_gan - 0.21107
00000084) perceptual - 106.85758; gen_gan - 0.67969; feature_matching - 2.70653; equivariance_value - 0.27912; equivariance_jacobian - 0.44791; disc_gan - 0.21200
00000085) perceptual - 106.93966; gen_gan - 0.67884; feature_matching - 2.71779; equivariance_value - 0.29603; equivariance_jacobian - 0.45350; disc_gan - 0.21145
00000086) perceptual - 106.86576; gen_gan - 0.68185; feature_matching - 2.72516; equivariance_value - 0.30453; equivariance_jacobian - 0.45073; disc_gan - 0.20938
00000087) perceptual - 106.23597; gen_gan - 0.47466; feature_matching - 1.81239; equivariance_value - 0.29261; equivariance_jacobian - 0.45300; disc_gan - 0.36669
00000088) perceptual - 106.39493; gen_gan - 0.68489; feature_matching - 2.66625; equivariance_value - 0.28099; equivariance_jacobian - 0.44432; disc_gan - 0.20205
00000089) perceptual - 106.31949; gen_gan - 0.68084; feature_matching - 2.71962; equivariance_value - 0.28987; equivariance_jacobian - 0.45054; disc_gan - 0.20758
00000090) perceptual - 106.49042; gen_gan - 0.68170; feature_matching - 2.71902; equivariance_value - 0.29881; equivariance_jacobian - 0.45215; disc_gan - 0.20991
00000091) perceptual - 106.28352; gen_gan - 0.68198; feature_matching - 2.72413; equivariance_value - 0.29038; equivariance_jacobian - 0.44966; disc_gan - 0.20846
00000092) perceptual - 106.09528; gen_gan - 0.68138; feature_matching - 2.72018; equivariance_value - 0.27638; equivariance_jacobian - 0.44445; disc_gan - 0.21000
00000093) perceptual - 106.34380; gen_gan - 0.68503; feature_matching - 2.73288; equivariance_value - 0.30186; equivariance_jacobian - 0.44893; disc_gan - 0.20720
00000094) perceptual - 106.03691; gen_gan - 0.67085; feature_matching - 2.66210; equivariance_value - 0.27290; equivariance_jacobian - 0.44171; disc_gan - 0.22255
00000095) perceptual - 106.10703; gen_gan - 0.68509; feature_matching - 2.73309; equivariance_value - 0.27407; equivariance_jacobian - 0.44335; disc_gan - 0.20634
00000096) perceptual - 106.27911; gen_gan - 0.68658; feature_matching - 2.74002; equivariance_value - 0.28087; equivariance_jacobian - 0.44271; disc_gan - 0.20574
00000097) perceptual - 105.88652; gen_gan - 0.68604; feature_matching - 2.73935; equivariance_value - 0.27513; equivariance_jacobian - 0.44143; disc_gan - 0.20712
00000098) perceptual - 106.11625; gen_gan - 0.68857; feature_matching - 2.73980; equivariance_value - 0.26853; equivariance_jacobian - 0.43948; disc_gan - 0.20513
00000099) perceptual - 105.84453; gen_gan - 0.68770; feature_matching - 2.73911; equivariance_value - 0.26908; equivariance_jacobian - 0.44045; disc_gan - 0.20662
"""
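
For anyone who wants to compare loss trends across runs, a log in this format could be parsed roughly as follows. This is only a minimal sketch: it assumes each entry has the form `NNNNNNNN) name - value; name - value; ...` as in the log above, and the helper names `parse_log` and `trend` are hypothetical, not part of the repository.

```python
import re
from typing import Dict, List

# Assumed entry shape: "00000012) perceptual - 118.25211; gen_gan - 0.57068; ..."
# An entry runs from its 8-digit index up to the next index or the end of the text,
# so entries that were wrapped across lines are still picked up.
ENTRY_RE = re.compile(r"(\d{8})\)\s*(.*?)(?=\d{8}\)|$)", re.S)
PAIR_RE = re.compile(r"(\w+)\s*-\s*(-?\d+(?:\.\d+)?)")


def parse_log(text: str) -> Dict[str, List[float]]:
    """Return {loss_name: [one value per log entry]} in log order."""
    series: Dict[str, List[float]] = {}
    for _index, body in ENTRY_RE.findall(text):
        for name, value in PAIR_RE.findall(body):
            series.setdefault(name, []).append(float(value))
    return series


def trend(values: List[float]) -> float:
    """Least-squares slope of the values against their position in the log."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


if __name__ == "__main__":
    # First three entries of the log above, used as a small sample.
    sample = (
        "00000000) perceptual - 144.01511; gen_gan - 0.43809; disc_gan - 0.38356 "
        "00000001) perceptual - 132.22182; gen_gan - 0.50871; disc_gan - 0.32559 "
        "00000002) perceptual - 128.32959; gen_gan - 0.45090; disc_gan - 0.36615"
    )
    for name, values in parse_log(sample).items():
        print(f"{name}: first={values[0]:.5f} last={values[-1]:.5f} slope={trend(values):+.5f}")
```

To use it on a real run, read the contents of log.txt into a string and pass it to `parse_log`. A positive slope for `gen_gan` together with a negative slope for `disc_gan` would match the trend described above; the slope is a rough summary only and ignores the occasional spikes in individual entries.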

My email is qiulin_wang@foxmail.com. Thanks in advance!

damengdameng commented 3 years ago

Hi, I am trying to train this model on VoxCeleb2 too, and I have some questions. Are you Chinese? May I have your WeChat or QQ? I'm sorry if this is abrupt. @LKeaning