total_loss is too high when training head

yerfor / GeneFacePlusPlus

GeneFace++: Generalized and Stable Real-Time 3D Talking Face Generation; Official Code

MIT License

1.5k stars, 219 forks

total_loss is too high when training head #112

Open RayDean opened 6 months ago

RayDean commented 6 months ago

When I trained the head NeRF and training reached 250K steps, the total_loss was very high, nearly 580, while the other losses seem normal. Partial logs:

| Validation results@248000: {'total_loss': 582.6377294922, 'mse_loss': 0.0012603372, 'sr_mse_loss': 0.0013412535, 'lpips_loss': 1.0247015435, 'sr_lpips_loss': 1.1453602004, 'sr_lip_lpips_loss': 1.0416638839, 'lambda_ambient': 579.4234008789}
03/06 04:17:08 PM Epoch 00000@248000: saving model to checkpoints/motion2video_nerf/meimei_head/model_ckpt_steps_248000.ckpt
03/06 04:17:08 PM Delete ckpt: model_ckpt_steps_246000.ckpt

Is this high loss normal? If not, how can I lower the total_loss? Thanks.
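One observation from the logged numbers alone: the reported total_loss matches the plain sum of every other logged entry, including lambda_ambient, which by its name looks like a loss weight and not a loss term. The sketch below only redoes that arithmetic with the values copied from the log at step 248000. Whether the trainer really adds every logged entry into total_loss is an assumption here and has not been checked against the GeneFace++ code.

```python
# Values copied from the validation log at step 248000.
val = {
    "total_loss": 582.6377294922,
    "mse_loss": 0.0012603372,
    "sr_mse_loss": 0.0013412535,
    "lpips_loss": 1.0247015435,
    "sr_lpips_loss": 1.1453602004,
    "sr_lip_lpips_loss": 1.0416638839,
    "lambda_ambient": 579.4234008789,
}

# Sum of every logged entry except total_loss itself.
others = sum(v for k, v in val.items() if k != "total_loss")

# The same sum with lambda_ambient left out.
without_lambda = others - val["lambda_ambient"]

# How far the logged total_loss is from the plain sum.
gap = abs(val["total_loss"] - others)

print(f"sum of logged entries:      {others:.4f}")
print(f"sum without lambda_ambient: {without_lambda:.4f}")
print(f"gap to logged total_loss:   {gap:.6f}")
```

If that reading holds, the loss entries themselves add up to roughly 3.21, and almost all of the 582 comes from lambda_ambient being included in the reported total.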

RayDean commented 6 months ago

And when the 250K steps finished, the final total_loss is inf; lpips_loss and sr_lpips_loss are also inf.

| Training end.. Epoch 0 ended. Steps: 250001. {'total_loss': inf, 'mse_loss': 0.0024544131240717033, 'weights_entropy_loss': 0.050688008500976045, 'num_non_facemask': 56165.82106877656, 'ambient_loss': 2.8842663650615378e-08, 'sr_mse_loss': 0.0008115496115366654, 'lambda_ambient': 469.427371226522, 'head_psnr': 27.943281164014728, 'density_grid_info_min_density': -1.0, 'density_grid_info_max_density': 364738707.3452703, 'density_grid_info_mean_density': 1790.830806371328, 'density_grid_info_occupancy_rate': 0.25496578732052366, 'density_grid_info_step_mean_count': 299778.5135135135, 'lpips_loss': inf, 'sr_lpips_loss': inf, 'sr_lip_lpips_loss': 1.1583641622033645}

Is this normal? How can I fix it? Thanks.
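A small sketch for reading this log: it lists which entries are non-finite, then shows that a single inf step is enough to make an averaged value inf. The helper `non_finite_keys` and the per-step values in `steps` are hypothetical, made up for illustration. That the end-of-training numbers are averages over steps is also an assumption (fractional values such as num_non_facemask suggest it, but it is not confirmed from the code).

```python
import math

def non_finite_keys(metrics):
    """Return the names of entries that are inf or NaN (hypothetical helper)."""
    return [k for k, v in metrics.items() if not math.isfinite(v)]

# Subset of the end-of-training log, copied from the post above.
end_of_training = {
    "total_loss": math.inf,
    "mse_loss": 0.0024544131240717033,
    "sr_mse_loss": 0.0008115496115366654,
    "lpips_loss": math.inf,
    "sr_lpips_loss": math.inf,
    "sr_lip_lpips_loss": 1.1583641622033645,
}

bad = non_finite_keys(end_of_training)
print("non-finite entries:", bad)

# Hypothetical per-step LPIPS values: one inf step makes the mean inf,
# even if every other step is finite.
steps = [1.02, 1.05, math.inf, 0.98]
mean = sum(steps) / len(steps)
print("mean over steps:", mean)
```

Under that assumption, the inf in the final summary could come from as little as one bad step during the epoch, which would be consistent with the validation result at step 248000 still being finite.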

Oyiyi commented 6 months ago

Same issue.

MohitPanpaliya commented 5 months ago


For how many steps does the head NeRF training run, and how long does it take? Can we stop the process and then resume it from the same checkpoint?