facebookresearch / meshtalk

Code for MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement

Why are normalization operations like BN or LN not used in the encoders and vertex_unet? #35

Closed xuzheyuan624 closed 1 year ago

xuzheyuan624 commented 1 year ago

When I train step 1 on my own dataset, I found that the gradients of the encoder are very small (around 1e-10), which may be caused by vanishing gradients. I think normalization operations might solve this problem. So why are normalization operations like BN or LN not used in the encoders and vertex_unet? Would they have a negative effect?
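For anyone who wants to check whether they see the same symptom, per-parameter gradient norms can be printed after a backward pass. This is a minimal PyTorch sketch: the `grad_norms` helper and the small `encoder` are hypothetical stand-ins, not code from the MeshTalk repository.

```python
import torch
import torch.nn as nn


def grad_norms(model: nn.Module) -> dict:
    """Return the L2 norm of the gradient of every parameter that has one."""
    return {
        name: p.grad.detach().norm().item()
        for name, p in model.named_parameters()
        if p.grad is not None
    }


# Hypothetical stand-in encoder (assumption); replace with the real model.
torch.manual_seed(0)
encoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))

x = torch.randn(4, 16)           # dummy batch
loss = encoder(x).pow(2).mean()  # dummy loss, only to produce gradients
loss.backward()

norms = grad_norms(encoder)
for name, value in norms.items():
    print(f"{name}: {value:.3e}")
```

Calling the helper right after `loss.backward()` and before `optimizer.step()` in a real training loop shows which layers receive near-zero gradients.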

alexanderrichard commented 1 year ago

We didn't find it necessary. I did not observe convergence problems when training it, but as usual feel free to use normalization etc.
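As an illustration of what adding normalization could look like, here is a minimal PyTorch sketch of a linear block followed by `nn.LayerNorm`. The `NormalizedBlock` class, its dimensions, and the choice of activation are assumptions made for this example; they are not taken from the MeshTalk encoders or vertex_unet, and whether this helps on a given dataset is untested here.

```python
import torch
import torch.nn as nn


class NormalizedBlock(nn.Module):
    """Hypothetical linear block with LayerNorm (not from the MeshTalk code)."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        self.norm = nn.LayerNorm(out_dim)  # normalizes over the feature axis
        self.act = nn.LeakyReLU(0.2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.norm(self.linear(x)))


block = NormalizedBlock(16, 32)
out = block(torch.randn(4, 10, 16))  # (batch, time, features)
print(tuple(out.shape))
```

LayerNorm is used in this sketch rather than BatchNorm because it does not depend on batch statistics, which makes it easier to apply to variable-length sequences.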

haonanhe commented 1 year ago

Hi, I'm facing the same problem when training on VOCASET: the gradients are so small that the generated faces are almost identical to the template (the mouth never opens, and the vertices stay the same throughout a sequence). Have you solved the problem, @xuzheyuan624? I would really appreciate any suggestions.