YudongGuo / AD-NeRF

This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".
MIT License

Training on new dataset but failed to get required poses for each frame (NaN value) #84

Open TimmmYang opened 2 years ago

TimmmYang commented 2 years ago

Hello, the problem is that when I tried training on a new dataset (similar talking-head video frames, [512, 512] for [H, W]), I failed to get the poses (R, t) for each frame in step 6. I went through face_tracker.py and found that when the program regresses the poses in the final stage (after sampling frames for each batch), one sample of sel_euler and sel_trans turns into NaN at a certain iteration after the optimization step, at this line: https://github.com/YudongGuo/AD-NeRF/blob/9fdcd7e1352aa74e26baa8e60c862f9fbd7933bf/data_util/face_tracking/face_tracker.py#L332

I checked the loss before this step and it is normal, around 34. I'm not sure what happened in the optimization step. What should I do to prevent this? Would reducing the learning rate or adjusting other hyperparameters help? Or can I use another head-pose detector, such as 3DDFA, to get the head poses?
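One generic way to keep a single bad iteration from poisoning the pose parameters is to check the updated values for NaN/Inf before accepting them, and skip the update otherwise. The sketch below is not from the repository; `safe_step`, its plain-list parameters, and the manual SGD update are all hypothetical, just illustrating the guard pattern around an optimizer step like the one in face_tracker.py:

```python
import math

def safe_step(params, grads, lr):
    """Apply a gradient step, but reject it if any parameter becomes
    NaN or Inf. Hypothetical helper; in the real tracker the same check
    could be applied to sel_euler / sel_trans after optimizer.step().
    """
    updated = [p - lr * g for p, g in zip(params, grads)]
    if any(math.isnan(p) or math.isinf(p) for p in updated):
        # Skip this update entirely rather than propagating NaN.
        return params, False
    return updated, True

# A finite gradient is applied normally...
params, ok = safe_step([1.0, 2.0], [0.1, -0.2], lr=0.5)
# ...while a NaN gradient leaves the parameters untouched.
params2, ok2 = safe_step([1.0, 2.0], [float("nan"), 0.0], lr=0.5)
```

In a PyTorch loop the equivalent check would be `torch.isfinite(sel_euler).all()` after `optimizer.step()`, combined with gradient clipping or a lower learning rate if the rejections happen often.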

Here are the loss changes in batch 1, iter 2: [screenshot]

YudongGuo commented 2 years ago

You can modify the rasterization settings to blur_radius=0, faces_per_pixel=1.
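Assuming this refers to the PyTorch3D rasterizer used by the face tracker, the change would look like the fragment below. The exact file and the previous values are assumptions; only the two keyword arguments come from the comment above:

```python
# Hypothetical location: wherever the PyTorch3D rasterizer is configured
# in the face-tracking code (e.g. the 3DMM rendering module).
from pytorch3d.renderer import RasterizationSettings

raster_settings = RasterizationSettings(
    image_size=512,       # matches the [512, 512] frames mentioned above
    blur_radius=0.0,      # disable soft-rasterization blurring
    faces_per_pixel=1,    # keep only the nearest face per pixel
)
```

Disabling the soft rasterization removes the blending across nearby faces, which is the part of the pipeline suspected of producing the NaN here.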

TimmmYang commented 2 years ago

Hi, I just tried that and still got NaN.

TimmmYang commented 2 years ago

It seems to be a problem with PyTorch3D. Is there any way to solve it?

aurelianocyp commented 1 year ago

I also encountered this problem. Have you solved it now?

aurelianocyp commented 1 year ago

I tried blur_radius=0, faces_per_pixel=1 and the NaN disappeared.