YudongGuo / AD-NeRF

This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".
MIT License

Training on new dataset but failed to get required poses for each frame (NaN value) #84

Open TimmmYang opened 2 years ago

TimmmYang commented 2 years ago

Hello, the problem is that when I tried training on a new dataset (similar talking-head video frames, [512, 512] for [H, W]), I failed to get the poses (R, t) for each frame in step 6. I went through face_tracker.py and found that when the program regresses the poses in the final stage (after sampling frames for each batch), one sample of sel_euler and sel_trans turns into NaN at a certain iteration after the optimization step, at this line: https://github.com/YudongGuo/AD-NeRF/blob/9fdcd7e1352aa74e26baa8e60c862f9fbd7933bf/data_util/face_tracking/face_tracker.py#L332

I checked the loss before this step and it is normal, around 34. I'm not sure what happened in the optimization step. What should I do to prevent this? Would reducing the learning rate or adjusting other hyperparameters help? Or can I use another head-pose detector, such as 3DDFA, to get the head poses?
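One generic way to keep a single bad iteration from poisoning the pose parameters is to check the updated values for NaN/Inf before accepting them, and skip the update otherwise. The sketch below is not from the repository; `safe_step`, its plain-list parameters, and the manual SGD update are all hypothetical, just illustrating the guard pattern around an optimizer step like the one in face_tracker.py:

```python
import math

def safe_step(params, grads, lr):
    """Apply a gradient step, but reject it if any parameter becomes
    NaN or Inf. Hypothetical helper; in the real tracker the same check
    could be applied to sel_euler / sel_trans after optimizer.step().
    """
    updated = [p - lr * g for p, g in zip(params, grads)]
    if any(math.isnan(p) or math.isinf(p) for p in updated):
        # Skip this update entirely rather than propagating NaN.
        return params, False
    return updated, True

# A finite gradient is applied normally...
params, ok = safe_step([1.0, 2.0], [0.1, -0.2], lr=0.5)
# ...while a NaN gradient leaves the parameters untouched.
params2, ok2 = safe_step([1.0, 2.0], [float("nan"), 0.0], lr=0.5)
```

In a PyTorch loop the equivalent check would be `torch.isfinite(sel_euler).all()` after `optimizer.step()`, combined with gradient clipping or a lower learning rate if the rejections happen often.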

Here are the loss changes in batch 1, iter 2: [screenshot]

YudongGuo commented 2 years ago

You can modify the rasterization settings to blur_radius=0, faces_per_pixel=1.
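Assuming this refers to the PyTorch3D rasterizer used by the face tracker, the change would look like the fragment below. The exact file and the previous values are assumptions; only the two keyword arguments come from the comment above:

```python
# Hypothetical location: wherever the PyTorch3D rasterizer is configured
# in the face-tracking code (e.g. the 3DMM rendering module).
from pytorch3d.renderer import RasterizationSettings

raster_settings = RasterizationSettings(
    image_size=512,       # matches the [512, 512] frames mentioned above
    blur_radius=0.0,      # disable soft-rasterization blurring
    faces_per_pixel=1,    # keep only the nearest face per pixel
)
```

Disabling the soft rasterization removes the blending across nearby faces, which is the part of the pipeline suspected of producing the NaN here.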

TimmmYang commented 2 years ago

Hi, I just tried that and still got NaN.

TimmmYang commented 2 years ago

It seems to be a problem with PyTorch3D. Is there any way to solve it?

aurelianocyp commented 1 year ago

I also encountered this problem. Have you solved it now?

aurelianocyp commented 1 year ago

I tried blur_radius=0, faces_per_pixel=1 and the NaN disappeared.