Fictionarry / TalkingGaussian

[ECCV'24] TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting
https://fictionarry.github.io/TalkingGaussian/

train_mouth.py training error #15

Open yulj21 opened 1 month ago

yulj21 commented 1 month ago

Training progress: 4%|####2 | 1999/50000 [00:10<04:10, 191.30it/s, Loss=0.00217, AU25=1.2-1.3]
[ITER 2000] Evaluating test: L1 0.02868325263261795 PSNR 15.645081138610841 [15/07 10:02:28]
[ITER 2000] Evaluating train: L1 0.029025918990373614 PSNR 15.591155815124512 [15/07 10:02:29]
Training progress: 6%|######2 | 2999/50000 [00:17<04:06, 190.47it/s, Loss=0.00104, AU25=1.2-1.3]
Training progress: 8%|########4 | 3999/50000 [00:30<09:08, 83.87it/s, Loss=0.00201, AU25=1.1-1.3]
[ITER 4000] Evaluating test: L1 0.02868325263261795 PSNR 15.645081138610841 [15/07 10:02:48]
[ITER 4000] Evaluating train: L1 0.029025918990373614 PSNR 15.591155815124512 [15/07 10:02:48]
Training progress: 12%|############7 | 6000/50000 [00:57<08:52, 82.57it/s, Loss=0.00158, AU25=1.1-1.3]
[ITER 6000] Evaluating test: L1 0.02868325263261795 PSNR 15.645081138610841 [15/07 10:03:15]
[ITER 6000] Evaluating train: L1 0.029025918990373614 PSNR 15.591155815124512 [15/07 10:03:16]
Training progress: 16%|################9 | 7999/50000 [01:23<08:30, 82.27it/s, Loss=0.00150, AU25=1.0-1.3]
[ITER 8000] Evaluating test: L1 0.02868325263261795 PSNR 15.645081138610841 [15/07 10:03:41]
[ITER 8000] Evaluating train: L1 0.029025918990373614 PSNR 15.591155815124512 [15/07 10:03:41]
Training progress: 18%|###################2 | 9099/50000 [01:38<08:14, 82.74it/s, Loss=0.00154, AU25=1.0-1.3]
Traceback (most recent call last):
  File "train_mouth.py", line 335, in <module>
    training(lp.extract(args), op.extract(args), pp.extract(args), args.test_iterations, args.save_iterations, args.checkpoint_iterations, args.start_checkpoint, args.debug_from)
  File "train_mouth.py", line 148, in training
    render_pkg = render_motion_mouth(viewpoint_cam, gaussians, motion_net, pipe, background)
  File "/home/appuser/yulj21/talking_face/talkingGaussian/TalkingGaussian-track-stable-1/gaussian_renderer/__init__.py", line 238, in render_motion_mouth
    motion_preds = motion_net(pc.get_xyz, audio_feat)
  File "/home/appuser/yulj21/talking_face/talkingGaussian/anaconda37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/appuser/yulj21/talking_face/talkingGaussian/TalkingGaussian-track-stable-1/scene/motion_net.py", line 323, in forward
    enc_a = self.encode_audio(a)
  File "/home/appuser/yulj21/talking_face/talkingGaussian/TalkingGaussian-track-stable-1/scene/motion_net.py", line 296, in encode_audio
    enc_a = self.audio_net(a) # [1/8, 64]
  File "/home/appuser/yulj21/talking_face/talkingGaussian/anaconda37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/appuser/yulj21/talking_face/talkingGaussian/TalkingGaussian-track-stable-1/scene/motion_net.py", line 63, in forward
    x = self.encoder_conv(x).squeeze(-1)
  File "/home/appuser/yulj21/talking_face/talkingGaussian/anaconda37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/appuser/yulj21/talking_face/talkingGaussian/anaconda37/lib/python3.7/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/home/appuser/yulj21/talking_face/talkingGaussian/anaconda37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/appuser/yulj21/talking_face/talkingGaussian/anaconda37/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 307, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/appuser/yulj21/talking_face/talkingGaussian/anaconda37/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 304, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: CUDA error: invalid configuration argument
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Exception ignored in: <function tqdm.__del__ at 0x148c5af108c0>
Traceback (most recent call last):
  File "/home/appuser/yulj21/talking_face/talkingGaussian/anaconda37/lib/python3.7/site-packages/tqdm/std.py", line 1065, in __del__
  File "/home/appuser/yulj21/talking_face/talkingGaussian/anaconda37/lib/python3.7/site-packages/tqdm/std.py", line 1248, in close
  File "/home/appuser/yulj21/talking_face/talkingGaussian/anaconda37/lib/python3.7/site-packages/tqdm/std.py", line 564, in _decr_instances
  File "/home/appuser/yulj21/talking_face/talkingGaussian/anaconda37/lib/python3.7/site-packages/tqdm/_monitor.py", line 51, in exit
  File "/home/appuser/yulj21/talking_face/talkingGaussian/anaconda37/lib/python3.7/threading.py", line 522, in set
  File "/home/appuser/yulj21/talking_face/talkingGaussian/anaconda37/lib/python3.7/threading.py", line 365, in notify_all
  File "/home/appuser/yulj21/talking_face/talkingGaussian/anaconda37/lib/python3.7/threading.py", line 348, in notify
TypeError: 'NoneType' object is not callable

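In preprocessing, I replaced the camera pose estimation method with the one used in SyncTalk. In other words, I changed how transforms_train.json and transforms_val.json are generated; the rest of the data preprocessing is unchanged. However, training the mouth-region model fails with the error above. @Fictionarry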

Fictionarry commented 1 month ago

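Judging from where the error is raised, it should not be related to the pose; it looks more like a problem with the audio feature. Did you change anything on the audio side? PyTorch reporting RuntimeError: CUDA error: invalid configuration argument can also simply mean that the GPU ran out of memory.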
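The error message itself recommends passing CUDA_LAUNCH_BLOCKING=1, because CUDA errors are reported asynchronously and the traceback may point at the wrong call. Below is a minimal sketch of re-running the failing script with that variable set. The helper names `blocking_env` and `rerun_command` are hypothetical, and the script arguments are placeholders for whatever flags the failing run used.

```python
import os
import sys


def blocking_env(base_env=None):
    """Return a copy of the environment with synchronous CUDA launches enabled.

    CUDA_LAUNCH_BLOCKING=1 is the variable named in the PyTorch error message;
    it makes kernel launches synchronous so the traceback shows the real call.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def rerun_command(script="train_mouth.py", extra_args=()):
    """Build the command line for the re-run.

    extra_args is a placeholder: pass the same flags as the failing run.
    """
    return [sys.executable, script, *extra_args]
```

To use it, call `subprocess.run(rerun_command(extra_args=[...]), env=blocking_env())` with the original arguments. Training will be slower with blocking launches, so this is only meant for locating the failing kernel.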

yulj21 commented 1 month ago

I changed two parameters in scene/camera.py: self.zfar=1 and self.scale=4. After that, training ran normally. The values are taken from GeneFace.
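For readers unfamiliar with these two parameters, here is a rough sketch of where they typically enter the camera math. It assumes the camera class follows the conventions of the original 3D Gaussian Splatting code (zfar feeds the perspective projection matrix, scale multiplies the camera center); this has not been checked against this repository. The value znear=0.01 in the usage note and all function names are assumptions for illustration.

```python
import math


def projection_matrix(znear, zfar, fov_x, fov_y):
    """Row-major 4x4 perspective projection for a symmetric frustum.

    Mirrors the form used in the original 3DGS utilities (assumption).
    """
    P = [[0.0] * 4 for _ in range(4)]
    P[0][0] = 1.0 / math.tan(fov_x / 2)
    P[1][1] = 1.0 / math.tan(fov_y / 2)
    P[3][2] = 1.0  # w' = z, so the divide by w is a divide by view depth
    P[2][2] = zfar / (zfar - znear)
    P[2][3] = -(zfar * znear) / (zfar - znear)
    return P


def ndc_depth(z, znear, zfar):
    """Normalized depth of a point at view depth z: 0 at znear, 1 at zfar."""
    P = projection_matrix(znear, zfar, 1.0, 1.0)
    return (P[2][2] * z + P[2][3]) / z


def scaled_camera_center(center, translate=(0.0, 0.0, 0.0), scale=1.0):
    """Camera position after the (center + translate) * scale adjustment."""
    return tuple((c + t) * scale for c, t in zip(center, translate))
```

For example, `scaled_camera_center((0.0, 0.1, 0.5), scale=4)` places the camera four times farther from the origin, and `ndc_depth(z, 0.01, 1)` shows how depths are normalized when zfar is 1. The sketch only shows how the parameters act; it does not explain why these particular values resolved the error.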

yulj21 commented 1 month ago

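Nothing else was changed; only transforms_train.json and transforms_val.json differ from the original pipeline. My feeling is that the camera parameters were simply set unreasonably.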
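One way to check whether two pose pipelines produce cameras at a different scale is to compare how far the camera centers sit from the origin in each transforms file. The sketch below assumes the common NeRF-style layout, where each entry of "frames" carries a 4x4 camera-to-world "transform_matrix"; verify this against your own files. The function names are hypothetical.

```python
import json
import math
from statistics import mean


def camera_distances(transforms):
    """Distance of each camera center from the origin.

    Assumes transforms["frames"][i]["transform_matrix"] is a 4x4
    camera-to-world matrix whose last column holds the camera position.
    """
    dists = []
    for frame in transforms["frames"]:
        m = frame["transform_matrix"]
        x, y, z = m[0][3], m[1][3], m[2][3]
        dists.append(math.sqrt(x * x + y * y + z * z))
    return dists


def summarize(path):
    """Load one transforms file and report min/mean/max camera distance."""
    with open(path, "r", encoding="utf-8") as f:
        d = camera_distances(json.load(f))
    return {"min": min(d), "mean": mean(d), "max": max(d)}
```

Running `summarize` on the transforms_train.json from each preprocessing variant and comparing the means gives a quick indication of whether the two methods disagree by a constant scale factor.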