Fictionarry / TalkingGaussian

[ECCV'24] TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting
https://fictionarry.github.io/TalkingGaussian/

[Windows 10, CUDA 11.8] TypeError: grid_encode_forward(): incompatible function arguments; FileNotFoundError: chkpnt_face_latest.pth; ZeroDivisionError: division by zero #42

Open linhcentrio opened 16 hours ago

linhcentrio commented 16 hours ago

(D:\Talking_head\SyncTalk\venv) D:\Gaussian\TalkingGaussian>.\scripts\train_xx.bat data\may output\may_project 0
Optimizing output\may_project
Output folder: output\may_project [19/09 17:49:49]
Found transforms_train.json file, assuming Blender data set! [19/09 17:49:49]
Reading Training Transforms [19/09 17:49:49]
5520it [00:03, 1738.82it/s]
5520it [05:35, 16.45it/s]
Reading Test Transforms [19/09 17:55:29]
553it [00:00, 1818.82it/s]
553it [00:37, 14.56it/s]
Generating random point cloud (10000)... [19/09 17:56:11]
Loading Training Cameras [19/09 17:56:13]
Loading Test Cameras [19/09 17:56:37]
Number of points at initialisation : 10000 [19/09 17:56:39]
Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off] [19/09 17:56:45]
D:\Talking_head\SyncTalk\venv\lib\site-packages\torchvision\models\_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
D:\Talking_head\SyncTalk\venv\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing weights=AlexNet_Weights.IMAGENET1K_V1. You can also use weights=AlexNet_Weights.DEFAULT to get the most up-to-date weights.
  warnings.warn(msg)
Loading model from: D:\Talking_head\SyncTalk\venv\lib\site-packages\lpips\weights\v0.1\alex.pth [19/09 17:56:45]
Training progress: 4%|####4 | 2000/50000 [00:28<10:42, 74.71it/s, Loss=nan, AU25=1.2-1.3]
D:\Talking_head\SyncTalk\venv\lib\site-packages\torch\utils\tensorboard\summary.py:444: RuntimeWarning: invalid value encountered in cast
  tensor = (tensor * scale_factor).clip(0, 255).astype(np.uint8)

[ITER 2000] Evaluating test: L1 0.11171912252902985 PSNR 13.904804420471192 [19/09 17:57:16]

[ITER 2000] Evaluating train: L1 0.11286026984453201 PSNR 13.85939292907715 [19/09 17:57:18]
Training progress: 6%|######6 | 2990/50000 [00:46<10:55, 71.73it/s, Loss=nan, AU25=1.2-1.3]
Traceback (most recent call last):
  File "D:\Gaussian\TalkingGaussian\train_mouth.py", line 328, in <module>
    training(lp.extract(args), op.extract(args), pp.extract(args), args.test_iterations, args.save_iterations, args.checkpoint_iterations, args.start_checkpoint, args.debug_from)
  File "D:\Gaussian\TalkingGaussian\train_mouth.py", line 141, in training
    render_pkg = render_motion_mouth(viewpoint_cam, gaussians, motion_net, pipe, background)
  File "D:\Gaussian\TalkingGaussian\gaussian_renderer\__init__.py", line 238, in render_motion_mouth
    motion_preds = motion_net(pc.get_xyz, audio_feat)
  File "D:\Talking_head\SyncTalk\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Gaussian\TalkingGaussian\scene\motion_net.py", line 321, in forward
    enc_x = self.encode_x(x, bound=self.bound)
  File "D:\Gaussian\TalkingGaussian\scene\motion_net.py", line 312, in encode_x
    feat_xy = self.encoder_xy(xy, bound=bound)
  File "D:\Talking_head\SyncTalk\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Gaussian\TalkingGaussian\gridencoder\grid.py", line 156, in forward
    outputs = grid_encode(inputs, self.embeddings, self.offsets, self.per_level_scale, self.base_resolution, inputs.requires_grad, self.gridtype_id, self.align_corners, self.interp_id)
  File "D:\Talking_head\SyncTalk\venv\lib\site-packages\torch\autograd\function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "D:\Talking_head\SyncTalk\venv\lib\site-packages\torch\cuda\amp\autocast_mode.py", line 98, in decorate_fwd
    return fwd(*args, **kwargs)
  File "D:\Gaussian\TalkingGaussian\gridencoder\grid.py", line 54, in forward
    _backend.grid_encode_forward(inputs, embeddings, offsets, outputs, B, D, C, L, S, H, dy_dx, gridtype, align_corners, interpolation)
TypeError: grid_encode_forward(): incompatible function arguments. The following argument types are supported:

  1. (arg0: torch.Tensor, arg1: torch.Tensor, arg2: torch.Tensor, arg3: torch.Tensor, arg4: int, arg5: int, arg6: int, arg7: int, arg8: float, arg9: int, arg10: Optional[torch.Tensor], arg11: int, arg12: bool) -> None

Invoked with: tensor([[nan, nan], [nan, nan], [nan, nan], ..., [nan, nan], [nan, nan], [nan, nan]], device='cuda:0', grad_fn=), Parameter containing: tensor([[ 8.5519e-05], [ 6.9541e-05], [ 6.5501e-05], ..., [-6.5308e-05], [ 9.4824e-05], [-7.8563e-06]], device='cuda:0', requires_grad=True), tensor([ 0, 4232, 8464, 12560, 16656, 20632, 24608, 28456, 32184, 35912, 39512, 43112, 46600], device='cuda:0', dtype=torch.int32), tensor([[[nan], [nan], [nan], ..., [nan], [nan], [nan]],

    [[nan],
     [nan],
     [nan],
     ...,
     [nan],
     [nan],
     [nan]],

    [[nan],
     [nan],
     [nan],
     ...,
     [nan],
     [nan],
     [nan]],

    ...,

    [[0.],
     [0.],
     [0.],
     ...,
     [0.],
     [0.],
     [0.]],

    [[0.],
     [0.],
     [0.],
     ...,
     [nan],
     [nan],
     [nan]],

    [[nan],
     [nan],
     [nan],
     ...,
     [nan],
     [nan],
     [nan]]], device='cuda:0'), 10000, 2, 1, 12, -0.013818463040458927, 64, tensor([[0., 0., 0.,  ..., 0., 0., 0.],
    [0., 0., 0.,  ..., 0., 0., 0.],
    [0., 0., 0.,  ..., 0., 0., 0.],
    ...,
    [0., 0., 0.,  ..., 0., 0., 0.],
    [0., 0., 0.,  ..., 0., 0., 0.],
    [0., 0., 0.,  ..., 0., 0., 0.]], device='cuda:0'), 0, False, 0

Training progress: 6%|######6 | 2990/50000 [00:57<15:04, 51.98it/s, Loss=nan, AU25=1.2-1.3]
Optimizing output\may_project
Output folder: output\may_project [19/09 17:57:53]
Found transforms_train.json file, assuming Blender data set! [19/09 17:57:53]
Reading Training Transforms [19/09 17:57:53]
5520it [00:03, 1691.19it/s]
5520it [05:29, 16.76it/s]
Reading Test Transforms [19/09 18:03:27]
553it [00:00, 1663.55it/s]
553it [00:36, 15.14it/s]
Generating random point cloud (2000)... [19/09 18:04:09]
Loading Training Cameras [19/09 18:04:10]
Loading Test Cameras [19/09 18:04:33]
Number of points at initialisation : 2000 [19/09 18:04:34]
Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off] [19/09 18:04:40]
D:\Talking_head\SyncTalk\venv\lib\site-packages\torchvision\models\_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
D:\Talking_head\SyncTalk\venv\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing weights=AlexNet_Weights.IMAGENET1K_V1. You can also use weights=AlexNet_Weights.DEFAULT to get the most up-to-date weights.
  warnings.warn(msg)
Loading model from: D:\Talking_head\SyncTalk\venv\lib\site-packages\lpips\weights\v0.1\alex.pth [19/09 18:04:41]
Training progress: 4%|####3 | 2000/50000 [00:27<10:13, 78.22it/s, Loss=nan, Mouth=5.7-16.6]
[ITER 2000] Evaluating test: L1 0.1120919130350414 PSNR 13.891403951142962 [19/09 18:05:13]

[ITER 2000] Evaluating train: L1 0.11286026984453201 PSNR 13.85939292907715 [19/09 18:05:17]
Traceback (most recent call last):
  File "D:\Gaussian\TalkingGaussian\train_face.py", line 394, in <module>
    training(lp.extract(args), op.extract(args), pp.extract(args), args.test_iterations, args.save_iterations, args.checkpoint_iterations, args.start_checkpoint, args.debug_from)
  File "D:\Gaussian\TalkingGaussian\train_face.py", line 241, in training
    training_report(tb_writer, iteration, Ll1, loss, l1_loss, iter_start.elapsed_time(iter_end), testing_iterations, scene, motion_net, render if iteration < warm_step else render_motion, (pipe, background))
  File "D:\Gaussian\TalkingGaussian\train_face.py", line 365, in training_report
    tb_writer.add_histogram("scene/opacity_histogram", scene.gaussians.get_opacity, iteration)
  File "D:\Talking_head\SyncTalk\venv\lib\site-packages\torch\utils\tensorboard\writer.py", line 485, in add_histogram
    histogram(tag, values, bins, max_bins=max_bins), global_step, walltime
  File "D:\Talking_head\SyncTalk\venv\lib\site-packages\torch\utils\tensorboard\summary.py", line 355, in histogram
    hist = make_histogram(values.astype(float), bins, max_bins)
  File "D:\Talking_head\SyncTalk\venv\lib\site-packages\torch\utils\tensorboard\summary.py", line 399, in make_histogram
    raise ValueError("The histogram is empty, please file a bug report.")
ValueError: The histogram is empty, please file a bug report.
Training progress: 4%|####3 | 2000/50000 [00:46<18:46, 42.63it/s, Loss=nan, Mouth=5.7-16.6]
Optimizing output\may_project
Output folder: output\may_project [19/09 18:05:38]
Found transforms_train.json file, assuming Blender data set! [19/09 18:05:38]
Reading Training Transforms [19/09 18:05:38]
5520it [00:03, 1666.87it/s]
5520it [05:47, 15.90it/s]
Reading Test Transforms [19/09 18:11:30]
553it [00:00, 1797.20it/s]
553it [00:36, 14.95it/s]
Generating random point cloud (10000)... [19/09 18:12:12]
Loading Training Cameras [19/09 18:12:14]
Loading Test Cameras [19/09 18:12:39]
Number of points at initialisation : 10000 [19/09 18:12:41]
Traceback (most recent call last):
  File "D:\Gaussian\TalkingGaussian\train_fuse.py", line 261, in <module>
    training(lp.extract(args), op.extract(args), pp.extract(args), args.test_iterations, args.save_iterations, args.checkpoint_iterations, args.start_checkpoint, args.debug_from)
  File "D:\Gaussian\TalkingGaussian\train_fuse.py", line 57, in training
    (model_params, motion_params, _, _) = torch.load(os.path.join(scene.model_path, "chkpnt_face_latest.pth"))
  File "D:\Talking_head\SyncTalk\venv\lib\site-packages\torch\serialization.py", line 791, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "D:\Talking_head\SyncTalk\venv\lib\site-packages\torch\serialization.py", line 271, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "D:\Talking_head\SyncTalk\venv\lib\site-packages\torch\serialization.py", line 252, in __init__
    super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'output\may_project\chkpnt_face_latest.pth'
Looking for config file in output\may_project\cfg_args
Config file found: output\may_project\cfg_args
Rendering output\may_project
Found transforms_train.json file, assuming Blender data set! [19/09 18:13:04]
Reading Test Transforms [19/09 18:13:04]
553it [00:00, 1716.39it/s]
553it [00:30, 18.11it/s]
Generating random point cloud (10000)... [19/09 18:13:36]
Loading Training Cameras [19/09 18:13:36]
Loading Test Cameras [19/09 18:13:40]
Number of points at initialisation : 10000 [19/09 18:13:41]
Traceback (most recent call last):
  File "D:\Gaussian\TalkingGaussian\synthesize_fuse.py", line 125, in <module>
    render_sets(model.extract(args), args.iteration, pipeline.extract(args), args.use_train, args.fast, args.dilate)
  File "D:\Gaussian\TalkingGaussian\synthesize_fuse.py", line 93, in render_sets
    (model_params, motion_params, model_mouth_params, motion_mouth_params) = torch.load(os.path.join(dataset.model_path, "chkpnt_fuse_latest.pth"))
  File "D:\Talking_head\SyncTalk\venv\lib\site-packages\torch\serialization.py", line 791, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "D:\Talking_head\SyncTalk\venv\lib\site-packages\torch\serialization.py", line 271, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "D:\Talking_head\SyncTalk\venv\lib\site-packages\torch\serialization.py", line 252, in __init__
    super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'output\may_project\chkpnt_fuse_latest.pth'
Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
D:\Talking_head\SyncTalk\venv\lib\site-packages\torchvision\models\_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
D:\Talking_head\SyncTalk\venv\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing weights=AlexNet_Weights.IMAGENET1K_V1. You can also use weights=AlexNet_Weights.DEFAULT to get the most up-to-date weights.
  warnings.warn(msg)
Loading model from: D:\Talking_head\SyncTalk\venv\lib\site-packages\lpips\weights\v0.1\alex.pth
Traceback (most recent call last):
  File "D:\Gaussian\TalkingGaussian\metrics.py", line 215, in <module>
    print(lmd_meter.report())
  File "D:\Gaussian\TalkingGaussian\metrics.py", line 102, in report
    return f'LMD ({self.backend}) = {self.measure():.6f}'
  File "D:\Gaussian\TalkingGaussian\metrics.py", line 96, in measure
    return self.V / self.N
ZeroDivisionError: division by zero

Fictionarry commented 16 hours ago

The traceback points to the gridencoder. I see you are using the environment set up for SyncTalk, but the gridencoder implementation in our repo differs slightly from the one in SyncTalk. Try reinstalling the gridencoder from our code, or align the gridencoder in our repo with the one in SyncTalk.
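
To see which gridencoder build the active environment actually resolves, one option is to ask Python where it would import the module from. This is only a minimal sketch: `module_origin` is a hypothetical helper, and the module names `gridencoder` and `_gridencoder` (the compiled extension) are assumptions about how the package is installed, not something confirmed in this thread.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None.

    None means the module cannot be found in the current environment.
    """
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Assumed module names; adjust them to match the actual install.
    for name in ("gridencoder", "_gridencoder"):
        print(name, "->", module_origin(name))
```

If the printed path for the compiled extension lies inside the SyncTalk venv's site-packages rather than a build made from this repo, that would be consistent with the signature mismatch in the `TypeError` above.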

There is another problem here:

raise ValueError("The histogram is empty, please file a bug report.")
ValueError: The histogram is empty, please file a bug report.

Try replacing tensorboard with tensorboardX, or use a different tensorboard version.
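
Because the log shows `Loss=nan` before the histogram error, one possible contributing factor is that the logged values contain no finite entries by that point. The sketch below filters non-finite values before logging. `finite_only` is a hypothetical helper, and the call-site usage in the comments is an assumption about how it could be wired into `training_report`, not code from the repo.

```python
import math


def finite_only(values):
    """Return only the finite entries of an iterable of numbers.

    NaN and +/-inf are dropped; the result may be an empty list.
    """
    return [float(v) for v in values if math.isfinite(v)]


# Hypothetical use at the call site (assumes torch is imported there):
#   vals = finite_only(scene.gaussians.get_opacity.detach().flatten().tolist())
#   if vals:  # skip logging when nothing finite is left
#       tb_writer.add_histogram("scene/opacity_histogram", torch.tensor(vals), iteration)
```

Skipping the histogram only keeps training from crashing at the logging step; it does not address whatever produces the NaN values in the first place.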

linhcentrio commented 29 minutes ago

Thank you, I'll try!