General process:
1) Generate the sparse files by converting ActorsHQ to COLMAP format with the toolbox script export_colmap.py.
2) Generate the image files by extracting a single frame of the ActorsHQ dataset at 4x scale.
3) Evaluate the new dataset using 3D Gaussian Splatting.
A possible reason for the error below is that the initial point cloud generated from the COLMAP-format data appears to contain 0 points. Is that because the converted camera poses are used, so that no good initial image pair is found?
Or is it because the 4x-scale data is used?
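To confirm the "0 points" suspicion before training, the point count can be read directly from points3D.bin. This is a minimal sketch, assuming the standard COLMAP binary layout in which the file begins with a little-endian uint64 holding the number of points; the helper name count_points3d and the path used in the example are hypothetical.

```python
import struct
from pathlib import Path


def count_points3d(path):
    """Return the number of 3D points declared in a COLMAP points3D.bin file."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8:
        raise ValueError(f"{path} is too short to be a points3D.bin file")
    # Assumption: the file starts with a little-endian uint64 point count.
    (num_points,) = struct.unpack("<Q", header)
    return num_points


if __name__ == "__main__":
    # Hypothetical location; adjust to your dataset root.
    path = Path("sparse/0/points3D.bin")
    if path.exists():
        print(f"{path}: {count_points3d(path)} points")
    else:
        print(f"{path} not found")
```

If this reports 0, the optimizer would start with no Gaussians at all, which would be consistent with the zero-sized gradient shapes in the error message.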
Dataset structure:
|---images
|   |---
|   |---
|   |---...
|---sparse
    |---0
        |---cameras.bin
        |---images.bin
        |---points3D.bin
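The layout above can be checked programmatically before launching training. The following is a rough sketch, not part of the 3D Gaussian Splatting code base; the function name check_colmap_layout is hypothetical, and it only verifies that the files exist, not that their contents are valid.

```python
from pathlib import Path

REQUIRED_SPARSE_FILES = ("cameras.bin", "images.bin", "points3D.bin")


def check_colmap_layout(root):
    """Return a list of problems found in a COLMAP-style dataset folder."""
    root = Path(root)
    problems = []

    # The images folder must exist and contain at least one file.
    images_dir = root / "images"
    if not images_dir.is_dir():
        problems.append("missing images/ directory")
    elif not any(p.is_file() for p in images_dir.iterdir()):
        problems.append("images/ contains no files")

    # The sparse model is expected under sparse/0.
    sparse_dir = root / "sparse" / "0"
    for name in REQUIRED_SPARSE_FILES:
        if not (sparse_dir / name).is_file():
            problems.append(f"missing sparse/0/{name}")

    return problems
```

An empty list means the folder matches the expected structure; any returned strings describe what is missing.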
Error:
Training progress: 0%| | 0/30000 [00:00, ?it/s]
Traceback (most recent call last):
  File "train.py", line 219, in <module>
    training(lp.extract(args), op.extract(args), pp.extract(args), args.test_iterations, args.save_iterations, args.checkpoint_iterations, args.start_checkpoint, args.debug_from)
  File "train.py", line 93, in training
    loss.backward()
  File "/home/test/anaconda3/envs/gaussian38-env/lib/python3.8/site-packages/torch/_tensor.py", line 488, in backward
    torch.autograd.backward(
  File "/home/test/anaconda3/envs/gaussian38-env/lib/python3.8/site-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: Function _RasterizeGaussiansBackward returned an invalid gradient at index 2 - got [0, 0, 3] but expected shape compatible with [0, 16, 3]
If COLMAP is used to estimate the camera poses instead, a rough result can be obtained for the same images, but the point cloud behind the person is very unreasonable.