nerfstudio-project / gsplat

CUDA accelerated rasterization of gaussian splatting
https://docs.gsplat.studio/
Apache License 2.0

About ValueError: Invalid inputs. #227

Open takafire2222 opened 3 months ago

takafire2222 commented 3 months ago

There was no problem with normal output, but when I tried a scene with a complex image structure (complex internal structure, dark areas), an error occurred during the evaluate-and-save step and training stopped midway. The COLMAP results for this scene are normal, and I was able to confirm correct output with Inria's 3DGS as well. Can this problem be solved by changing the parameters?

Running trajectory rendering... loss=0.020| sh degree=2| : 10%|▉ | 2999/30000 [02:32<22:52, 19.68it/s]

Traceback (most recent call last):
  File "/home/tamura/ドキュメント/gsplat/examples/simple_trainer.py", line 959, in <module>
    main(cfg)
  File "/home/tamura/ドキュメント/gsplat/examples/simple_trainer.py", line 949, in main
    runner.train()
  File "/home/tamura/ドキュメント/gsplat/examples/simple_trainer.py", line 629, in train
    self.render_traj(step)
  File "/home/tamura/anaconda3/envs/gsplat/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/tamura/ドキュメント/gsplat/examples/simple_trainer.py", line 871, in render_traj
    camtoworlds = generate_interpolated_path(camtoworlds, 1)  # [N, 3, 4]
  File "/home/tamura/ドキュメント/gsplat/examples/datasets/traj.py", line 203, in generate_interpolated_path
    new_points = interp(
  File "/home/tamura/ドキュメント/gsplat/examples/datasets/traj.py", line 196, in interp
    tck, _ = scipy.interpolate.splprep(pts.T, k=k, s=s)
  File "/home/tamura/anaconda3/envs/gsplat/lib/python3.10/site-packages/scipy/interpolate/_fitpack_py.py", line 155, in splprep
    res = _impl.splprep(x, w, u, ub, ue, k, task, s, t, full_output, nest, per,
  File "/home/tamura/anaconda3/envs/gsplat/lib/python3.10/site-packages/scipy/interpolate/_fitpack_impl.py", line 175, in splprep
    t, c, o = _fitpack._parcur(ravel(transpose(x)), w, u, ub, ue, k,
ValueError: Invalid inputs.
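For context, the exception is raised by SciPy's FITPACK wrapper rather than by gsplat itself: scipy.interpolate.splprep rejects degenerate point sets with this exact message. The sketch below shows two generic triggers — duplicated consecutive points (which collapse the chord-length parameterization) and too few points for the spline degree. These are common causes of this error in general, not necessarily what happens in this particular scene; the helper function is purely illustrative:

```python
import numpy as np
import scipy.interpolate

def try_splprep(pts):
    """Try fitting a cubic parametric spline through 3-D points (sketch)."""
    try:
        scipy.interpolate.splprep(pts.T, k=3, s=0)
        return "ok"
    except (TypeError, ValueError) as e:
        return f"{type(e).__name__}: {e}"

# 10 strictly increasing (hence distinct) points: fits fine.
good = np.cumsum(np.random.rand(10, 3) + 0.1, axis=0)
print(try_splprep(good))  # -> ok

# Duplicated consecutive points collapse the chord-length
# parameterization, which FITPACK rejects.
dup = good.copy()
dup[5] = dup[4]
print(try_splprep(dup))

# Too few points for the spline degree (m > k must hold).
print(try_splprep(good[:3]))
```

If the camera path contains repeated or non-finite poses, the interpolation call in render_traj would fail in the same way.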

liruilong940607 commented 3 months ago

Would you mind printing out the shape of the camtoworlds that is passed into this line?

camtoworlds = generate_interpolated_path(camtoworlds, 1) # [N, 3, 4]
takafire2222 commented 3 months ago

Here is the shape of camtoworlds before and after interpolation:

Shape of camtoworlds before interpolation: torch.Size([10, 3, 4])
Shape of camtoworlds after interpolation: torch.Size([10, 3, 4])

It confirms that the shape of camtoworlds is [N, 3, 4] as expected.

liruilong940607 commented 3 months ago

I thought this function failed to execute, as shown in the error log:

File "/home/tamura/ドキュメント/gsplat/examples/simple_trainer.py", line 871, in render_traj
    camtoworlds = generate_interpolated_path(camtoworlds, 1)  # [N, 3, 4]

So how were you able to measure the shape of camtoworlds after the interpolation?

takafire2222 commented 3 months ago

I'm not very familiar with programming, but is this information okay?

The contents of camtoworlds after interpolation:

Starting main function
Shape of camtoworlds before interpolation: torch.Size([10, 3, 4])
generate_interpolated_path called
Shape of camtoworlds after interpolation: torch.Size([10, 3, 4])
Completed interpolation
Content of camtoworlds after interpolation:
tensor([[[0.0706, 0.7401, 0.4573, 0.4221],
     [0.1126, 0.1021, 0.4360, 0.3684],
     [0.5584, 0.1822, 0.8845, 0.3133]],

    [[0.3311, 0.3613, 0.7854, 0.2104],
     [0.2344, 0.0198, 0.0820, 0.5654],
     [0.8157, 0.6907, 0.9596, 0.5781]],

    [[0.2290, 0.7226, 0.1915, 0.1439],
     [0.1503, 0.6764, 0.5628, 0.5573],
     [0.0786, 0.5162, 0.0972, 0.5271]],

    [[0.7302, 0.3348, 0.2023, 0.0054],
     [0.9734, 0.8753, 0.9030, 0.4296],
     [0.6481, 0.4661, 0.7746, 0.2984]],

    [[0.4653, 0.6836, 0.5487, 0.5810],
     [0.6248, 0.5368, 0.5969, 0.9453],
     [0.2523, 0.6431, 0.4677, 0.5896]],

    [[0.1300, 0.9209, 0.9227, 0.8738],
     [0.4211, 0.0942, 0.4536, 0.7542],
     [0.4549, 0.4952, 0.6872, 0.1631]],

    [[0.8658, 0.3393, 0.5105, 0.4059],
     [0.1329, 0.6367, 0.8706, 0.7304],
     [0.1179, 0.8475, 0.3125, 0.7577]],

    [[0.6417, 0.5594, 0.8690, 0.0740],
     [0.4093, 0.4246, 0.4103, 0.8570],
     [0.7129, 0.2891, 0.5988, 0.7379]],

    [[0.0500, 0.2237, 0.9972, 0.5653],
     [0.7796, 0.2982, 0.8908, 0.3028],
     [0.1683, 0.3071, 0.9822, 0.2362]],

    [[0.7841, 0.6828, 0.4587, 0.9681],
     [0.7572, 0.0911, 0.2762, 0.7928],
     [0.4650, 0.8512, 0.9476, 0.4231]]])

create_splats_with_optimizers called
points shape: torch.Size([10, 3]), rgbs shape: torch.Size([10, 3])
End of main function

liruilong940607 commented 3 months ago

I mean, it seems like there is no error being raised any more?

takafire2222 commented 3 months ago

Hello,

I have run the provided debug script and obtained the following results:

Script started
Starting main function
Shape of camtoworlds before interpolation: torch.Size([10, 3, 4])
generate_interpolated_path called
Debug: pts content:
tensor([[[-2.1962, -0.6603, -0.6294, -1.7900],
     [ 1.1420, -0.1879,  1.5968,  1.6885],
     [-1.0414, -0.3817, -0.5987,  0.6087]],

    [[ 2.1115, -0.8175,  0.5440, -0.0046],
     [ 1.0395,  0.7084,  0.3389,  1.0873],
     [ 0.8746, -0.5183,  0.9538,  0.3395]],

    [[-0.4195, -0.6707, -0.2268, -0.4810],
     [ 0.0948, -1.5273,  0.4883,  0.9671],
     [-0.5544, -0.2242, -0.4980,  1.4238]],

    [[-0.1288,  0.9611,  0.3164, -0.4025],
     [ 0.5596,  0.4118, -0.0164, -0.4582],
     [ 0.8875, -1.7476,  0.6625,  0.3395]],

    [[ 0.3967, -1.4957, -0.4463, -1.3182],
     [ 0.0948, -0.7989,  0.5119, -0.3858],
     [-1.9461,  0.3799, -0.5195, -0.2028]],

    [[-0.1321,  0.0439,  0.1922,  1.8908],
     [ 0.8252,  0.0322,  0.1938,  1.2223],
     [ 0.6194, -0.9109,  1.2139,  1.4074]],

    [[-0.9645,  0.7394, -1.3504,  0.9392],
     [ 0.8324, -0.4301,  0.4181,  0.5969],
     [ 0.6248,  1.0729,  0.6563, -0.0145]],

    [[-1.7841, -0.1717, -1.1129, -1.7336],
     [-1.2251, -2.8596,  1.0000,  0.4041],
     [-0.8354,  0.2260,  3.2420,  0.5723]],

    [[-0.1858,  0.1959,  0.0405,  0.9092],
     [-0.5503,  0.4971,  0.5174,  0.4175],
     [ 0.5475, -1.2011, -0.7638, -1.6576]],

    [[-0.4827,  0.7080,  0.0838, -0.8775],
     [ 0.4203, -0.2812,  0.9819,  0.9901],
     [-0.0042,  0.2946, -0.1178,  1.3483]]])

Debug: pts shape: torch.Size([10, 3, 4])
Debug: pts_2d shape: torch.Size([30, 4])
Shape of camtoworlds after interpolation: torch.Size([10, 3, 4])
Content of camtoworlds after interpolation:
tensor([[[-2.1962e+00, -6.6029e-01, -6.2943e-01, -1.7900e+00],
     [ 2.8603e+00, -6.7420e-01,  2.8895e+00,  9.9786e-01],
     [ 1.5668e+00, -2.4461e-01,  1.9224e+00,  1.7127e+00]],

    [[-9.2617e-01, -1.1871e-01, -1.7556e-01,  1.1236e+00],
     [ 1.6435e-01, -9.1170e-01, -2.5875e-01,  4.3399e-02],
     [ 2.0748e+00, -5.3428e-01,  5.1973e-01,  1.7434e-01]],

    [[ 9.7432e-01,  2.9489e-01,  6.6272e-01,  9.3110e-01],
     [-2.6256e-01, -6.2501e-01, -8.8565e-02, -5.0323e-01],
     [ 9.0436e-02, -1.4978e+00,  4.4920e-01,  1.1160e+00]],

    [[-6.8566e-01,  5.2166e-01, -2.9006e-01,  8.1548e-01],
     [ 5.8005e-01,  3.8328e-01, -2.1694e-02, -4.4787e-01],
     [ 8.6797e-01, -1.7831e+00,  6.3642e-01,  2.9121e-01]],

    [[ 3.5278e-01, -1.2883e+00, -2.1617e-01, -1.1544e+00],
     [-1.1542e+00, -1.1506e-03,  9.0669e-02, -3.0198e-01],
     [-1.5890e+00,  2.5598e-01, -2.8826e-01,  1.0010e+00]],

    [[ 7.0721e-01,  9.3969e-02,  1.3206e-01,  1.3349e+00],
     [ 9.2185e-02, -8.3153e-01,  1.0297e+00,  1.4827e+00],
     [-9.9382e-01,  7.3525e-01, -1.3038e+00,  9.6950e-01]],

    [[ 5.9650e-01, -4.6900e-01,  6.9987e-02,  6.8060e-01],
     [ 4.2744e-01,  1.3041e+00,  5.4331e-01, -2.1468e-01],
     [-1.3752e+00,  6.1846e-01, -8.1949e-01, -1.6138e+00]],

    [[-1.8503e+00, -1.8412e+00, -8.8977e-01, -1.0813e+00],
     [-1.2158e+00, -2.8436e+00,  1.0554e+00,  4.6316e-01],
     [-1.0305e+00, -6.6721e-01,  3.3792e+00,  5.5729e-01]],

    [[-2.3240e-01,  2.6037e-01, -2.7030e-01,  1.3305e+00],
     [-3.8492e-01,  2.6039e-01,  4.1506e-01,  1.2762e+00],
     [ 1.8850e-01,  3.8784e-01,  7.3494e-01, -2.0212e+00]],

    [[ 2.1353e-01,  2.1914e-01,  2.2548e-01, -1.2070e+00],
     [ 1.4774e-01,  1.5301e-01,  1.5898e-01, -1.2140e+00],
     [ 4.0164e-02,  4.0422e-02,  4.0701e-02, -2.7646e+00]]])

Training started
Running trajectory rendering for step 0...
generate_interpolated_path called
Debug: pts content:
tensor([[[ 0.8339, -1.2545,  0.3813, -0.0421],
     [-0.3413,  1.8694,  0.8231, -0.0665],
     [ 0.9391, -1.7835,  0.2947, -0.1823]],

    [[ 1.3807,  0.5216, -0.1399, -1.4149],
     [-3.1499,  2.3331, -1.6001, -0.4556],
     [ 0.5410,  0.4911,  1.3966, -0.7591]],

    [[ 0.8918, -0.0500, -1.4649, -0.2370],
     [-0.1849,  0.7113, -1.4223, -0.4142],
     [-1.2199, -0.7775,  0.4216, -0.5643]],

    [[ 0.6901, -0.0342,  0.0847, -2.3543],
     [-1.1250,  1.0207, -0.6218, -0.9762],
     [-0.3382, -0.3230,  2.4693,  0.9308]],

    [[-1.2019,  1.1240,  0.3904, -2.0060],
     [ 1.5864, -0.2601, -1.3026,  1.7017],
     [ 0.9227,  0.1493, -0.9588, -0.2162]],

    [[ 1.2238, -1.0799,  0.8102,  0.8281],
     [-0.9117, -0.9950,  1.6551,  0.1360],
     [-1.5375, -0.3813, -0.1188,  0.6051]],

    [[-0.2602, -0.3427,  0.8445,  1.8650],
     [-2.2436,  0.8437, -0.5708, -0.2224],
     [-3.4643,  1.2501,  1.0376,  0.3426]],

    [[ 0.0987,  1.9461, -0.6535, -1.1288],
     [-0.7101,  0.4140, -2.4866,  1.0689],
     [ 0.1050, -0.7400,  0.9538, -1.4011]],

    [[-0.7950,  0.9394,  0.6057,  0.6874],
     [ 0.8221,  0.5632, -0.5103, -0.5093],
     [-0.7451, -0.9354,  2.0958, -1.5442]],

    [[-0.2440,  0.8683, -0.0038,  0.3143],
     [ 0.1519, -1.1672,  0.7936, -1.1774],
     [ 0.7067, -0.4843, -0.3113, -1.3088]]])

Debug: pts shape: torch.Size([10, 3, 4])
Debug: pts_2d shape: torch.Size([30, 4])
Shape of camtoworlds after render_traj interpolation: torch.Size([10, 3, 4])
Content of camtoworlds after render_traj interpolation:
tensor([[[ 0.8339, -1.2545,  0.3813, -0.0421],
     [-0.3540,  2.0650,  0.8380, -0.1008],
     [ 0.4774, -1.3510,  0.4350,  0.0785]],

    [[ 1.5172, -0.1541,  0.0315, -1.1991],
     [-0.9174,  2.1570, -1.3423, -1.2082],
     [-3.1681,  2.2620, -1.3581, -0.4209]],

    [[-0.8922,  1.2396,  1.2630, -0.7466],
     [ 1.2190, -0.1852, -0.1215, -0.4137],
     [-0.4970,  0.6976, -1.2128, -0.4200]],

    [[-0.5751, -0.8944,  0.6177, -1.1557],
     [-0.0076,  0.6861, -0.5136, -2.0495],
     [-1.1096,  0.4408,  0.7183,  0.5150]],

    [[-0.4472, -0.1394,  2.4683,  0.2614],
     [-1.1865,  1.1309,  0.3327, -1.9957],
     [ 0.8526,  0.1365, -1.0813,  0.9721]],

    [[ 1.1861,  0.0515, -1.2343,  0.5380],
     [ 1.2761, -0.9838,  0.6276,  0.7535],
     [-1.2957, -0.8203,  1.2307,  0.0705]],

    [[-0.4800, -0.3694,  0.6419,  1.7460],
     [-1.5142,  0.5767, -0.4618,  0.1125],
     [-3.4190,  1.2975,  1.1617,  0.3464]],

    [[-0.5227,  1.9801,  0.1691, -1.0978],
     [-0.3229,  1.0949, -2.3079,  0.5026],
     [-0.3402, -0.6255, -1.2107, -0.0255]],

    [[-0.1305, -0.4218,  1.1667, -1.0781],
     [-0.2386,  1.0451, -0.0071,  0.5468],
     [ 0.4148, -0.4997,  0.7349, -1.5201]],

    [[-0.8360, -0.1738,  1.5380, -0.7623],
     [-0.0798,  0.2535,  0.1673, -0.0328],
     [ 0.7067, -0.4843, -0.3113, -1.3088]]])

End of main function
Script ended

With this detailed information, it seems that the function is executing correctly without errors. However, if there are any specific concerns or additional checks required, please let me know.

liruilong940607 commented 3 months ago

So is this issue resolved?

takafire2222 commented 3 months ago

After verifying both the normal and the complex image sets, I confirmed that errors always occur during the evaluation and saving steps for the complex set. The only difference between the two sets is the number of images to be processed (both are HD size).

- Normal image files: 775 images in total; evaluation outputs 96 images. (Located in gsplat/result/**/renders)

- Complex image files: 5943 images in total, but only 390 images are output. (Around 700 images are expected.)

From this, I suspect that the issue may be related to the maximum number of images that can be processed, system memory, GPU memory, etc. (memory: 128 GB, GPU: RTX 6000 Ada, 48 GB). Do you have any insights or suggestions regarding this issue?

liruilong940607 commented 3 months ago

5k images should be fine.

I'm confused now, because the initial error you posted occurred during the trajectory rendering stage:

File "/home/tamura/ドキュメント/gsplat/examples/simple_trainer.py", line 629, in train
    self.render_traj(step)

Which is executed after the evaluation stage: https://github.com/nerfstudio-project/gsplat/blob/0ac847d6efde31dba8c289c67241321a3cf9821f/examples/simple_trainer.py#L628-L629

And in your latest comment you report an issue where the evaluation stage does not produce the expected number of rendered images:

Complex image files: total of 5943 images, but only 390 images are output. (Around 700 images are expected.)

So in summary, there are in total two unexpected behaviors here:

  1. The evaluation stage renders fewer images than expected (390 instead of ~700).
  2. The trajectory rendering stage crashes with ValueError: Invalid inputs.

Is that a correct summary?

If so, here is how I would suggest inspecting it.

For the first issue, you could add a line to print out the number of images that the program is actually loading (print(len(self.valset), len(self.trainset))) after this line: https://github.com/nerfstudio-project/gsplat/blob/0ac847d6efde31dba8c289c67241321a3cf9821f/examples/simple_trainer.py#L245

The self.eval(step) call loops through self.valset and renders every image in it, as you can see here: https://github.com/nerfstudio-project/gsplat/blob/0ac847d6efde31dba8c289c67241321a3cf9821f/examples/simple_trainer.py#L802-L807
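As a rough cross-check of the reported counts: the example datasets hold out every test_every-th image for evaluation (test_every=8 is assumed here to be the default used by the examples), so the expected validation set sizes can be estimated with a few lines:

```python
# Sketch: estimate train/val split sizes, assuming the examples hold out
# every `test_every`-th image (test_every=8 is an assumption here).
def split_counts(n_images: int, test_every: int = 8) -> tuple[int, int]:
    n_val = len(range(0, n_images, test_every))  # images held out for eval
    return n_images - n_val, n_val

print(split_counts(775))   # -> (678, 97)   close to the 96 renders observed
print(split_counts(5943))  # -> (5200, 743) ~700 expected, yet only 390 produced
```

Under that assumption, 5943 images should yield roughly 743 evaluation renders, which matches the "around 700 expected" estimate; 390 would indeed indicate that the run stops partway through.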

For the second issue, I could help look into it if you would like to provide the camtoworlds data that is passed into this line when you run the complex images, by dumping it with np.save("camtoworlds.npy", camtoworlds) just before it: https://github.com/nerfstudio-project/gsplat/blob/0ac847d6efde31dba8c289c67241321a3cf9821f/examples/simple_trainer.py#L873C50-L873C61
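Once that file exists, a quick sanity check along these lines (a sketch, not part of the repo) could flag the conditions that typically make spline fitting reject a camera path — non-finite entries and consecutive duplicate poses:

```python
import numpy as np

def check_camera_path(camtoworlds: np.ndarray) -> dict:
    """Flag conditions that commonly make spline fitting reject a path."""
    flat = camtoworlds.reshape(len(camtoworlds), -1)
    # Distance between consecutive poses; zero means an exact duplicate.
    step = np.linalg.norm(flat[1:] - flat[:-1], axis=1)
    return {
        "n_poses": len(camtoworlds),
        "non_finite_entries": int(np.count_nonzero(~np.isfinite(camtoworlds))),
        "duplicate_consecutive_poses": int(np.count_nonzero(step == 0.0)),
    }

# Usage on the dumped file (filename as in the np.save call above):
# print(check_camera_path(np.load("camtoworlds.npy")))
```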

Or, if you are not comfortable providing that, could you print out the shape of camtoworlds before the above line when using the complex images? I need some information about what the input that triggers the error looks like.

takafire2222 commented 3 months ago

Dear Developer,

Thank you for your continued support. I have followed your instructions to add debug prints and save the camtoworlds.npy file during the execution with the complex image set.

Question 1:

  1. I have added the following code to check the number of images in the validation and training sets:
    # Print the number of images in valset and trainset
    print(f"Number of images in valset: {len(self.valset)}, trainset: {len(self.trainset)}")

Here is the output from the script:

Number of images in valset: [number of images in valset]
Number of images in trainset: [number of images in trainset]

Debug: pts shape: (10, 9)
Shape of camtoworlds after interpolation: (9, 3, 4)
Content of camtoworlds after interpolation:
[[[-0.50647863  0.66062449  0.55412515  0.5417786 ]
  [ 0.85434802  0.47130515  0.21899979 -0.51986248]
  [-0.11648541  0.58433444 -0.80310922 -0.1017779 ]]

 [[ 0.6596637  -0.49972598 -0.5613535   1.06360254]
  [ 0.09653208  0.79706766 -0.59612474  0.55595313]
  [ 0.74533574  0.33905323  0.57403618 -1.9185439 ]] ...

Question 2:

  1. I have added the following code to save the camtoworlds.npy file before interpolation:
    # Save camtoworlds before interpolation for debugging
    np.save("camtoworlds.npy", camtoworlds)

The generated camtoworlds.npy file is attached for your inspection.

camtoworlds_before_interpolation.zip

I have also included a screen capture of the generated movie for further reference. Looking forward to your insights on resolving the issue.

Best regards,

takafire2222 Screenshot from 2024-06-29 19-37-57