CUHK-AIM-Group / EndoGaussian

EndoGaussian: Real-time Gaussian Splatting for Dynamic Endoscopic Scene Reconstruction
https://yifliu3.github.io/EndoGaussian/
MIT License

Data structure SCARED #8

Closed: Enzo-Kerkhof closed this issue 5 months ago

Enzo-Kerkhof commented 6 months ago

Hi,

What is the specific data structure required to train on the SCARED dataset? I think I've followed what is mentioned in the readme, but even after extracting the .tar.gz I still get the following error:

File "/gpfs/home2/ekerkhof/EndoGaussian/scene/endo_loader.py", line 292, in load_meta
    frame_ids = sorted([id[:-5] for id in os.listdir(calibs_dir)])

Could you provide a little more detail on the structure the SCARED dataset needs in order to run train.py with the provided config files? Do you additionally have any guidance on running on a self-made dataset, by following either the EndoNeRF or SCARED data structure and slightly adjusting the config files?

Kind regards, Enzo

yifliu3 commented 6 months ago

Hi,

I'm wondering if you have used the code in https://github.com/EikoLoki/MICCAI_challenge_preprocess to process each frame of the SCARED dataset. After processing, the data structure should look like this:

[Screenshot (2024-02-22): expected SCARED data directory structure]

After that, you can run the code with a command like the one below:

python train.py -s data/scared/dataset_1/keyframe_1/data --port 6017 --expname scared/d1k1 --configs arguments/scared/d1k1.py
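Before training, it can help to reproduce the loader's failing line in isolation. The sketch below is a hypothetical sanity check mirroring `frame_ids = sorted([id[:-5] for id in os.listdir(calibs_dir)])` from scene/endo_loader.py; the subdirectory name `frame_data` and the `.json` extension are assumptions, so match them to your preprocessed layout:

```python
import os

def list_frame_ids(keyframe_dir, calib_subdir="frame_data", ext=".json"):
    """Mirror the loader's frame-id scan and fail with a clearer message.

    `calib_subdir` and `ext` are assumptions about the preprocessed layout,
    not names taken from the repo; adjust them to your data.
    """
    calibs_dir = os.path.join(keyframe_dir, calib_subdir)
    if not os.path.isdir(calibs_dir):
        raise FileNotFoundError(
            f"Expected calibration folder at {calibs_dir}; check that the "
            "preprocessing script was run on this keyframe."
        )
    # Strip the extension, as id[:-5] does for ".json" in the loader.
    names = [n for n in os.listdir(calibs_dir) if n.endswith(ext)]
    return sorted(n[: -len(ext)] for n in names)
```

If this raises, the training command would hit the same error, so fixing the layout first saves a failed run.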

yifliu3 commented 6 months ago

In addition, to run the code on a custom dataset, you only need to prepare images, depths, masks, camera intrinsics, and camera extrinsics for each view. For detailed information, please refer to scene/endo_loader.py and prepare the data similarly to the EndoNeRF and SCARED datasets.
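As a rough illustration of the per-view ingredients listed above, here is a hypothetical record builder; the field names and layout are illustrative, not the actual variables used in scene/endo_loader.py:

```python
import numpy as np

def make_view(image, depth, mask, fx, fy, cx, cy, w2c):
    """Bundle one view's data: RGB image, depth map, binary mask,
    pinhole intrinsics (fx, fy, cx, cy), and 4x4 world-to-camera pose.

    Illustrative only -- the repo's loader builds its own structures.
    """
    K = np.array([[fx, 0.0, cx],
                  [0.0, fy, cy],
                  [0.0, 0.0, 1.0]], dtype=np.float64)
    # Image, depth, and mask must share the same spatial resolution.
    assert image.shape[:2] == depth.shape == mask.shape
    assert w2c.shape == (4, 4)
    return {"image": image, "depth": depth, "mask": mask, "K": K, "w2c": w2c}
```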

Enzo-Kerkhof commented 6 months ago

You are absolutely right, sorry for missing this part of the data preparation.

One last question, not relevant to this issue: With the EndoNeRF and EndoSurf repos, all point clouds are generated after rendering. With the outputs of the renders in this repo, the point clouds can still be created based on the rendered frames and depth images. Right? I however do not fully understand the point cloud that is in the last iteration folder, this is the Gaussian splat ply that can be used to render novel views using the deformation .pth files which are saved next to it? So this point cloud does not necessarily represent a moment in time but the entire scene it was trained on? Please correct me if I'm wrong.

Best, Enzo

yifliu3 commented 6 months ago

Yes, the point clouds can be created from the rendered frames and depths. You can use the provided render.py and set the keyword reconstruct=True in the render_set function to reconstruct the point cloud of each frame.
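For intuition, per-frame reconstruction amounts to back-projecting each depth pixel through the pinhole intrinsics. This is a minimal sketch of that idea, not the repo's actual implementation in render.py:

```python
import numpy as np

def depth_to_points(depth, K, mask=None):
    """Back-project a depth map into camera-space 3D points.

    depth: (H, W) depth in the same units the camera was calibrated in.
    K: 3x3 pinhole intrinsics. mask: optional (H, W) binary validity mask.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    if mask is not None:
        pts = pts[mask.reshape(-1) > 0]  # keep only valid pixels
    return pts
```

Colors can then be attached by flattening the rendered RGB frame with the same mask, and the result saved as a .ply for each timestamp.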

Additionally, point_cloud.ply in fact saves the parameters of the canonical Gaussians, which can be treated as the 3D Gaussians at the reference or initial time (so it does represent a moment). And deformation.pth contains the model parameters that describe the deformation of each Gaussian at a given timestamp. To render images for a certain time, we feed the time to the deformation field to predict the deformation of each Gaussian, then add the deformation to the canonical Gaussians, getting the deformed Gaussians at that time. Hope this answers your questions.
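The canonical-plus-deformation idea can be sketched in a few lines. Here `deform_net` stands in for the network stored in deformation.pth; below it is just a toy function of (positions, time), not the repo's actual model, and only the Gaussian means are deformed for simplicity:

```python
import numpy as np

def positions_at_time(canonical_xyz, deform_net, t):
    """Deform canonical Gaussian centers to a given timestamp.

    canonical_xyz: (N, 3) canonical Gaussian means (from point_cloud.ply).
    deform_net: callable predicting a per-Gaussian offset for time t
                (a stand-in for the network saved in deformation.pth).
    """
    delta = deform_net(canonical_xyz, t)  # predicted per-Gaussian offset
    return canonical_xyz + delta          # deformed means at time t
```

Rendering at time t then rasterizes these deformed Gaussians instead of the canonical ones.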

Best, Yifan

Enzo-Kerkhof commented 6 months ago

Great! I got the point clouds for the cutting scene, but the pulling scene gave me an error:

Found poses_bounds.py and extra marks with EndoNeRf [22/02 17:10:37]
self.cameras_extent is  0.0 [22/02 17:10:37]
Loading Training Cameras [22/02 17:10:37]
Loading Test Cameras [22/02 17:10:37]
Loading Video Cameras [22/02 17:10:37]
Voxel Plane: set aabb= Parameter containing:
tensor([[  56.9742,   42.4396,  -38.0000],
        [ -57.2745,  -41.5493, -118.0000]], requires_grad=True) [22/02 17:10:37]
loading model from existsoutput/endonerf/pulling/point_cloud/iteration_3000 [22/02 17:10:38]
Traceback (most recent call last):
  File "render.py", line 200, in <module>
    render_sets(model.extract(args), hyperparam.extract(args), args.iteration, pipeline.extract(args), args.skip_train, args.skip_test, args.skip_video, args.reconstruct)
  File "render.py", line 131, in render_sets
    scene = Scene(dataset, gaussians, load_iteration=iteration, shuffle=False, load_coarse=dataset.no_fine)
  File "/gpfs/home2/ekerkhof/EndoGaussian/scene/__init__.py", line 73, in __init__
    iteration_str,
  File "/gpfs/home2/ekerkhof/EndoGaussian/scene/gaussian_model.py", line 233, in load_model
    self._deformation.load_state_dict(weight_dict)
  File "/home/ekerkhof/.conda/envs/EndoGaussian/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1672, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for deform_network:
    size mismatch for deformation_net.grid.grids.0.0: copying a param with shape torch.Size([1, 64, 64, 64]) from checkpoint, the shape in current model is torch.Size([1, 32, 32, 32]).
    size mismatch for deformation_net.grid.grids.0.1: copying a param with shape torch.Size([1, 64, 64, 64]) from checkpoint, the shape in current model is torch.Size([1, 32, 32, 32]).
    size mismatch for deformation_net.grid.grids.0.2: copying a param with shape torch.Size([1, 64, 100, 64]) from checkpoint, the shape in current model is torch.Size([1, 32, 50, 32]).
    size mismatch for deformation_net.grid.grids.0.3: copying a param with shape torch.Size([1, 64, 64, 64]) from checkpoint, the shape in current model is torch.Size([1, 32, 32, 32]).
    size mismatch for deformation_net.grid.grids.0.4: copying a param with shape torch.Size([1, 64, 100, 64]) from checkpoint, the shape in current model is torch.Size([1, 32, 50, 32]).
    size mismatch for deformation_net.grid.grids.0.5: copying a param with shape torch.Size([1, 64, 100, 64]) from checkpoint, the shape in current model is torch.Size([1, 32, 50, 32]).
    size mismatch for deformation_net.grid.grids.1.0: copying a param with shape torch.Size([1, 64, 128, 128]) from checkpoint, the shape in current model is torch.Size([1, 32, 64, 64]).
    size mismatch for deformation_net.grid.grids.1.1: copying a param with shape torch.Size([1, 64, 128, 128]) from checkpoint, the shape in current model is torch.Size([1, 32, 64, 64]).
    size mismatch for deformation_net.grid.grids.1.2: copying a param with shape torch.Size([1, 64, 100, 128]) from checkpoint, the shape in current model is torch.Size([1, 32, 50, 64]).
    size mismatch for deformation_net.grid.grids.1.3: copying a param with shape torch.Size([1, 64, 128, 128]) from checkpoint, the shape in current model is torch.Size([1, 32, 64, 64]).
    size mismatch for deformation_net.grid.grids.1.4: copying a param with shape torch.Size([1, 64, 100, 128]) from checkpoint, the shape in current model is torch.Size([1, 32, 50, 64]).
    size mismatch for deformation_net.grid.grids.1.5: copying a param with shape torch.Size([1, 64, 100, 128]) from checkpoint, the shape in current model is torch.Size([1, 32, 50, 64]).
    size mismatch for deformation_net.grid.grids.2.0: copying a param with shape torch.Size([1, 64, 256, 256]) from checkpoint, the shape in current model is torch.Size([1, 32, 128, 128]).
    size mismatch for deformation_net.grid.grids.2.1: copying a param with shape torch.Size([1, 64, 256, 256]) from checkpoint, the shape in current model is torch.Size([1, 32, 128, 128]).
    size mismatch for deformation_net.grid.grids.2.2: copying a param with shape torch.Size([1, 64, 100, 256]) from checkpoint, the shape in current model is torch.Size([1, 32, 50, 128]).
    size mismatch for deformation_net.grid.grids.2.3: copying a param with shape torch.Size([1, 64, 256, 256]) from checkpoint, the shape in current model is torch.Size([1, 32, 128, 128]).
    size mismatch for deformation_net.grid.grids.2.4: copying a param with shape torch.Size([1, 64, 100, 256]) from checkpoint, the shape in current model is torch.Size([1, 32, 50, 128]).
    size mismatch for deformation_net.grid.grids.2.5: copying a param with shape torch.Size([1, 64, 100, 256]) from checkpoint, the shape in current model is torch.Size([1, 32, 50, 128]).
    size mismatch for deformation_net.grid.grids.3.0: copying a param with shape torch.Size([1, 64, 512, 512]) from checkpoint, the shape in current model is torch.Size([1, 32, 256, 256]).
    size mismatch for deformation_net.grid.grids.3.1: copying a param with shape torch.Size([1, 64, 512, 512]) from checkpoint, the shape in current model is torch.Size([1, 32, 256, 256]).
    size mismatch for deformation_net.grid.grids.3.2: copying a param with shape torch.Size([1, 64, 100, 512]) from checkpoint, the shape in current model is torch.Size([1, 32, 50, 256]).
    size mismatch for deformation_net.grid.grids.3.3: copying a param with shape torch.Size([1, 64, 512, 512]) from checkpoint, the shape in current model is torch.Size([1, 32, 256, 256]).
    size mismatch for deformation_net.grid.grids.3.4: copying a param with shape torch.Size([1, 64, 100, 512]) from checkpoint, the shape in current model is torch.Size([1, 32, 50, 256]).
    size mismatch for deformation_net.grid.grids.3.5: copying a param with shape torch.Size([1, 64, 100, 512]) from checkpoint, the shape in current model is torch.Size([1, 32, 50, 256]).
    size mismatch for deformation_net.feature_out.0.weight: copying a param with shape torch.Size([32, 256]) from checkpoint, the shape in current model is torch.Size([32, 128]).
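Every mismatch above follows the same pattern: the checkpoint's grids are twice the size of the current model's (64-channel grids vs 32, and each spatial resolution halved), which suggests the config passed to render.py differs from the one used at training time. A framework-agnostic way to surface such mismatches before load_state_dict fails is sketched below, using plain dicts of shape tuples (a hypothetical helper, not part of the repo):

```python
def shape_mismatches(ckpt_shapes, model_shapes):
    """Return {param_name: (checkpoint_shape, model_shape)} for every
    parameter present in both dicts whose shapes disagree.

    With torch, the inputs could be built as
    {k: tuple(v.shape) for k, v in state_dict.items()}.
    """
    return {
        k: (ckpt_shapes[k], model_shapes[k])
        for k in ckpt_shapes
        if k in model_shapes and ckpt_shapes[k] != model_shapes[k]
    }
```

If this returns anything, re-check that the --configs argument matches the training run; as noted below, retraining with a consistent config also resolves it.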

I'll try to retrain pulling and reconstructing again.

Enzo-Kerkhof commented 6 months ago

> I'll try to retrain pulling and reconstructing again.

This worked.

yifliu3 commented 6 months ago

Great!