momentum-robotics-lab / deformgs


How to create custom datasets for training #3

Open JHXion9 opened 1 month ago

JHXion9 commented 1 month ago

Nice work! But I still have some questions. I can't figure out how to create my own dataset for training just by reading the readme.md, since it isn't clear what data a training dataset needs. I have 16 cameras, each containing 150 frames. What should I do to make my dataset suitable for training? Here is my data layout in 4DGS (screenshot attached). Best!

Duisterhof commented 1 month ago

Hello,

What is in the poses_bounds_multipleview.npy? We have provided code to convert from the robo360 format to the json format. Essentially we use a normal NeRF json, but add the intrinsics to each frame to accommodate multiple cameras. Can you share your dataset?
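To make the "intrinsics in each frame" idea concrete, here is a minimal sketch of what such a transforms json could look like. The field names (`fl_x`, `fl_y`, `cx`, `cy`, `time`) follow the common NeRF/Nerfstudio convention and are an assumption, not necessarily the repo's exact schema:

```python
# Sketch of a NeRF-style transforms.json where every frame carries its own
# intrinsics, so 16 different cameras can live in one file.
# Field names are assumptions based on common NeRF conventions.
import json

def make_frame(file_path, c2w, fl_x, fl_y, cx, cy, time):
    """One frame entry: image path, 4x4 camera-to-world pose (nested lists),
    per-camera intrinsics in pixels, and a normalized timestamp."""
    return {
        "file_path": file_path,
        "transform_matrix": c2w,
        "fl_x": fl_x, "fl_y": fl_y,  # focal lengths (px)
        "cx": cx, "cy": cy,          # principal point (px)
        "time": time,                # timestamp in [0, 1]
    }

# Identity pose as a stand-in for a real camera-to-world matrix.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
transforms = {
    "w": 1280, "h": 720,
    "frames": [
        make_frame("cam00/0000.png", identity, 800.0, 800.0, 640.0, 360.0, 0.0),
    ],
}
print(json.dumps(transforms, indent=2)[:60])
```

The key point is that intrinsics live inside each `frames` entry rather than once at the top level, so cameras with different focal lengths can coexist.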

JHXion9 commented 1 month ago

Thank you for your prompt reply!

poses_bounds_multipleview.npy contains 16 6-DoF camera poses and the near/far depth bounds for the scene, obtained from LLFF's imgs2poses.py https://github.com/Fyusion/LLFF/blob/master/imgs2poses.py.
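For reference, the LLFF `poses_bounds.npy` layout can be sanity-checked like this: each row is 17 floats, a flattened 3x5 matrix (rotation | translation | an [H, W, focal] column) followed by the near/far bounds. Synthetic zeros stand in for a real file here:

```python
# Sanity-check the shape conventions of an LLFF-style poses_bounds.npy.
import numpy as np

poses_bounds = np.zeros((16, 17))           # 16 cameras, as in this setup
poses = poses_bounds[:, :15].reshape(-1, 3, 5)
bounds = poses_bounds[:, 15:]               # near/far depth bounds per camera

hwf = poses[:, :, 4]                        # [height, width, focal] column
print(poses.shape, bounds.shape)            # (16, 3, 5) (16, 2)
```

Loading the real file with `np.load` and checking these shapes is a quick way to confirm the conversion produced one row per camera.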

Using the robo360 dataset construction method, I successfully built my dataset. Details are shown below (screenshot attached).

There is a small error: transforms_test.json contains no frame information, while transforms_val.json contains information for 300 frames, so I have to swap the two filenames to train successfully. Also, during training I noticed that the point cloud could not fully fit my data, as shown in the picture below; this is the result after training for 30000 iterations.

(screenshot attached)
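The swapped-split symptom above is easy to catch before training by counting the `"frames"` entries in each split file. A small sketch, using synthetic files in place of the real transforms_test.json / transforms_val.json:

```python
# Count frames in each transforms_*.json split to catch an empty test split
# (the swapped-filename symptom described above). Paths are illustrative.
import json, os, tempfile

def count_frames(path):
    """Return the number of entries under "frames" in a transforms json."""
    with open(path) as f:
        return len(json.load(f).get("frames", []))

# Demo: write synthetic split files mimicking the reported situation.
tmp = tempfile.mkdtemp()
for name, n in [("transforms_test.json", 0), ("transforms_val.json", 300)]:
    with open(os.path.join(tmp, name), "w") as f:
        json.dump({"frames": [{"file_path": f"r_{i}.png"} for i in range(n)]}, f)

counts = {name: count_frames(os.path.join(tmp, name))
          for name in ("transforms_test.json", "transforms_val.json")}
print(counts)  # an empty test split signals the swap
```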

This looks like a camera pose error. My poses_bounds.npy is constructed with the following commands (note: the original paste had `colmaptmp` in two paths, which I've normalized to `colmap_tmp`):

```shell
workdir=$1
python scripts/extractimages.py multipleview/$workdir

colmap feature_extractor --database_path ./colmap_tmp/database.db --image_path ./colmap_tmp/images --SiftExtraction.max_image_size 4096 --SiftExtraction.max_num_features 16384 --SiftExtraction.estimate_affine_shape 1 --SiftExtraction.domain_size_pooling 1 --ImageReader.camera_model SIMPLE_PINHOLE
colmap exhaustive_matcher --database_path ./colmap_tmp/database.db
mkdir ./colmap_tmp/sparse
colmap mapper --database_path ./colmap_tmp/database.db --image_path ./colmap_tmp/images --output_path ./colmap_tmp/sparse
mkdir ./data/multipleview/$workdir/sparse

cp -r ./colmap_tmp/sparse/0/* ./data/multipleview/$workdir/sparse

mkdir ./colmap_tmp/dense
colmap image_undistorter --image_path ./colmap_tmp/images --input_path ./colmap_tmp/sparse/0 --output_path ./colmap_tmp/dense --output_type COLMAP
colmap patch_match_stereo --workspace_path ./colmap_tmp/dense --workspace_format COLMAP --PatchMatchStereo.geom_consistency true
colmap stereo_fusion --workspace_path ./colmap_tmp/dense --workspace_format COLMAP --input_type geometric --output_path ./colmap_tmp/dense/fused.ply

python scripts/downsample_point.py ./colmap_tmp/dense/fused.ply ./data/multipleview/$workdir/points3D_multipleview.ply

python LLFF/imgs2poses.py ./colmap_tmp/

cp ./colmap_tmp/poses_bounds.npy ./data/multipleview/$workdir/poses_bounds.npy
```

I generate the json file with the following command (screenshot attached).

My dataset is here:

Do you think something is wrong with my dataset? Here are my training instructions (screenshot attached).

Best!

Duisterhof commented 1 month ago

Thanks for the note on the json files! I'll fix that. It's a bit hard to say what the issue is in your case: it could indeed be camera pose error (accuracy, but the scale also has to be bounded, since we initialize Gaussians in a cube of fixed size), or the scene may be underconstrained by 16 cameras. You don't initialize with a point cloud here, right? A good debugging step would be to take the frames and calibration at t=0 and feed them into an off-the-shelf Gaussian splatting module. I like gsplat from Nerfstudio.
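One way to set up that debugging step is to collect the first frame from every camera folder into a single static-scene directory that an off-the-shelf Gaussian splatting pipeline can consume. The directory layout (`cam_XX/0000.png`) is an assumption; adjust to your naming:

```python
# Gather the t=0 frame of each camera into one flat directory, suitable for
# feeding a static Gaussian splatting pipeline as a pose-quality check.
# The cam_XX/0000.png layout is an assumed convention, not the repo's.
import os, shutil, tempfile

def extract_t0(src_root, dst_root, frame_name="0000.png"):
    """Copy frame 0 of each camera folder into dst_root as <cam>.png."""
    os.makedirs(dst_root, exist_ok=True)
    copied = []
    for cam in sorted(os.listdir(src_root)):
        frame = os.path.join(src_root, cam, frame_name)
        if os.path.isfile(frame):
            dst = os.path.join(dst_root, f"{cam}.png")
            shutil.copy(frame, dst)
            copied.append(dst)
    return copied

# Demo on a synthetic tree with 16 camera folders.
src = tempfile.mkdtemp()
for i in range(16):
    os.makedirs(os.path.join(src, f"cam_{i:02d}"))
    open(os.path.join(src, f"cam_{i:02d}", "0000.png"), "wb").close()
out = extract_t0(src, tempfile.mkdtemp())
print(len(out))  # 16
```

If a static splat on these 16 views already fails to converge, the problem is in the poses or scale rather than the deformation model.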

Another note: 16 cameras is pretty challenging for COLMAP in general. Have you considered using something like MASt3R for initialization?