yzslab / gaussian-splatting-lightning

A 3D Gaussian Splatting framework with various derived algorithms and an interactive web viewer

Issue in training close-range image shots #5

Closed antoinebio closed 11 months ago

antoinebio commented 12 months ago

Hi, I trained many UAV datasets where the target objects are around 20 meters away or farther, and gaussian-splatting-lightning produces awesome results, very impressive.

But I face another issue when I train ground-based imagery. It's not from a 360° action camera. The setup of the camera device is as follows:

2 cameras, 4 meters above the ground, tilted 45° from the vertical (nadir): one camera pointed forward along the pedestrian walk, the second camera pointed backward along the pedestrian walk. Lots of pictures were taken, but all focused on the ground. There is very good overlap between pictures, and the bundle block adjustment (what photogrammetry does, also called aerotriangulation) is as accurate as for my UAV dataset.

We usually play with SfM algorithms such as what Metashape offers. I am comparing that software vs the Gaussian Splatting 3D render.

I tried a first training with gaussian-splatting-lightning until the end, i.e. 30,000 iterations, but when I start the training I get the error below:

image

Does it come from the dataset and cameras that are too close together? (2 camera poses per meter; 1 forward / 1 backward)

Just to let you know, training on that same dataset worked with https://github.com/jonstephens85/gaussian-splatting-Windows. Here is what that Gaussian Splatting gives me:

image

but the render is pretty bad, not what I got with the UAV dataset.

In my render I've got too many noisy splats

image

image

and the ellipsoids display

image

and lastly the initial point cloud layout

image

yzslab commented 12 months ago

Looks like the camera intrinsics do not match your images.

This repository reads the image dimensions from COLMAP's camera database, so an error is reported here. But the one you mentioned reads the image dimensions from the image files, so it won't find this mismatch, and it produces a noisy result.
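
If it helps to see the mismatch concretely, here is a small standalone check (my own sketch, not part of this repo): it assumes a text-format COLMAP sparse model (cameras.txt / images.txt) under a hypothetical dataset path (adjust if yours is under sparse/0 or in binary format), and compares the stored dimensions against the files on disk with Pillow.

    from pathlib import Path
    from PIL import Image

    dataset = Path("path/to/dataset")            # hypothetical dataset root
    sparse, images_dir = dataset / "sparse", dataset / "images"

    # cameras.txt lines: CAMERA_ID MODEL WIDTH HEIGHT PARAMS...
    cameras = {}
    for line in (sparse / "cameras.txt").read_text().splitlines():
        if line.startswith("#") or not line.strip():
            continue
        cam_id, _model, w, h, *_params = line.split()
        cameras[cam_id] = (int(w), int(h))

    # images.txt: every other non-comment line is
    # "IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME"
    lines = [l for l in (sparse / "images.txt").read_text().splitlines()
             if not l.startswith("#")]
    for meta in lines[0::2]:
        if not meta.strip():
            continue
        fields = meta.split()
        cam_id, name = fields[8], fields[9]
        with Image.open(images_dir / name) as im:
            if im.size != cameras[cam_id]:
                print(f"{name}: file is {im.size}, COLMAP camera says {cameras[cam_id]}")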

antoinebio commented 12 months ago

Hi @yzslab, thank you for the reply. It's strange, as the COLMAP input creation comes from the same script. ERRATUM*** https://github.com/agisoft-llc/metashape-scripts/blob/master/src/export_for_gaussian_splatting.py


What do you mean by camera database? Which file should I inspect? image

With SfM software such as Metashape we can get this kind of render:

image

antoinebio commented 12 months ago

I found the issue: the script that converts Metashape camera poses to COLMAP doesn't take camera groups into account.

Is it possible to train on one dataset and then continue the training with another dataset of the same area?

yzslab commented 12 months ago

I found the issue: the script that converts Metashape camera poses to COLMAP doesn't take camera groups into account.

Is it possible to train on one dataset and then continue the training with another dataset of the same area?

Not implemented yet.

antoinebio commented 12 months ago

A very strange phenomenon with Gaussian Splatting:

if I run more iterations, the result gets worse!

Here is what I get with gaussian-splatting-lightning on that ground-based dataset, using a single camera from our 2-camera device ...

300 iterations image

1000 iterations image

3000 iterations image

and here is what Gaussian Splatting gives

100 iterations image

500 iterations image

1000 iterations image

3000 iterations image

How can I set several checkpoints in one training run, so that I don't have to launch another training? What is the argument?

tkx

yzslab commented 12 months ago

Gaussian splatting has a densification mechanism: it adds some new gaussians every 100 iterations, and sets the alpha of all gaussians close to zero every 3K iterations. Those new gaussians have not been well optimized yet, so you will get a noisy result, and find the scene nearly disappearing at the 3K-iteration mark. Use a checkpoint taken after densification has finished and all the new gaussians are well optimized, and you will get better results. By default, densification finishes at 15K iterations: https://github.com/yzslab/gaussian-splatting-lightning/blob/a0ddf55ce2497d41254fa1e00c17a3705b6dbdf9/internal/configs/optimization.py#L20
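
For illustration only, here is a rough sketch of that schedule with the default values (this is not the repo's actual code, just the behaviour described above):

    densification_interval = 100
    opacity_reset_interval = 3000
    densify_from_iter = 500
    densify_until_iter = 15_000

    def scheduled_actions(step: int) -> list:
        """Roughly what extra work happens at a given training step."""
        actions = []
        if step < densify_until_iter:
            if step > densify_from_iter and step % densification_interval == 0:
                actions.append("densify_and_prune")  # new gaussians start poorly optimized -> noise
            if step % opacity_reset_interval == 0:
                actions.append("reset_opacity")      # alphas pushed near zero, scene briefly "disappears"
        return actions

    print(scheduled_actions(3000))    # ['densify_and_prune', 'reset_opacity'] -> washed-out render
    print(scheduled_actions(20_000))  # [] -> only plain optimization after densification ends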

You can set when to save checkpoints through --save_iterations, e.g. --save_iterations '[1000,2000,5000]'.

antoinebio commented 12 months ago

OK, so I can set all those values (but I don't quite see how they influence the result; I will perform empirical tests). The default values are the following:

    densification_interval: int = 100
    densify_from_iter: int = 500
    densify_until_iter: int = 15_000

image

Use a checkpoint taken after densification has finished and all the new gaussians are well optimized

What kind of value should I monitor during the training to check that densification step? Is it explicitly reported in the terminal?

yzslab commented 12 months ago

The simplest way is to use the default optimization parameters, and use the last checkpoint.

yzslab commented 12 months ago

The simplest way is to use the default optimization parameters, and use the last checkpoint.

Leave all the parameters unchanged, including the max steps (--max_steps), except the experiment name (-n) and version (-v).

antoinebio commented 12 months ago

The simplest way is to use the default optimization parameters, and use the last checkpoint.

Leave all the parameters unchanged, including the max steps (--max_steps), except the experiment name (-n) and version (-v).

It's actually not better:

image

image

yzslab commented 12 months ago

The simplest way is to use the default optimization parameters, and use the last checkpoint.

Leave all the parameters unchanged, including the max steps (--max_steps), except the experiment name (-n) and version (-v).

It's actually not better:

image

image

Try this config file: https://github.com/yzslab/gaussian-splatting-lightning/blob/main/configs/larger_dataset.yaml

antoinebio commented 12 months ago

@yzslab

Try this config file: https://github.com/yzslab/gaussian-splatting-lightning/blob/main/configs/larger_dataset.yaml

looks better at checkpoint 6999 (iterations)

image

I will wait until the end (it takes quite long to reach 30K iterations).

How do you determine the "correct" values for all those variables?

    save_val_output: true
    max_save_val_output: 8
    gaussian:
      optimization:
        position_lr_init: 0.000016
        scaling_lr: 0.001

yzslab commented 12 months ago

How do you determine the "correct" values for all those variables?

The scene size (calculated from the camera poses, not the same as the size in the real world) affects the learning rates. Maybe the Metashape reconstruction is too large, which causes an unsuitable LR. The larger_dataset.yaml config makes it smaller.
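
To make that concrete, a rough illustration (an assumption on my side, based on how the original 3DGS code scales the position learning rate by the scene extent derived from the camera poses; the extents are made up):

    position_lr_init = 1.6e-5      # value from larger_dataset.yaml
    small_scene_extent = 10.0      # hypothetical extent for a compact scene
    large_scene_extent = 500.0     # hypothetical extent for a huge reconstruction

    # the effective step size on gaussian positions scales with the extent,
    # so an oversized scene means oversized position updates
    print(position_lr_init * small_scene_extent)   # 0.00016
    print(position_lr_init * large_scene_extent)   # 0.008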

yzslab commented 12 months ago

I will wait until the end (it takes quite long to reach 30K iterations).

The training will be slow if you are using a high image resolution.

antoinebio commented 12 months ago

I am testing 378 images taken with a 1-inch camera sensor. The images are 4800 x 3200 px.

antoinebio commented 12 months ago

The scene size (calculated from the camera poses, not the same as the size in the real world) affects the learning rates. Maybe the Metashape reconstruction is too large, which causes an unsuitable LR.

What do you mean? These ones?

  position_lr_init: 0.000016
  scaling_lr: 0.001

What are their units? Should I increase or decrease them?

yzslab commented 12 months ago

The scene size (calculated from the camera poses, not the same as the size in the real world) affects the learning rates. Maybe the Metashape reconstruction is too large, which causes an unsuitable LR.

What do you mean? These ones?

  position_lr_init: 0.000016
  scaling_lr: 0.001

What are their units? Should I increase or decrease them?

There are no units. You need to do experiments to find the suitable values for your scene.
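
For example, one way to run those experiments systematically (a rough helper of my own, not part of this repo; the dataset path is hypothetical, and the dotted option names are assumed from the config structure, so check them against python main.py fit --help first):

    import subprocess

    # try a few (position_lr_init, scaling_lr) pairs, one training run per pair
    for pos_lr, scale_lr in [(1.6e-5, 1e-3), (8e-6, 5e-4), (3.2e-5, 2e-3)]:
        subprocess.run([
            "python", "main.py", "fit",
            "--config", "configs/larger_dataset.yaml",
            "--data.path", "path/to/dataset",                                # hypothetical
            "--model.gaussian.optimization.position_lr_init", str(pos_lr),   # assumed flag name
            "--model.gaussian.optimization.scaling_lr", str(scale_lr),       # assumed flag name
            "-n", f"lr_pos{pos_lr}_scale{scale_lr}",
        ], check=True)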

antoinebio commented 12 months ago

What about the val folder?

I can see an interesting image comparison between the ground truth and what the training creates, as an evaluation of the training (?)

image

yzslab commented 12 months ago

What about the val folder?

I can see an interesting image comparison between the ground truth and what the training creates, as an evaluation of the training (?)

image

It simply selects a part of your images (1% for larger_dataset.yaml) to do the evaluation, so as to report the reconstruction quality. You can define how the images are selected here: https://github.com/yzslab/gaussian-splatting-lightning/blob/a0ddf55ce2497d41254fa1e00c17a3705b6dbdf9/configs/larger_dataset.yaml#L4-L6
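
Roughly, an eval_ratio-style split just holds out that fraction of the images for validation and trains on the rest; a minimal sketch of the idea (my own illustration, not the repo's implementation):

    import random

    def split_images(image_names, eval_ratio=0.01, seed=42):
        """Hold out `eval_ratio` of the images for validation, train on the rest."""
        rng = random.Random(seed)
        names = sorted(image_names)
        n_eval = max(1, int(len(names) * eval_ratio))
        eval_names = set(rng.sample(names, n_eval))
        train = [n for n in names if n not in eval_names]
        val = [n for n in names if n in eval_names]
        return train, val

    train, val = split_images([f"IMG_{i:04d}.jpg" for i in range(378)])
    print(len(train), len(val))   # 375 3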

yzslab commented 12 months ago

You can define how the images are selected here

https://github.com/yzslab/gaussian-splatting-lightning/blob/a0ddf55ce2497d41254fa1e00c17a3705b6dbdf9/internal/configs/dataset.py#L29-L35

antoinebio commented 12 months ago

You can define how the images are selected here

https://github.com/yzslab/gaussian-splatting-lightning/blob/a0ddf55ce2497d41254fa1e00c17a3705b6dbdf9/internal/configs/dataset.py#L29-L35

Can I use those variables as arguments and modify them when I train the model?

yzslab commented 12 months ago

You can define how the images are selected here

https://github.com/yzslab/gaussian-splatting-lightning/blob/a0ddf55ce2497d41254fa1e00c17a3705b6dbdf9/internal/configs/dataset.py#L29-L35

Can I use those variables as arguments and modify them when I train the model?

Of course, take a look at the output of this command: python main.py fit --help. You can also change these values via the yaml file.

antoinebio commented 12 months ago

The training is very long ... more than 3 hours so far with the default larger_dataset.yaml config (and it's still not complete).

https://github.com/yzslab/gaussian-splatting-lightning/blob/a0ddf55ce2497d41254fa1e00c17a3705b6dbdf9/configs/larger_dataset.yaml#L4-L6 I should play with those arguments, but how can I find their definitions?

The source is not verbose: https://github.com/yzslab/gaussian-splatting-lightning/blob/a0ddf55ce2497d41254fa1e00c17a3705b6dbdf9/internal/configs/dataset.py#L29-L35

What does that eval_ratio return?

I more or less understand the split_mode: reconstruction: the training uses all images;

it must be the default training mode.

Is it long because of my 378 pictures (4800 x 3200 px) as input?

How about experiment: withholding a test set for evaluation? Is it a mode that I can use with a checkpoint? image

yzslab commented 12 months ago

the training is very long

You need to manually resize your images (e.g., using ImageMagick: mogrify -quality 100 -resize 25% -path images_4/ images/*), then use the option --data.params.colmap.down_sample_factor 4 (4 means 25%, and images are loaded from the directory named images_4) to tell main.py to use the down-sampled images.
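
If mogrify is awkward on your platform, a Pillow-based equivalent of that command is easy to write (my own sketch, not part of this repo): it writes 25% copies of everything in images/ into images_4/.

    from pathlib import Path
    from PIL import Image

    src, dst = Path("images"), Path("images_4")
    dst.mkdir(exist_ok=True)

    for path in sorted(src.iterdir()):
        if path.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue
        with Image.open(path) as im:
            # ImageMagick rounds the target size to the nearest integer, so do the same here
            size = (round(im.width / 4), round(im.height / 4))
            resized = im.resize(size, Image.LANCZOS)
            params = {"quality": 100} if path.suffix.lower() in {".jpg", ".jpeg"} else {}
            resized.save(dst / path.name, **params)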

yzslab commented 12 months ago

Is it a mode that I can use with checkpoint ?

Pull the latest version; you can resume training from a checkpoint through the option --ckpt_path: python main.py fit --config ... --data.path ... --ckpt_path CKPT_FILE_PATH.

antoinebio commented 12 months ago

mogrify -quality 100 -resize 25% -path images_4/ images/*

It's not working. What is the exact command with ImageMagick?

image

and it doesn't work with PowerShell either

image

antoinebio commented 12 months ago

the training is very long

You need to manually resize your images (e.g., using ImageMagick: mogrify -quality 100 -resize 25% -path images_4/ images/*), then use the option --data.params.colmap.down_sample_factor 4 (4 means 25%, and images are loaded from the directory named images_4) to tell main.py to use the down-sampled images.

If I resize the source images (downscaled by 4) it means I have 1188 x 773 px, so my camera model won't match... Are you sure about that?

yzslab commented 12 months ago

If I resize the source images (downscaled by 4) it means I have 1188 x 773 px, so my camera model won't match... Are you sure about that?

The camera intrinsics will be adjusted automatically according to the value of --data.params.colmap.down_sample_factor.
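
For what it's worth, "adjusted automatically" essentially means the intrinsics are scaled by the same factor as the images; a small illustration (made-up numbers, not the repo's actual code):

    def scale_intrinsics(fx, fy, cx, cy, width, height, factor):
        """Scale pinhole intrinsics to match images down-sampled by `factor`."""
        return (fx / factor, fy / factor, cx / factor, cy / factor,
                round(width / factor), round(height / factor))

    # hypothetical full-resolution camera: 4800x3200, focal ~4700 px
    print(scale_intrinsics(4700.0, 4700.0, 2400.0, 1600.0, 4800, 3200, 4))
    # -> (1175.0, 1175.0, 600.0, 400.0, 1200, 800)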

antoinebio commented 12 months ago

If I resize the source images (downscaled by 4) it means I have 1188 x 773 px, so my camera model won't match... Are you sure about that?

The camera intrinsics will be adjusted automatically according to the value of --data.params.colmap.down_sample_factor.

nope...

image

and that issue

image

yzslab commented 12 months ago

If I resize the source images (downscaled by 4) it means I have 1188 x 773 px, so my camera model won't match... Are you sure about that?

The camera intrinsics will be adjusted automatically according to the value of --data.params.colmap.down_sample_factor.

nope...

image

and that issue

image

Rename downscaled_4 to images_4, and place it in the same location as the images directory:

<dataset_directory>
|-- sparse         # colmap sparse model
    ...
|-- images
    |-- 0.png
    |-- 1.png
    ...
|-- images_4    # down sampled images
    |-- 0.png
    |-- 1.png
    ...

Then run python main.py fit ... --data.path <dataset_directory> --data.params.colmap.down_sample_factor 4 ....

antoinebio commented 12 months ago

OK, thanks for the simple folder organisation (same as nerfstudio).

I think the acquisition strategy of that ground-based shoot is not optimal for Gaussian Splatting. I understand GS needs lots of multi-view stereo shots, not nadir shots like an old-school photogrammetry workflow.

With an orbit around a building or an object, GS works well.

One has to take care with the acquisition procedure. Gaussian Splatting might not beat SfM. Anyway, I will test with downscaled source images.

Here is what my ground-based dataset looks like (not much overlap).

image

image

Equirectangular imagery should work better with GS.

antoinebio commented 12 months ago

Not working better with the downscaled folder.

image

still the same issue

image

Both folders /images and /images_4 have the same number of photos...

yzslab commented 12 months ago

Not working better with the downscaled folder.

image

still the same issue

image

Both folders /images and /images_4 have the same number of photos...

Change torch.round to torch.floor here: https://github.com/yzslab/gaussian-splatting-lightning/blob/c569758cc22b1fef5125059a04f2d70dff0205b3/internal/dataparsers/colmap_dataparser.py#L303-L304

I will fix it later.

antoinebio commented 12 months ago

Let's train again!

image

antoinebio commented 12 months ago

Quite a bit faster and better in quality, but still not acceptable.

image

image

with image

here it is ...

image

What about the gaussian and optimization fields? What kind of values do you input here, and is it a question of dataset size? Context?

image

yzslab commented 12 months ago

Maybe you should use COLMAP SfM rather than Metashape.

antoinebio commented 12 months ago

This is what I am testing with Nerfstudio right now... but it's rather not the right way to go compared to the performance of Metashape (or RealityCapture).

As lots of other people mentioned in previous posts, COLMAP is drastically slower and less efficient than Metashape for key point detection and tie points (matching pairs).

image

Do you think a COLMAP preprocess (image alignment) and its inputs would give me better results than Metashape?

The initial dataset required for Gaussian Splatting is camera poses and tie points. I am not sure the tie points generated by COLMAP are more accurate... maybe COLMAP gives fewer points.

In my case (on that street-level project) Metashape provides 365,078 points for 378 images.

image

variances image

I can also try to decimate the tie point cloud and test again...

antoinebio commented 12 months ago

With 75% decimation of the initial tie point cloud from the Metashape alignment (25% of points remaining, chosen at random, as the number of points at GS initialisation), here it is... So it's not a question of the number of 3D points input to the training...

image

image

antoinebio commented 11 months ago

I tested another dataset (close-range photogrammetry) from a Sony Alpha 6000 DSLR,

but I get this issue during training (a few seconds after launching the command):

image

Do my inputs look consistent?

image

Also please note that both folders (images and images_4) have the same number of photos.

yzslab commented 11 months ago

Changing torch.floor back to torch.round may fix this problem.


antoinebio commented 11 months ago

indeed ... tkx

antoinebio commented 11 months ago

Hi again @yzslab, I have a large dataset (~1600 images, 6454 x 3898 px) that I am training with your repo.

image

and each iteration takes around 30 min.

This is too long to train.

I don't know if I can end or cleanly stop the process at a given EPOCH...

Is it equivalent to "you can resume training from a checkpoint through the option --ckpt_path: python main.py fit --config ... --data.path ... --ckpt_path CKPT_FILE_PATH"?

I also tried a new training run with the addition of downscaled images = 8 (an images_8 folder added alongside the images_4 folder) but it fails to train (I keep trying to tweak that torch.round <=> torch.floor ...).

image

If I keep using that larger_dataset.yaml,

Should I change the eval value, e.g. eval_ratio?

Can I monitor or change the values below from config.yaml?

image


    # lightning.pytorch==2.0.9.post0
    seed_everything: 42
    trainer:
      accelerator: gpu
      strategy: auto
      devices: 1
      num_nodes: 1
      precision: 32-true
      logger:
        class_path: lightning.pytorch.loggers.TensorBoardLogger
        init_args:
          save_dir: C:\gaussian-splatting-lightning\outputs\DRONE_SPDP
      callbacks: null
      fast_dev_run: false
      min_epochs: null
      min_steps: null
      max_time: null
      limit_train_batches: null
      limit_val_batches: null
      limit_test_batches: null
      limit_predict_batches: null
      overfit_batches: 0.0
      val_check_interval: null
      check_val_every_n_epoch: 1
      num_sanity_val_steps: 1
      log_every_n_steps: null
      enable_checkpointing: false
      enable_progress_bar: null
      enable_model_summary: null
      accumulate_grad_batches: 1
      gradient_clip_val: null
      gradient_clip_algorithm: null
      deterministic: null
      benchmark: null
      inference_mode: true
      use_distributed_sampler: false
      profiler: null
      detect_anomaly: false
      barebones: false
      plugins: null
      sync_batchnorm: false
      reload_dataloaders_every_n_epochs: 0
      default_root_dir: null
    model:
      gaussian:
        optimization:
          position_lr_init: 1.6e-05
          position_lr_final: 1.6e-06
          position_lr_delay_mult: 0.01
          position_lr_max_steps: 30000.0
          feature_lr: 0.0025
          opacity_lr: 0.05
          scaling_lr: 0.001
          rotation_lr: 0.001
          percent_dense: 0.01
          lambda_dssim: 0.2
          densification_interval: 100
          opacity_reset_interval: 3000
          densify_from_iter: 500
          densify_until_iter: 15000
          densify_grad_threshold: 0.0002
          rgb_diff_loss: l1
          sh_degree: 3
          camera_extent_factor: 1.0
          background_color:

yzslab commented 11 months ago

I also tried a new training run with the addition of downscaled images = 8 (an images_8 folder added alongside the images_4 folder) but it fails to train (I keep trying to tweak that torch.round <=> torch.floor ...).

6454 / 8 = 806.75. Please check whether you have put the 8× down-sampled images into images_8, and whether the dimensions in the camera intrinsics are 6454×3898.
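
The arithmetic behind that check, and why the round/floor choice matters for this dataset:

    import math

    width, factor = 6454, 8
    print(width / factor)              # 806.75
    print(round(width / factor))       # 807 -> a resizer that rounds to the nearest integer
    print(math.floor(width / factor))  # 806 -> a resizer that truncates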

antoinebio commented 11 months ago

I also tried a new training run with the addition of downscaled images = 8 (an images_8 folder added alongside the images_4 folder) but it fails to train (I keep trying to tweak that torch.round <=> torch.floor ...).

6454 / 8 = 806.75. Please check whether you have put the 8× down-sampled images into images_8, and whether the dimensions in the camera intrinsics are 6454×3898.

Downscale = 8 means 12.5% of the original image size? Am I correct?

antoinebio commented 11 months ago

with torch.round

it gives

image

and with torch.floor

It gives

image

so now if I try

image

The process is not sped up, but I can check intermediate checkpoints.

How can I end the training at a certain EPOCH?

yzslab commented 11 months ago

How can I end the training at a certain EPOCH?

--max_steps -1 --max_epochs YOUR_TARGET_EPOCH

yzslab commented 11 months ago

Did you use ImageMagick to resize the images?

yzslab commented 11 months ago

According to the ImageMagick documentation:

pixel size of the image will be rounded to the nearest integer

So the 8× down-sampled image should be 807×487. I tried just now and it is indeed 807. But why is your image width 806?

antoinebio commented 11 months ago

I passed the test with downscale = 8,

but the render is not 100% satisfactory (even though the training took 30 min...)

image

image

So it's a trade-off between training time and render quality (remaining noisy splats)...

I also modified densify_until_iter: to 5000 (15000 was the default value).

What exactly is that densify setting? Is it a question of texture complexity in my source images?

antoinebio commented 11 months ago

According to the ImageMagick documentation:

pixel size of the image will be rounded to the nearest integer

So the 8× down-sampled image should be 807×487. I tried just now and it is indeed 807. But why is your image width 806?

I used an equivalent batch resampling tool (IrfanView, FastStone viewer); I will try with ImageMagick too...