wanmeihuali / taichi_3d_gaussian_splatting

An unofficial implementation of the paper "3D Gaussian Splatting for Real-Time Radiance Field Rendering" in Taichi Lang.
Apache License 2.0

Any suggestion on parameter tuning for large-scale dataset/scene #84

Open HrsPythonix opened 1 year ago

HrsPythonix commented 1 year ago

I'd like to test 3d_gaussian_splatting on a larger-scale scene. Could you give some suggestions on how to modify the parameters in the config?

wanmeihuali commented 1 year ago

I haven't tested it on large scenes yet, but someone did tell me it works on larger scenes.

I can provide some further explanation for the current parameters:

adaptive-controller-config:
  # The threshold on the gradient of the point position in the camera plane, over all affected pixels
  # decrease it -> more points, slower train/render speed, better quality
  densification-view-space-position-gradients-threshold: 6e-6 

  # Similar, but averaged by the number of pixels affected; not really used
  densification_view_avg_space_position_gradients_threshold: 1e3

  # Similar, but averaged over all frames; not really used
  densification_multi_frame_view_space_position_gradients_threshold: 1e3

  # Similar, but averaged over all frames and pixels; not really used
  densification_multi_frame_view_pixel_avg_space_position_gradients_threshold: 4e3

  # Similar, but on the gradient of the point position in 3D space; not really used. In theory it should add more points to the background and improve background quality, but I haven't found a good value for it.
  densification_multi_frame_position_gradients_threshold: 1e3
  gaussian-split-factor-phi: 1.6

  # How often (in iterations) points are split/cloned
  num-iterations-densify: 100

  # How often (in iterations) all point opacities are reset. If you have more images, you may want to increase this value; otherwise some points may still be transparent.
  num-iterations-reset-alpha: 4000

  # Number of iterations to wait before starting to densify/remove transparent points; increase this value for a larger scene
  num-iterations-warm-up: 500
  reset-alpha-value: -1.9
  transparent-alpha-threshold: -2.0

  # Floater removal is not really used here; if you see floaters during training, try decreasing these thresholds
  floater_num_pixels_threshold: 400000
  floater_near_camrea_num_pixels_threshold: 300000
  iteration_start_remove_floater: 2000
  under_reconstructed_num_pixels_threshold: 32
  enable_sample_from_point: True
gaussian-point-cloud-scene-config:
  # The upper bound on the number of points is max-num-points-ratio * the initial number of points. If you are using COLMAP, 10 should be fine; the demo truck scene ends up with about 430k points.
  max-num-points-ratio: 10.0
  num-of-features: 56

  # Whether to add a sphere as background. It has no effect on the demo scene; if your scene contains a lot of sky, leave it as True.
  add_sphere: True
  initial_alpha: 0.05
  max_initial_covariance: 3000.0
  initial_covariance_ratio: 0.1
increase-color-max-sh-band-interval: 1000.0
log-image-interval: 2000
log-loss-interval: 10
log-metrics-interval: 100
print-metrics-to-console: True
enable_taichi_kernel_profiler: True
log_taichi_kernel_profile_interval: 3000
log_validation_image: True
feature_learning_rate: 0.005
position_learning_rate: 0.00005
position_learning_rate_decay_rate: 0.9947
position_learning_rate_decay_interval: 100
loss-function-config:
  lambda-value: 0.2
  enable_regularization: False
  regularization_weight: 0.005
num-iterations: 30001
pointcloud-parquet-path: '/opt/ml/input/data/training/point_cloud.parquet'
rasterisation-config:

  # The depth of each point needs to be converted into an integer for the radix sort;
  # the sort key is computed as int(depth-to-sort-key-scale * depth). You can increase this value,
  # but make sure the resulting key still fits in int32.
  depth-to-sort-key-scale: 10.0
  # Threshold for the farthest point; can be very large, just make sure depth-to-sort-key-scale * far-plane fits in int32
  far-plane: 2000.0
  # Threshold for the nearest point; needs to be tuned to the scene size
  near-plane: 0.4
  grad_color_factor: 1.0
  grad_s_factor: 1.0
  grad_q_factor: 1.0
  grad_alpha_factor: 1.0
summary-writer-log-dir: /opt/ml/output/data/tensorboard
output-model-dir: /opt/ml/model
train-dataset-json-path: '/opt/ml/input/data/training/train.json'
val-dataset-json-path: '/opt/ml/input/data/training/val.json'
val-interval: 3000
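
To make the comments above concrete, here is an illustrative (and untested) set of overrides that could serve as a starting point for a larger scene; the exact numbers are guesses that would need per-scene tuning:

adaptive-controller-config:
  # lower threshold -> more points where the scene is under-reconstructed (slower, but better quality)
  densification-view-space-position-gradients-threshold: 4e-6
  # wait longer before densifying on a larger scene, as noted above
  num-iterations-warm-up: 1000
  # with more training images, reset opacity less often
  num-iterations-reset-alpha: 6000
gaussian-point-cloud-scene-config:
  # keep the background sphere if the scene contains a lot of sky
  add_sphere: True
rasterisation-config:
  # 10.0 * 20000.0 = 2e5, far below the int32 limit (~2.1e9), so this combination is safe
  depth-to-sort-key-scale: 10.0
  far-plane: 20000.0
  # the near plane also needs to be tuned to the scene scale
  near-plane: 1.0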

For other suggestions: training/inference speed is highly related to the camera resolution and to the number of points per pixel. The boots demo is 1920x1080 and the truck demo is about 1000x1000; I've tried 4K training, but it is super slow, so doing super-resolution with an extra CNN may be a better idea. The number of points per pixel is plotted in TensorBoard; if your training is slow because of a large number of points per pixel, splitting your scene into multiple smaller scenes and using the sphere to fit the background may be a better idea.
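
If 4K inputs are the bottleneck, one generic option (not part of this repo, just a preprocessing idea) is to downscale the images before building the dataset jsons. A minimal sketch with Pillow, using hypothetical folder names:

from pathlib import Path
from PIL import Image

src = Path("images_4k")      # hypothetical folder with the original frames
dst = Path("images_1080p")   # hypothetical output folder for the downscaled frames
dst.mkdir(exist_ok=True)

for p in sorted(src.glob("*.jpg")):
    img = Image.open(p)
    w, h = img.size
    scale = 1080 / h                      # target roughly 1080 pixels in height
    new_size = (round(w * scale), 1080)
    img.resize(new_size, Image.LANCZOS).save(dst / p.name)

Any camera intrinsics stored in train.json/val.json would have to be scaled by the same factor, otherwise the projections will no longer match the downscaled images.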

HrsPythonix commented 1 year ago

Appreciated! I will try them and see whether they work.