seoha-kim / Sync-NeRF

[AAAI 2024] Official repository for "Sync-NeRF: Generalizing Dynamic NeRFs to Unsynchronized Videos"

I tried the video dataset and found that the videos need to be processed into a "frames_1" directory or something similar #6

Closed linmi1 closed 3 months ago

linmi1 commented 3 months ago

this is the error log: /home/vision/work/anaconda3/envs/sync-NeRF/lib/python3.8/site-packages/torch/storage.py:11: UserWarning: The NumPy module was reloaded (imported a second time). This can in some cases result in small but subtle issues and is discouraged. import numpy as np Namespace(L1_weight_inital=0.0, L1_weight_rest=0, N_vis=-1, N_voxel_final=262144000, N_voxel_init=16777216, Ortho_weight=0.0, TV_dynamic_factor=3.0, TV_loss_end_iteration=100000, TV_weight_app=1.0, TV_weight_density=1.0, accumulate_decay=0.998, activation='relu', add_timestamp=0, alpha_mask_thre=1e-05, amp=1, basedir='./logs', batch_factor=[8.0, 8.0, 2.0, 1.0], batch_size=128, cam_offset=True, camoffset_lr_factor=0.5, chromakey=False, ckpt=None, config='/home/vision/work/linmi/Sync-NeRF-master/Sync-MixVoxels/configs/plenoptic/cook_spinach.txt', data_dim_color=27, data_dim_density=27, datadir='/home/vision/work/linmi/Sync-NeRF-master/Sync-MixVoxels/video_downsample', dataset_name='llffvideo', dense_alpha=0, densityMode='None', density_shift=-10, diffuse_kernel=0, distance_scale=25, downsample_test=1.0, downsample_train=2.0, dy_loss='l2', dynamic_granularity='point_wise', dynamic_only_ray_start_iteration=-1, dynamic_pool_kernel_size=21, dynamic_reg_weight=0.1, dynamic_threshold=0.9, dynamic_use_volumetric_render=0, dynamic_weight_decay=0.0, expname='cook_spinach', export_mesh=0, far=1.0, fea2denseAct='relu', fea_pe=0, featureC=512, featureD=512, filter_loss_weight=0, filter_threshold=1.0, frame_start=0, gamma_end=0.02, gamma_start=0.02, gaussian=0, idx_view=0, init_dynamic_a=-0.1, init_dynamic_b=0.1, init_dynamic_mean=0.0, init_dynamic_std=0.1, init_dynamic_voxel='none', init_static_a=-0.1, init_static_b=0.1, init_static_mean=0.0, init_static_std=0.1, init_static_voxel='none', interpolation='bilinear', l1gamma=0.001, l1loss=True, lindisp=False, loss_weight_static=1.0, loss_weight_thresh_end=0.0, loss_weight_thresh_start=0.0, lr_basis=0.002, lr_decay_iters=-1, lr_decay_target_ratio=0.1, 
lr_dynamic_basis=0.002, lr_dynamic_init=0.03, lr_init=0.03, lr_upsample_reset=1, meta_config=None, model_name='TensorVMSplit', nSamples=1000000.0, n_dynamic_iters=2000, n_frame_for_static=2, n_frames=270, n_iters=50000, n_iters_pre=10000, n_iters_test=500, n_lamb_sh=[48, 12, 12], n_lamb_sh_dynamic=None, n_lamb_sigma=[16, 4, 4], n_lamb_sigma_dynamic=None, n_layer=2, n_time_embedding=150, n_train_frames=270, ndc_ray=1, near=0.0, net_layer_add=0, netspec_dy_color='i-d-d-o', netspec_dy_density='i-d-d-o', no_load_timequery=False, offset_start_iters=0, optimizer='adam', perturb=1.0, point_wise_dynamic_threshold=0.01, pos_pe=6, progress_refresh_rate=10, ray_sampler='simple', ray_sampler_shift=3000, ray_weight_gamma=1, ray_weighted=0, rel_pos_pe=6, remove_foreground=0, render_only=0, render_path=0, render_path_start=0, render_test=1, render_train=0, render_views=120, rgb_diff_log_thresh=0.2, rgb_diff_weight=0.0, rm_weight_mask_thre=0.0001, rm_weight_mask_thre_static=1e-05, scene_box=[-3.0, -3.0, -1.5], shadingMode='MLP_Fea', shift_std=-1, sigma_decay=0.0, sigma_decay_method='l2', sigma_decay_static=0.0, sigma_diff_log_thresh=0.05, sigma_diff_method='log', sigma_diff_weight=0.0, sigma_entropy_weight=0, sigma_entropy_weight_static=0, sigma_static_thresh=0.1, simple_sample_weight=0, simple_sample_weight_end=0, sparsity_lambda=1.0, ssd_dir='./data/llff/fern', static_branch_only_initial=0, static_dynamic_seperate=1, static_featureC=128, static_loss='l2', static_point_detach=1, static_type='mean', step_ratio=4.0, temperature_end=0.2, temperature_start=10.0, temporal_sampler='simple', temporal_sampler_method='mean', temporal_sampler_replace=1, temporal_variance_threshold=0.02, test_cam_offset=0, test_optim=False, time_freq=10, time_head='timemlprender', time_head_pre='directdyrender', timemlp_lr_factor=5, update_AlphaMask_list=[2500], update_stepratio=None, update_stepratio_iters=None, upsamp_list=[2000, 3000, 4000, 5500], use_cosine_lr_scheduler=1, view_pe=0, vis_every=10000, 
voxel_init_dynamic=0.1, voxel_init_static=0.1, with_depth=False, zero_dynamic_sigma=1, zero_dynamic_sigma_thresh=1e-05)
train from-scratch
Traceback (most recent call last):
  File "/home/vision/work//Sync-NeRF-master/Sync-MixVoxels/train.py", line 1324, in <module>
    reconstruction(args)
  File "/home/vision/work/Sync-NeRF-master/Sync-MixVoxels/train.py", line 231, in reconstruction
    train_dataset = dataset(args.datadir, split='train', downsample=args.downsample_train, is_stack=False, white_bg = args.chromakey,
  File "/home/vision/work/Sync-NeRF-master/Sync-MixVoxels/dataLoader/llff_video.py", line 295, in __init__
    self.read_meta(no_load, optimize_test)
  File "/home/vision/work/Sync-NeRF-master/Sync-MixVoxels/dataLoader/llff_video.py", line 320, in read_meta
    assert len(poses_bounds) == len(self.video_paths), \
AssertionError: Mismatch between number of images and number of poses! Please rerun COLMAP!

linmi1 commented 3 months ago

I found that the blender dataset may not have this problem, but I am using the plenoptic dataset.

seoha-kim commented 3 months ago

It is not an issue with my code but with the original code.

Anyway, check the number of files starting with 'cam' in the specified path. An error can occur if there are folders like 'camxx'. Entries starting with 'cam' should only be '.mp4' files. Put folders starting with 'cam' into the 'frames_2' folder.
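The check described above can be sketched in a few lines of Python. This is only an illustration of the advice, not code from the repository: the function name `split_cam_entries` is hypothetical, and it simply lists which entries starting with 'cam' are '.mp4' files and which are something else (for example folders).

```python
from pathlib import Path


def split_cam_entries(datadir):
    """Return (videos, others) for entries in datadir whose name starts with 'cam'.

    videos: names of regular files ending in .mp4
    others: everything else starting with 'cam' (folders, other file types)
    """
    videos, others = [], []
    for entry in sorted(Path(datadir).iterdir()):
        if not entry.name.startswith("cam"):
            continue  # e.g. poses_bounds.npy or frames_2 are ignored here
        if entry.is_file() and entry.suffix == ".mp4":
            videos.append(entry.name)
        else:
            others.append(entry.name)
    return videos, others
```

If `others` is not empty, those entries are candidates for moving into 'frames_2'. The length of `videos` is the number that the assertion in the traceback compares against the number of poses.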

linmi1 commented 3 months ago

Thank you! I put cam00 through cam20 inside the frames_2 folder. It looks OK. This is my folder layout:

---frames_2
------cam00
--------cam00.mp4
------cam01
--------cam01.mp4
------cam02
--------cam02.mp4
.....

However, I now get a new error about std_path, and I don't know how to obtain the data it expects. This is the error log:

  File "/home/vision/work/linmi/Sync-NeRF-master/Sync-MixVoxels/train.py", line 1324, in <module>
    reconstruction(args)
  File "/home/vision/work/linmi/Sync-NeRF-master/Sync-MixVoxels/train.py", line 231, in reconstruction
    train_dataset = dataset(args.datadir, split='train', downsample=args.downsample_train, is_stack=False, white_bg = args.chromakey,
  File "/home/vision/work/linmi/Sync-NeRF-master/Sync-MixVoxels/dataLoader/llff_video.py", line 295, in __init__
    self.read_meta(no_load, optimize_test)
  File "/home/vision/work/linmi/Sync-NeRF-master/Sync-MixVoxels/dataLoader/llff_video.py", line 385, in read_meta
    assert os.path.isfile(std_path)
AssertionError

linmi1 commented 3 months ago

Maybe it needs camXX_std.npy files.

seoha-kim commented 3 months ago

The folder tree should be as follows: (This follows the same structure as the original Mixvoxels repository.)

image

image

Please use ffmpeg or a similar tool to convert the .mp4 files into .png files at a frame rate of 30 fps. Place the .png files in the subdirectories frames_2/camxx/.
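As a rough sketch of the conversion step, assuming ffmpeg is installed and on PATH: the helper names and the `%04d.png` file naming are hypothetical and not confirmed by the thread, so check what the data loader expects before relying on them. The sketch also does not resize frames.

```python
import subprocess
from pathlib import Path


def ffmpeg_extract_cmd(video, out_dir, fps=30):
    """Build an ffmpeg command that writes PNG frames sampled at `fps`."""
    # Assumption: frames are named 0001.png, 0002.png, ...
    out_pattern = str(Path(out_dir) / "%04d.png")
    return ["ffmpeg", "-i", str(video), "-vf", f"fps={fps}", out_pattern]


def extract_all(datadir, frames_dir="frames_2", fps=30):
    """Run ffmpeg for every cam*.mp4 in datadir, writing to frames_dir/camXX/."""
    datadir = Path(datadir)
    for video in sorted(datadir.glob("cam*.mp4")):
        out_dir = datadir / frames_dir / video.stem
        out_dir.mkdir(parents=True, exist_ok=True)
        # check=True raises CalledProcessError if ffmpeg exits with an error
        subprocess.run(ffmpeg_extract_cmd(video, out_dir, fps), check=True)
```

Calling `extract_all("/path/to/scene")` would create one subdirectory per video under frames_2, matching the layout described above.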

After you run the code, stds will be generated. If the number of stds files does not match the number of videos, an error will occur. In that case, please delete the stds folder and recreate it.
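The stds check can be sketched as follows. The `camXX_std.npy` naming comes from this thread; the location of the stds folder and the helper names are assumptions for illustration, so pass in whatever path the loader actually uses.

```python
import shutil
from pathlib import Path


def stds_match_videos(datadir, stds_dir):
    """True when there is exactly one cam*_std.npy per cam*.mp4."""
    n_videos = len(list(Path(datadir).glob("cam*.mp4")))
    n_stds = len(list(Path(stds_dir).glob("cam*_std.npy")))
    return n_videos == n_stds


def reset_stale_stds(datadir, stds_dir):
    """Delete stds_dir if its file count disagrees with the videos.

    Returns True if the folder was deleted, so the next training run
    can regenerate it from scratch.
    """
    stds_dir = Path(stds_dir)
    if stds_dir.is_dir() and not stds_match_videos(datadir, stds_dir):
        shutil.rmtree(stds_dir)
        return True
    return False
```

A partially written stds folder (for example after an interrupted run) is the case this guards against.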

linmi1 commented 3 months ago

Thanks for your reply! I have arranged the images as you described, and cam00_std.npy through cam20_std.npy were generated. However, the CPU memory overflowed. I then tried to downsample the images (to 192*256) but got a dimension error. I will face the same problem if I want to use this NeRF on my own dataset. How can I solve it? Are there any parameters that control this? Here is the log:

Namespace(L1_weight_inital=0.0, L1_weight_rest=0, N_vis=-1, N_voxel_final=262144000, N_voxel_init=16777216, Ortho_weight=0.0, TV_dynamic_factor=3.0, TV_loss_end_iteration=100000, TV_weight_app=1.0, TV_weight_density=1.0, accumulate_decay=0.998, activation='relu', add_timestamp=0, alpha_mask_thre=1e-05, amp=1, basedir='./logs', batch_factor=[8.0, 8.0, 2.0, 1.0], batch_size=128, cam_offset=True, camoffset_lr_factor=0.5, chromakey=False, ckpt=None, config='/home/vision/work/linmi/Sync-NeRF-master/Sync-MixVoxels/configs/plenoptic/cook_spinach.txt', data_dim_color=27, data_dim_density=27, datadir='/home/vision/work/linmi/Sync-NeRF-master/Sync-MixVoxels/dataset/video_downsample', dataset_name='llffvideo', dense_alpha=0, densityMode='None', density_shift=-10, diffuse_kernel=0, distance_scale=25, downsample_test=1.0, downsample_train=1.0, dy_loss='l2', dynamic_granularity='point_wise', dynamic_only_ray_start_iteration=-1, dynamic_pool_kernel_size=21, dynamic_reg_weight=0.1, dynamic_threshold=0.9, dynamic_use_volumetric_render=0, dynamic_weight_decay=0.0, expname='cook_spinach', export_mesh=0, far=1.0, fea2denseAct='relu', fea_pe=0, featureC=512, featureD=512, filter_loss_weight=0, filter_threshold=1.0, frame_start=0, gamma_end=0.02, gamma_start=0.02, gaussian=0, idx_view=0, init_dynamic_a=-0.1, init_dynamic_b=0.1, init_dynamic_mean=0.0, init_dynamic_std=0.1, init_dynamic_voxel='none', init_static_a=-0.1, init_static_b=0.1, init_static_mean=0.0, init_static_std=0.1, init_static_voxel='none', interpolation='bilinear', l1gamma=0.001, l1loss=True, lindisp=False, loss_weight_static=1.0, loss_weight_thresh_end=0.0, loss_weight_thresh_start=0.0, lr_basis=0.002, lr_decay_iters=-1, lr_decay_target_ratio=0.1, lr_dynamic_basis=0.002, lr_dynamic_init=0.03, lr_init=0.03, lr_upsample_reset=1, meta_config=None, model_name='TensorVMSplit', nSamples=1000000.0, n_dynamic_iters=2000, n_frame_for_static=2, n_frames=270, n_iters=50000, n_iters_pre=10000, n_iters_test=500, n_lamb_sh=[48, 12, 
12], n_lamb_sh_dynamic=None, n_lamb_sigma=[16, 4, 4], n_lamb_sigma_dynamic=None, n_layer=2, n_time_embedding=150, n_train_frames=270, ndc_ray=1, near=0.0, net_layer_add=0, netspec_dy_color='i-d-d-o', netspec_dy_density='i-d-d-o', no_load_timequery=False, offset_start_iters=0, optimizer='adam', perturb=1.0, point_wise_dynamic_threshold=0.01, pos_pe=6, progress_refresh_rate=10, ray_sampler='simple', ray_sampler_shift=3000, ray_weight_gamma=1, ray_weighted=0, rel_pos_pe=6, remove_foreground=0, render_only=0, render_path=0, render_path_start=0, render_test=1, render_train=0, render_views=120, rgb_diff_log_thresh=0.2, rgb_diff_weight=0.0, rm_weight_mask_thre=0.0001, rm_weight_mask_thre_static=1e-05, scene_box=[-3.0, -3.0, -1.5], shadingMode='MLP_Fea', shift_std=-1, sigma_decay=0.0, sigma_decay_method='l2', sigma_decay_static=0.0, sigma_diff_log_thresh=0.05, sigma_diff_method='log', sigma_diff_weight=0.0, sigma_entropy_weight=0, sigma_entropy_weight_static=0, sigma_static_thresh=0.1, simple_sample_weight=0, simple_sample_weight_end=0, sparsity_lambda=1.0, ssd_dir='./data/llff/fern', static_branch_only_initial=0, static_dynamic_seperate=1, static_featureC=128, static_loss='l2', static_point_detach=1, static_type='mean', step_ratio=4.0, temperature_end=0.2, temperature_start=10.0, temporal_sampler='simple', temporal_sampler_method='mean', temporal_sampler_replace=1, temporal_variance_threshold=0.02, test_cam_offset=0, test_optim=False, time_freq=10, time_head='timemlprender', time_head_pre='directdyrender', timemlp_lr_factor=5, update_AlphaMask_list=[2500], update_stepratio=None, update_stepratio_iters=None, upsamp_list=[2000, 3000, 4000, 5500], use_cosine_lr_scheduler=1, view_pe=0, vis_every=10000, voxel_init_dynamic=0.1, voxel_init_static=0.1, with_depth=False, zero_dynamic_sigma=1, zero_dynamic_sigma_thresh=1e-05)
train from-scratch
1.2 167.42230716678975
camlist: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
Traceback (most recent call last):
  File "/home/vision/work/linmi/Sync-NeRF-master/Sync-MixVoxels/train.py", line 1324, in <module>
    reconstruction(args)
  File "/home/vision/work/linmi/Sync-NeRF-master/Sync-MixVoxels/train.py", line 231, in reconstruction
    train_dataset = dataset(args.datadir, split='train', downsample=args.downsample_train, is_stack=False, white_bg = args.chromakey,
  File "/home/vision/work/linmi/Sync-NeRF-master/Sync-MixVoxels/dataLoader/llff_video.py", line 295, in __init__
    self.read_meta(no_load, optimize_test)
  File "/home/vision/work/linmi/Sync-NeRF-master/Sync-MixVoxels/dataLoader/llff_video.py", line 436, in read_meta
    self.dynamic_rays = self.all_rays[dynamic_mask]
IndexError: The shape of the mask [983040] at index 0 does not match the shape of the indexed tensor [109674240, 6] at index 0

seoha-kim commented 3 months ago

I remember seeing this error about a year ago, but unfortunately, I don't remember exactly how I fixed it. I think the error is happening because you downsampled by a factor larger than 2 but didn't rename the frames_ folder to match the downsampling factor (e.g., frames_4). Rename the folder and modify the 'downsample_train' parameter in the config file accordingly. After that, please delete the stds folder and recreate it.

As I mentioned, this error comes from the original MixVoxels code, not ours. Please ask in the original paper's repository for a more accurate answer (link: https://github.com/fengres/mixvoxels). I only wrote the parts of the code that address unsynchronization (time offset, temporal MLP, etc.).
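The two sizes in the IndexError can be checked with a little arithmetic. This is only a reading of the numbers, not something verified against llff_video.py: it assumes the original videos are 2028 x 2704 pixels and that 20 of the 21 cameras are used for training.

```python
mask_len = 983_040        # length of dynamic_mask, from the IndexError
rays_len = 109_674_240    # number of rows in all_rays, from the IndexError

small_h, small_w = 192, 256    # size the images were resized to (from the thread)
full_h, full_w = 2028, 2704    # assumption: native resolution of the videos

# How many whole images of each size fit into each count?
n_cams_from_mask, mask_rest = divmod(mask_len, small_h * small_w)
n_cams_from_rays, rays_rest = divmod(rays_len, full_h * full_w)

print(n_cams_from_mask, mask_rest)   # 20 0
print(n_cams_from_rays, rays_rest)   # 20 0

# Effective downsampling factor implied by the resize
print(full_h / small_h, full_w / small_w)   # 10.5625 10.5625
```

Under these assumptions, the mask covers 20 cameras at 192 x 256 while the rays cover 20 cameras at full resolution, which fits the explanation that the image size on disk and 'downsample_train' disagree. The implied factor of 10.5625 is not an integer, so resizing by a whole factor such as 4 or 8 may be easier to match with a folder name like frames_4.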