zju3dv / EasyVolcap

[SIGGRAPH Asia 2023 (Technical Communications)] EasyVolcap: Accelerating Neural Volumetric Video Research

Bad result in Neural3DV dataset #5

Closed adkAurora closed 6 months ago

adkAurora commented 6 months ago

I used the Neural3DV dataset sear_steak, followed the dataset conversion script neural3dv_to_easyvolcap.py to generate the yml files, and trained l3mhet using the config in configs/exps/l3mhet/l3mhet_sear_steak.yaml. The result is very bad: the val PSNR is only 6.36.

2023-12-19 10:44:27.029948 easyvolcap.runners.evaluators.volumetric_video_evaluator -> summarize:                                             volumetric_video_evaluator.py:79
                           {                                                                                                                                                  
                               'psnr_mean': 6.362307601504856,                                                                                                                
                               'psnr_std': 0.16340760990474038,                                                                                                               
                               'ssim_mean': 0.018442549,                                                                                                                      
                               'ssim_std': 0.0059690047,                                                                                                                      
                               'lpips_mean': 0.676931189166175,                                                                                                               
                               'lpips_std': 0.010196555702566121                                                                                                              
                           }   
dendenxu commented 6 months ago

Hi, could you please paste a sample output image here? There are several other things to check if the results do not look good.

  1. Check whether the camera parameters look reasonable: we provide a script for visualizing the cameras, scripts/tools/visualize_cameras; it should output a ply file giving a rough visualization of the camera parameters.
  2. Check with an inference-based method like ENeRFi. You can directly run rendering with ENeRFi in the GUI; try tuning near, far and bounds to see if the result gets better.
  3. Train a static model to see if it converges: append configs/specs/static.yaml to your experiment configuration (located in exps), or just add dataloader_cfg.dataset_cfg.frame_sample=0,1,1 val_dataloader_cfg.dataset_cfg.frame_sample=0,1,1 to any command line regarding the dataset.
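For intuition on what the camera-visualization ply in point 1 might contain, here is a minimal sketch (this is illustrative only, not the repo's scripts/tools/visualize_cameras; the function name and conventions are assumptions): one point is written per camera center, computed as C = -R^T T for a world-to-camera extrinsic [R|T], so you can eyeball the rig layout in any ply viewer.

```python
import numpy as np

def camera_centers_to_ply(Rs, Ts, path):
    """Write one point per camera center to an ASCII .ply file (illustrative helper)."""
    centers = [-R.T @ T for R, T in zip(Rs, Ts)]  # camera center in world space
    with open(path, 'w') as f:
        f.write('ply\nformat ascii 1.0\n')
        f.write(f'element vertex {len(centers)}\n')
        f.write('property float x\nproperty float y\nproperty float z\n')
        f.write('end_header\n')
        for c in centers:
            f.write(f'{c[0]:.6f} {c[1]:.6f} {c[2]:.6f}\n')

# Example: two cameras with identity rotation, the second offset along +x
Rs = [np.eye(3), np.eye(3)]
Ts = [np.zeros(3), np.array([-1.0, 0.0, 0.0])]
camera_centers_to_ply(Rs, Ts, 'cameras.ply')
```

If the resulting points form a plausible ring or array around the scene, the extrinsics are at least roughly sane.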
adkAurora commented 6 months ago

Hi ~ thanks for your reply

  1. I generated the cameras.ply; everything looks okay here. (image attached)
  2. The ENeRFi output is totally empty; here is the error.png of frame one. (image attached)
dendenxu commented 6 months ago

Sorry for the late reply! I double-checked the pre-processing script and found that we internally used a different conversion path (neural3dv -> nerfstudio -> easyvolcap), so the neural3dv_to_easyvolcap script was not thoroughly tested. This issue should be fixed in my latest commit, and you should be able to train an l3mhet model on the dataset correctly after converting with neural3dv_to_easyvolcap.

I recommend checking the implementation by training on a single frame first:

# Train on the first frame
evc -c configs/exps/l3mhet/l3mhet_sear_steak.yaml,configs/specs/static.yaml exp_name=l3mhet_sear_steak_static runner_cfg.save_latest_ep=1 runner_cfg.eval_ep=1 runner_cfg.resume=False

# Render spiral path
evc -t test -c configs/exps/l3mhet/l3mhet_sear_steak.yaml,configs/specs/static.yaml,configs/specs/spiral.yaml exp_name=l3mhet_sear_steak_static val_dataloader_cfg.dataset_cfg.render_size=540,960

# Fuse depth maps for visualization
python scripts/tools/volume_fusion.py -- -c configs/exps/l3mhet/l3mhet_sear_steak.yaml,configs/specs/static.yaml exp_name=l3mhet_sear_steak_static val_dataloader_cfg.dataset_cfg.ratio=0.05
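For intuition on what the depth-fusion step above produces, here is a minimal back-projection sketch (illustrative only, not the actual scripts/tools/volume_fusion.py; the function name and conventions are assumptions): each depth pixel is lifted through the intrinsics K and a world-to-camera pose [R|T] into a world-space point, and fusing these clouds across views reveals whether the cameras and depths agree.

```python
import numpy as np

def backproject_depth(depth, K, R, T):
    """Lift an HxW depth map to an (H*W, 3) world-space point cloud.

    Assumes depth is measured along the camera z-axis and [R|T] maps world -> camera.
    """
    H, W = depth.shape
    v, u = np.mgrid[0:H, 0:W]
    pix = np.stack([u, v, np.ones_like(u)], -1).reshape(-1, 3).astype(np.float64)
    cam = (np.linalg.inv(K) @ pix.T) * depth.reshape(-1)  # pixel rays scaled by depth
    world = R.T @ (cam - T[:, None])                      # camera -> world
    return world.T

# Toy example: constant depth 2, principal point at pixel (2, 2)
K = np.array([[100., 0., 2.], [0., 100., 2.], [0., 0., 1.]])
pts = backproject_depth(np.full((4, 4), 2.0), K, np.eye(3), np.zeros(3))
# the pixel on the principal axis lands at (0, 0, 2) in front of the camera
```

If the fused clouds from different views line up, the camera parameters and reconstructed depths are consistent.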

Another recommended way to check the camera parameters is to render an ENeRFi model on the dataset:

# Construct the experiments manually and render on GUI
evc -t gui -c configs/base.yaml,configs/models/enerfi.yaml,configs/datasets/neural3dv/sear_steak.yaml,configs/specs/vf0.yaml exp_name=enerfi_dtu model_cfg.sampler_cfg.n_planes=32,8 model_cfg.sampler_cfg.n_samples=4,1  viewer_cfg.window_size=540,960

Could you please check whether the issue has also been fixed on your end?

adkAurora commented 6 months ago

Thank you for your attention and effort~ I have tried the new code; there are some new problems.

  1. Training on a single frame (l3mhet_sear_steak_static) gives a reasonable result with a mean PSNR of about 35, but I have a new question about the val render result below: what are the strange vertical lines inside the green box? (image attached)
  2. Training on all frames consistently fails during the initial dataset loading stage. Loading the entire dataset twice is not only extremely time-consuming but also prone to termination. Do you have any suggestions on how to solve this? I have attempted the process three times, and it was terminated each time. I remember that in the previous version of the code, the training images were loaded just once at the start, while the validation images were loaded later on.

EasyVolcap# evc -c configs/exps/l3mhet/l3mhet_sear_steak.yaml
2023-12-20 12:55:11 easyvolcap.scripts.main -> preflight: Starting experiment: l3mhet_sear_steak, command: train
2023-12-20 13:09:42 Loading imgs bytes for neural3dv/sear_steak TRAIN 100% 6,300/6,300 0:14:30 8.316 it/s
2023-12-20 13:16:36 Caching imgs for neural3dv/sear_steak TRAIN 22% 1,414/6,300 0:06:53 < 7:30:38 0.181 it/s
Killed

dendenxu commented 6 months ago

Hi @adkAurora, thanks for the follow up!

  1. The vertical lines look like the bounding box we manually defined. The NeRF-based models in EasyVolcap only sample points inside the bounding box, and the sampling behavior outside of the bbox is undefined. This suggests the bounding box we set is too small. You can try tuning the bounds inside configs/datasets/neural3dv/neural3dv.yaml.
  2. Most of the time, if a process is killed without warning, we're consuming too much memory (RAM). Try setting the swap size a little larger? EasyVolcap caches the input images as jpeg bytes in main memory; for 20,000 1K images, this should require around 20GB in my experience.
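To make point 1 concrete, here is a minimal sketch (with assumed shapes and names, not EasyVolcap internals) of the axis-aligned bounds test a NeRF-style sampler applies: points outside the box are simply never sampled, so geometry there is never modeled and can surface as hard artifacts like those vertical lines.

```python
import numpy as np

def inside_bounds(points, bounds):
    """points: (N, 3); bounds: (2, 3) as [[xmin, ymin, zmin], [xmax, ymax, zmax]].

    Returns a boolean mask; only True points contribute to volume rendering.
    """
    return np.all((points >= bounds[0]) & (points <= bounds[1]), axis=-1)

bounds = np.array([[-1., -1., -1.], [1., 1., 1.]])
pts = np.array([[0., 0., 0.], [2., 0., 0.]])
mask = inside_bounds(pts, bounds)  # the second point falls outside the bbox
```

Enlarging the bounds in the dataset config widens this box so that the scene content near the image borders is actually sampled.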
adkAurora commented 6 months ago

Problem solved, thanks~