LeapLabTHU / ActiveNeRF

Official repository of ActiveNeRF (ECCV2022)

Unable to reproduce ActiveNeRF-CL in Table 2 Setting 1, Synthetic scene. #9

Open hskAlena opened 1 year ago

hskAlena commented 1 year ago

Hello, I'm trying to reproduce the synthetic-scene results in Setting 1, but I can't. I believe the released code is ActiveNeRF-CL, right? Could you check the config settings I used? The average PSNR I got is only about 13.

N_importance = 128
N_rand = 512
N_samples = 64
active_iter = [40000, 80000, 120000, 160000]
basedir = ./logs/materials_active
beta_min = 0.01
choose_k = 4
chunk = 8192
config = configs/blender_active_20k.txt
datadir = ./data/nerf_synthetic/materials
dataset_type = blender
ds_rate = 2
case_name = materials
expname = active_materials
factor = 8
ft_path = None
half_res = False
i_all = 200000
i_embed = 0
i_img = 500
i_print = 500
i_testset = 10000
i_video = 50000
i_weights = 20000
init_image = 4
lindisp = False
llffhold = 8
lrate = 0.0005
lrate_decay = 500
multires = 10
multires_views = 4
netchunk = 16384
netdepth = 8
netdepth_fine = 8
netwidth = 256
netwidth_fine = 256
no_ndc = False
no_reload = False
perturb = 1.0
precrop_frac = 0.5
precrop_iters = 500
raw_noise_std = 0.0
render_factor = 4
render_only = True
render_test = True
spherify = False
testskip = 1
use_viewdirs = True
w = 0.01
white_bkgd = True
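If I'm reading the config right, it implies 4 initial views plus choose_k = 4 new views at each of the four active_iter steps, i.e. 20 views by the end of training. A quick sanity check (assuming that is how init_image, choose_k, and active_iter interact):

```python
# Sanity check of the view budget implied by the config above:
# init_image initial views plus choose_k new views at each active step.
init_image = 4
choose_k = 4
active_iter = [40000, 80000, 120000, 160000]

total_views = init_image + choose_k * len(active_iter)
print(total_views)  # 20, which matches "Setting 1 (20 images)"
```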

hskAlena commented 1 year ago

[image: rendered output]

This is the rendered output at the 200k step when training in Setting 1 (20 images). The final PSNR is 14.60 when I add weight decay to the Adam optimizer and 11.25 when I add nothing.

Similarly, I got this output in the same setting: PSNR 18.73 with weight decay and 13 without it.

[image: rendered output]

How can I improve these outputs to reach the reported average PSNR of 26?
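For reference, by "add weight decay" I mean a change along these lines to the Adam optimizer (a minimal sketch with a stand-in model; 1e-6 is a hypothetical value, not something from the paper or the repo):

```python
import torch

model = torch.nn.Linear(3, 3)   # stand-in for the NeRF MLP, just to make this runnable
lrate = 5e-4                    # matches lrate = 0.0005 in the config above

# Without weight decay (what the released code appears to do):
opt_plain = torch.optim.Adam(model.parameters(), lr=lrate, betas=(0.9, 0.999))

# With weight decay (hypothetical value; this is the only change):
opt_wd = torch.optim.Adam(model.parameters(), lr=lrate,
                          betas=(0.9, 0.999), weight_decay=1e-6)
```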

JiangWenPL commented 1 year ago

Hi hskAlena, we are also experiencing a similar issue. We got PSNR=25 on the chair but PSNR=15 on the lego. Did you experience a similar issue in those scenes?

hskAlena commented 1 year ago

Oh, mine is different. I got PSNR=18 on the chair and PSNR=21 on the lego. Is your setup similar to mine? Could you share your settings? Also, I used seed 0 for the results above.
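By "seed 0" I mean fixing all RNGs before training, roughly like this (a sketch; I'm not sure whether the released code seeds anything by default):

```python
import random
import numpy as np
import torch

def set_seed(seed: int = 0) -> None:
    # Fix Python, NumPy, and PyTorch RNGs so runs are comparable across machines.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op on CPU-only machines

set_seed(0)
```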

RPFey commented 1 year ago

@hskAlena How did you choose the initial views?

hskAlena commented 1 year ago

The original code seems to use 20 initial views according to data/hotdog/transforms_train.json.

However, the paper says they used 4 initial views, so I used the same selection method they use for the LLFF dataset: https://github.com/LeapLabTHU/ActiveNeRF/blob/83f1329c0d9c49e4e11ca1d23dd17cf184625d28/run_nerf.py#L596

Therefore, I modified transforms_train.json to keep only the first 4 views ("./train/r_0", "./train/r_1", "./train/r_2", "./train/r_3") and added views 4-19 to transforms_holdout.json.
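Concretely, the split was done with a small script along these lines (a sketch; the scene path and the transforms_holdout.json name are just what I used in my setup):

```python
import json

SCENE = "data/nerf_synthetic/materials"  # path from my config; adjust as needed

with open(f"{SCENE}/transforms_train.json") as f:
    meta = json.load(f)

init_paths = {f"./train/r_{i}" for i in range(4)}  # r_0 .. r_3
train_frames = [fr for fr in meta["frames"] if fr["file_path"] in init_paths]
holdout_frames = [fr for fr in meta["frames"] if fr["file_path"] not in init_paths]

# Keep only the 4 initial views for training ...
with open(f"{SCENE}/transforms_train.json", "w") as f:
    json.dump({**meta, "frames": train_frames}, f, indent=2)

# ... and move the remaining views (r_4 .. r_19 in my file) into the holdout pool.
with open(f"{SCENE}/transforms_holdout.json", "w") as f:
    json.dump({**meta, "frames": holdout_frames}, f, indent=2)
```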

RPFey commented 1 year ago

OK. We use every 5th image among the first 20 images of the training split ("./train/r_0", "./train/r_5", "./train/r_10", "./train/r_15"). I think that's why we get different results.
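In index terms that's simply (a one-line sketch):

```python
init_views = [f"./train/r_{i}" for i in range(0, 20, 5)]  # r_0, r_5, r_10, r_15
```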

hskAlena commented 1 year ago

@RPFey @Panxuran @LeapLabTHU How many iterations did you train the model for before evaluation? I evaluated the model at 200k iterations. Also, did you use the full 200-image test set for the synthetic scenes?

RPFey commented 1 year ago

We train for 200k iterations. Test skip is 8, i.e., we pick one out of every 8 images in the test set, for a total of 25 images.
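For the test split that subsampling looks like this (a sketch, assuming the standard blender loader behavior of taking frames[::testskip] over the 200 test images):

```python
# testskip = 8 over the 200-image synthetic test split keeps every 8th frame,
# i.e. 25 evaluation images.
test_indices = list(range(0, 200, 8))
print(len(test_indices))  # 25
```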

RPFey commented 9 months ago

@hskAlena Hi, have you reproduced the results in the paper?

hskAlena commented 9 months ago

Nope, I couldn't reproduce the results with the given information. However, I got results close to the reported ones by changing the model, adjusting the hyperparameters, and changing the training scheme. I changed almost everything except the acquisition function and the uncertainty loss (hmm... I changed the coefficients too), haha. I don't know if I can call this "reproduced"...

georgeNakayama commented 7 months ago

Hi @RPFey, I'm experiencing the same issue with reproducing the results in the table. Could you provide the hyperparameters you used to reproduce them? Also, the training variance across different seeds seems large; sometimes the model fails to converge with a different seed. Is this normal? Thanks for the clarification!