ayaanzhaque / instruct-nerf2nerf

Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions (ICCV 2023)
https://instruct-nerf2nerf.github.io/
MIT License

Poor quality when optimized to 15000 steps #101

Open fkcptlst opened 1 month ago

fkcptlst commented 1 month ago

Hi, I'm using the latest version of in2n (2c0d385) with Nerfstudio==1.1.3, and I'm running into issues similar to those described in #60.

I used the following command to edit the face scene:

ns-train in2n --data datasets/face-processed --load-dir outputs/face-processed/nerfacto/2024-08-05_120014/nerfstudio_models --pipeline.prompt "Turn him into a clown" --max-num-iterations 15000 nerfstudio-data --downscale-factor 2

The camera optimizer config is as follows:

    'camera_opt': {
        'optimizer': AdamOptimizerConfig(
            _target=<class 'torch.optim.adam.Adam'>,
            lr=0.0006,
            eps=1e-08,
            max_norm=None,
            weight_decay=0.01
        ),
        'scheduler': ExponentialDecaySchedulerConfig(
            _target=<class 'nerfstudio.engine.schedulers.ExponentialDecayScheduler'>,
            lr_pre_warmup=1e-08,
            lr_final=6e-06,
            warmup_steps=0,
            max_steps=5000,
            ramp='cosine'
        )
    }
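
For reference, here is a rough sketch of what this scheduler config implies for the camera optimizer's learning rate over a 15000-step run. It assumes the scheduler interpolates log-linearly from `lr` to `lr_final` over `max_steps` and then holds `lr_final`; that is my reading of the config, not a copy of the Nerfstudio implementation. `exp_decay_lr` is a hypothetical helper name, and warmup is ignored because `warmup_steps=0` here.

```python
import math


def exp_decay_lr(step, lr_init=6e-4, lr_final=6e-6, max_steps=5000, warmup_steps=0):
    """Approximate learning rate at `step` (assumed log-linear decay, no warmup)."""
    # Fraction of the decay completed, clamped to [0, 1] so the rate
    # stays at lr_final once step exceeds max_steps.
    t = (step - warmup_steps) / (max_steps - warmup_steps)
    t = min(max(t, 0.0), 1.0)
    # Interpolate in log space between the initial and final rates.
    return math.exp(math.log(lr_init) * (1 - t) + math.log(lr_final) * t)


for step in (0, 2500, 5000, 15000):
    print(f"step {step:>5}: lr = {exp_decay_lr(step):.2e}")
```

Under these assumptions the camera learning rate is already at its floor by step 5000, so the last 10000 steps of the edit run with a constant `lr_final`.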

And I'm pretty sure the images are scaled properly, as I've printed the shape of the input images (torch.Size([1, 3, 364, 493])).
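
The check I did amounts to something like the sketch below. The 512-pixel limit is an assumption based on my understanding of the resolution guidance in the in2n README, and `max_side` is a hypothetical helper, not part of in2n or Nerfstudio.

```python
# Assumption: the recommended maximum image side for in2n is 512 pixels.
MAX_RECOMMENDED_SIDE = 512


def max_side(shape):
    """Return the larger spatial dimension of an (N, C, H, W) shape."""
    _, _, height, width = shape
    return max(height, width)


# Shape printed for the input images after --downscale-factor 2.
shape = (1, 3, 364, 493)
print(max_side(shape), max_side(shape) <= MAX_RECOMMENDED_SIDE)
```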

I trained the original scene for 30000 steps, and the rendered results are fine.

This is the edit result I get when using in2n (note how the quality deteriorates with more training steps):

image image image image

image

Is this normal?

I noticed that in #60 the bear scene was trained for only 2k steps and still gave good results. Should I lower the default maximum number of training steps? Would you be so kind as to offer some advice? Thanks!

fkcptlst commented 1 month ago

Additionally, this is the bear scene (30000 steps for nerf pretraining, 15000 steps for in2n editing):

image

The results looked okay at early steps (~2k steps), though.