ayaanzhaque / instruct-nerf2nerf

Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions (ICCV 2023)
https://instruct-nerf2nerf.github.io/
MIT License

Training is very slow #83

Closed: coollzd closed this issue 3 months ago

coollzd commented 6 months ago

Why does it take 19 hours to train in2n on two 3090s?

ayaanzhaque commented 6 months ago

1. What resolution are the input images? If they are above 512, training will be very slow; you can use the downscale factor as discussed in the README.
2. You likely only need 10k iterations for the edit to converge, even though the progress bar runs up to 30k iterations.
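For example, the downscale factor is a dataparser override appended to the end of the training command (a sketch; the data path is a placeholder, and the same override should apply to in2n):

ns-train nerfacto --data data/your-scene nerfstudio-data --downscale-factor 2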

umutyazgan commented 4 months ago

I have a similar issue. I'm trying to replicate your results from the paper, using the bear example. The images are 994*738; I'm aware these are larger than 512*512, but I wanted to use this example as-is. I'm running all of this on an NVIDIA L4 GPU with 24 GB VRAM. I built the initial NeRF with this command:

ns-train nerfacto --data data/bear/

And then I ran in2n like this:

ns-train in2n --data data/bear --load-dir outputs/bear/nerfacto/2024-03-06_194625/nerfstudio_models/ --pipeline.prompt "Turn the bear into a grizzly bear" --pipeline.guidance-scale 6.5 --pipeline.image-guidance-scale 1.5 --max-num-iterations 4000

Initially the ETA was around 1 hour, which looked correct. Then, when that timer reached 0, the ETA suddenly jumped to 24 hours (screenshot attached). I thought maybe the editing task was done, so I checked the viewer (renders attached). There is some progress, but it doesn't look like the results from the paper.

Before this, I tried running this and other examples, including my own image data (512*512), with 15000 and 30000 max iterations. Each time, it started with an ETA of 21+ hours. The paper states that training takes around an hour on a single Titan RTX. What am I doing wrong here?

ayaanzhaque commented 4 months ago

The ETA is pretty meaningless in this case, since it is extrapolated under the hood from how many iterations have already run. If you are training with 512-resolution images, your edit should converge in around an hour. Just check the viewer periodically.

umutyazgan commented 4 months ago

Thanks for the quick response. I resized the input images from the bear example by half. Unfortunately, the --downscale-factor argument did not work with the nerfacto method for me, and I couldn't figure out how to scale down the images manually (e.g., do I also scale down the keyframes under the Mar5at1-50pm-poly dir?). So I just took the already-downscaled images under the images_2 dir and ran COLMAP over them:

ns-process-data images --data input/data/bear_resized/ --output-dir data/bear_resized/
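(For reference, a manual downscale could also be done with ImageMagick before running COLMAP. A minimal sketch, assuming JPEG originals; ImageMagick and these paths are assumptions, not from the thread:

mkdir -p input/data/bear_resized
cp data/bear/images/*.jpg input/data/bear_resized/
mogrify -resize 50% input/data/bear_resized/*.jpg

)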

Then I trained a new NeRF:

ns-train nerfacto --data data/bear_resized/

Then I ran in2n for 4000 iterations, which took less than an hour:

ns-train in2n --data data/bear_resized/ --load-dir outputs/bear_resized/nerfacto/2024-03-07_111958/nerfstudio_models/ --pipeline.prompt "Turn the bear into a grizzly bear" --pipeline.guidance-scale 6.5 --pipeline.image-guidance-scale 1.5 --max-num-iterations 4000

The results actually look nice (render attached). Now I will try the same with my own image sets. One question, though: were the results in the paper also generated with images at a resolution lower than 512*512? If not, how long on average did it take you to train them? Thanks.

ayaanzhaque commented 4 months ago

Cool! Yeah, for the paper we trained most of the results at around 512. For the bear dataset we just used a downscale factor of 2. Why doesn't it work for you? You can add this to the end of the command:

nerfstudio-data --downscale-factor 2
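Combined with the earlier command from this thread, the full invocation would look like:

ns-train nerfacto --data data/bear nerfstudio-data --downscale-factor 2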

umutyazgan commented 3 months ago

Ah okay, I think I misread the README. I tried to do it like this: ns-train nerfacto --downscale-factor 2 --data data/bear, without the nerfstudio-data part. My bad.

ayaanzhaque commented 3 months ago

Great, no problem