coollzd closed this issue 3 months ago
1) What resolution are your input images? If they are above 512, training will be very slow; you can use the downscale factor as discussed in the README.
2) You likely only need 10k iterations for the edit to converge, but the progress bar runs up to 30k iterations.
I have a similar issue. I'm trying to replicate the results from the paper, using the bear example. The images are 994×738; I'm aware they are larger than 512×512, but I wanted to use this example as-is. I'm running all of this on an NVIDIA L4 GPU with 24 GB VRAM. I built the initial NeRF with this command:
ns-train nerfacto --data data/bear/
And then I ran in2n like this:
ns-train in2n --data data/bear --load-dir outputs/bear/nerfacto/2024-03-06_194625/nerfstudio_models/ --pipeline.prompt "Turn the bear into a grizzly bear" --pipeline.guidance-scale 6.5 --pipeline.image-guidance-scale 1.5 --max-num-iterations 4000
Initially the ETA was around 1 hr, which looked correct. Then, when that timer reached 0, the ETA suddenly jumped to 24 h.
I thought maybe the editing task was done, so I checked the viewer.
There is some progress but it doesn't look like the results from the paper.
Before this, I tried running this and other examples, including my own image data (512×512), with 15000 and 30000 max iterations. Each time, the ETA started at 21+ hours. The paper states that training takes around an hour on a single Titan RTX. What am I doing wrong here?
The ETA is pretty meaningless in this case, since it is just extrapolated from how many iterations have already run. If you are training with 512-resolution images, your edit should converge in around an hour. Just check the viewer periodically.
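The jump makes sense if the ETA is a naive extrapolation, something like the following sketch (hypothetical illustration, not the actual nerfstudio code):

```python
def estimate_eta(elapsed_s: float, iters_done: int, max_iters: int) -> float:
    """Naive ETA: extrapolate seconds-per-iteration over the remaining iterations."""
    if iters_done == 0:
        return float("inf")  # no data yet
    return elapsed_s / iters_done * (max_iters - iters_done)

# If the assumed max iteration count changes mid-run, the ETA jumps:
# estimate_eta(3600, 4000, 4000)  -> 0.0
# estimate_eta(3600, 4000, 30000) -> 23400.0  (about 6.5 hours)
```

With an estimate like this, any change in per-iteration cost (e.g. the slower diffusion-guided in2n phase) or in the assumed max iteration count makes the displayed ETA swing wildly.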
Thanks for the quick response. I resized the input images from the bear example by half. Unfortunately, the `--downscale-factor` argument does not work with the `nerfacto` method, and I couldn't figure out how to scale down the images manually (for example, do I also scale down the keyframes under the `Mar5at1-50pm-poly` dir?). So I just took the already-downscaled images under the `images_2` dir and ran COLMAP over them:
ns-process-data images --data input/data/bear_resized/ --output-dir data/bear_resized/
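For anyone who does want to downscale the raw images manually before re-running COLMAP, a minimal sketch using Pillow (the paths are illustrative, not from the repo):

```python
from pathlib import Path

from PIL import Image  # assumes Pillow is installed


def downscale_images(src_dir: str, dst_dir: str, factor: int = 2) -> None:
    """Resize every image in src_dir by the given integer factor into dst_dir."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for path in sorted(src.iterdir()):
        if path.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue
        img = Image.open(path)
        img = img.resize((img.width // factor, img.height // factor), Image.LANCZOS)
        img.save(dst / path.name)


# Example (illustrative paths -- adjust to your dataset layout):
# downscale_images("data/bear/images", "input/data/bear_resized", factor=2)
```

Since COLMAP is re-run on the resized images anyway, the poses are regenerated from scratch, so only the images themselves need resizing.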
Then I trained a new NeRF:
ns-train nerfacto --data data/bear_resized/
Then I ran in2n for 4000 iterations, which took less than an hour:
ns-train in2n --data data/bear_resized/ --load-dir outputs/bear_resized/nerfacto/2024-03-07_111958/nerfstudio_models/ --pipeline.prompt "Turn the bear into a grizzly bear" --pipeline.guidance-scale 6.5 --pipeline.image-guidance-scale 1.5 --max-num-iterations 4000
The results actually look nice.
Now I will try the same with my own image sets. One question, though: were the results in the paper also generated from images with resolution lower than 512×512? If not, how long did it take you on average to train them?
Thanks.
Cool! Ya, for the paper we trained most of the results at around 512. For the bear dataset we just used a downscale factor of 2. Why doesn't it work for you? You can add this to the end of the command:
nerfstudio-data --downscale-factor 2
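Assembled with the training command from earlier in the thread, that would look like (a CLI fragment, not tested here):

```shell
ns-train nerfacto --data data/bear nerfstudio-data --downscale-factor 2
```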
Ah okay, I think I misread the README. I tried to do it like this: `ns-train nerfacto --downscale-factor 2 --data data/bear`, without the `nerfstudio-data` part. My bad.
Great, no problem
Why do I need 19 hours to train in2n on two 3090s?