NVlabs / neuralangelo

Official implementation of "Neuralangelo: High-Fidelity Neural Surface Reconstruction" (CVPR 2023)
https://research.nvidia.com/labs/dir/neuralangelo/

Recommendations for video quality? / Bad extracted mesh #88

Open iam-machine opened 1 year ago

iam-machine commented 1 year ago

Hi ;) I was testing the pipeline with a few different videos I shot outdoors and at home. The 4k 30 fps video resulted in a very long COLMAP process (the whole evening or even more, I wasn't counting the hours). Hence my question: are there any limits on the video parameters that can be used for Neuralangelo? While shooting a street that has a lot of fine detail, I noticed that in FullHD the details are not as visible as in 4k, which can affect the quality of the models in the end, and that's why I decided to go with 4k. But now I see that processing 4k takes too long even at the COLMAP stage. Also, is there a recommended fps? And what about the length of the video? I was filming very, very slowly to minimize motion blur from the phone, and since the location is large, the video ended up being 20 minutes long. And considering that I was filming in 4k, the fact that the video is converted to many jpg images in the end makes 4k a little bit useless, no? Maybe if the conversion were done to png or something, the quality of the pictures would be better. Just a thought in passing; I'm interested to hear your opinion, as you certainly know best how things work. If I did something very wrong, please let me know :)

mli0603 commented 1 year ago

Hi @iam-machine

Good to hear that you have found some interesting places to reconstruct :D Most of your intuitions are correct.

Both of the above will make COLMAP faster. I hope this helps!
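As a rough illustration of why the number of extracted frames matters so much for COLMAP time, here is a minimal sketch of the arithmetic for the 20-minute, 30 fps video described above. The helper names `frame_count` and `exhaustive_pairs` are hypothetical, and the pair count assumes COLMAP's exhaustive matching, where every image is compared with every other image; sequential matching would scale closer to linearly.

```python
def frame_count(duration_s: float, fps: float, keep_every: int = 1) -> int:
    """Frames kept when taking every `keep_every`-th frame of the video."""
    total = int(round(duration_s * fps))
    # Ceiling division: frames 0, keep_every, 2 * keep_every, ... are kept.
    return (total + keep_every - 1) // keep_every


def exhaustive_pairs(n: int) -> int:
    """Image pairs an exhaustive matcher has to consider for n images."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    duration_s = 20 * 60  # 20-minute video, as in the question above
    fps = 30
    for keep_every in (1, 15, 30):
        n = frame_count(duration_s, fps, keep_every)
        print(f"keep every {keep_every:>2} frames: {n:>6} images, "
              f"{exhaustive_pairs(n):>12} candidate pairs")
```

Keeping one frame per second instead of all 30 cuts the image count by 30x, and under the exhaustive-matching assumption the pair count drops by roughly 900x.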

iam-machine commented 1 year ago

@mli0603 Hi :) So, I pulled an update from GitHub, used the latest version of Neuralangelo for the whole pipeline, and got awful results for the 4k video. I thought, "Well, maybe Neuralangelo simply choked on 4k? I should try 1920x1080 then," and I did. Same street, just in 1920x1080, 30 fps. I waited for the full training to finish and checked only the last checkpoint, which I got after 500k iters. The result was the same as when I tried the 4k video the previous time, and as when I tried different hyperparameters before that (that's another story... I first tried processing the 4k video with hyperparameters changed to speed up the training, and got this strange box full of blobs in the end. I thought it was because of the hyperparameters and tried training at 4k with default settings. Same result. And then I thought I kept getting the same bad result because of the 4k, and tried FullHD... and you know). That's strange, because everything was okay with the lego and toy examples. But I worked with them a long time ago, and I don't know if I would have success with them now; maybe some update causes the output models to be like that, who knows. Maybe I messed up the bounding box? I wasn't sure how small or how big it should be; I was just trying not to leave a lot of free space and not to cut off important pieces of the scene. I will attach some W&B images and screenshots here.

W&B images, 1920x1080 video, 500k iters, default settings:

Screenshot 2023-09-01 232141

Meanwhile the mesh, extracted after the training was done:

Screenshot 2023-09-01 230423

The normal and render images look a little bit strange to me; they are not as detailed as I would expect them to be at 500k iters.

Screenshot 2023-09-01 232740

And the bounding box, in case it's me who is the cause of all problems here:

Screenshot 2023-08-31 222027

mli0603 commented 1 year ago

Hi @iam-machine

Based on the quality of normal maps, I would say the training fails to converge. It looks like the bounding box is fairly loose (not sure if it is intended) and can be adjusted. Without looking at the data, it is hard to tell if something else is off. Have you checked if COLMAP poses are correct?
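One way to sanity-check the COLMAP poses is to recover the camera centers and see whether they trace the path that was actually walked. The sketch below assumes the model was exported in COLMAP's text format, where each image line in `images.txt` reads `IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME` and is followed by a line of 2D points. The world-to-camera pose `(R, t)` gives the camera center as `C = -R^T t`. The function names are hypothetical helpers, not part of Neuralangelo or COLMAP.

```python
def quat_to_rotmat(qw, qx, qy, qz):
    """Rotation matrix of a unit quaternion (w, x, y, z)."""
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw), 2 * (qx * qz + qy * qw)],
        [2 * (qx * qy + qz * qw), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw)],
        [2 * (qx * qz - qy * qw), 2 * (qy * qz + qx * qw), 1 - 2 * (qx * qx + qy * qy)],
    ]


def camera_centers(lines):
    """Map image name -> camera center (x, y, z) from images.txt lines."""
    # Drop comment lines; keep empty lines, since the 2D-point line may be empty.
    data = [ln.rstrip("\n") for ln in lines if not ln.startswith("#")]
    centers = {}
    for ln in data[0::2]:  # every other line holds an image pose
        parts = ln.split()
        if len(parts) < 10:
            continue
        qw, qx, qy, qz, tx, ty, tz = map(float, parts[1:8])
        rot = quat_to_rotmat(qw, qx, qy, qz)
        t = (tx, ty, tz)
        # C = -R^T t
        centers[parts[9]] = tuple(
            -sum(rot[j][i] * t[j] for j in range(3)) for i in range(3)
        )
    return centers


if __name__ == "__main__":
    # Made-up sample lines for illustration only, not real data.
    sample = [
        "# Image list with two lines of data per image:\n",
        "1 1 0 0 0 1 2 3 1 frame_0001.jpg\n",
        "\n",
    ]
    for name, center in camera_centers(sample).items():
        print(name, center)
```

If the printed centers jump around or collapse to a point instead of following the filmed path, the poses are likely wrong. Their extent also gives a rough reference for how large the bounding box needs to be.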