NVlabs / neuralangelo

Official implementation of "Neuralangelo: High-Fidelity Neural Surface Reconstruction" (CVPR 2023)
https://research.nvidia.com/labs/dir/neuralangelo/

My test results on LEGO #106

Open Dragonkingpan opened 1 year ago

Dragonkingpan commented 1 year ago

I tested the LEGO dataset: I trained for 10,000 epochs on an A6000 and extracted a mesh (.ply) at resolution 2048. The mesh file is 480 MB (very large). Overall it looks good, but I cannot tell whether it reaches the quality reported in the paper. My training stopped automatically with a message that the maximum number of training iterations had been reached, and I would like to know whether continuing training would give better results. I also extracted the 6,000-epoch result, and to the naked eye the 10,000-epoch result is clearly better than the 6,000-epoch one.

(attached renders: e10000-2048, e10000-2048-2, e10000-2048-3, e10000-2048-4, e10000-2048-5, e10000-2048-6)
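
Would raising the iteration limit and resuming from the last checkpoint be the right way to continue? Something along these lines is what I have in mind (the --max_iter override and the --checkpoint resume flag are my assumptions, following the same --key=value config-override style used later in this thread):

torchrun --nproc_per_node=1 train.py \
    --logdir=logs/{GROUP}/{NAME} \
    --show_pbar \
    --config=projects/neuralangelo/configs/custom/lego.yaml \
    --checkpoint={LAST_CHECKPOINT} \
    --max_iter=1000000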

maomaocun commented 1 year ago

Awesome!

maomaocun commented 1 year ago

It looks very nice, though I have not completed my own run yet.

peipeiguo commented 1 year ago

I feel the quality of my result is not good, and I also want to know how we can make it better. This is my test result on lego: (image)

mli0603 commented 1 year ago

I am looking into potential bugs in the pipeline that may have also caused several other issues reported. I will update once confirmed.

peipeiguo commented 1 year ago

I am looking into potential bugs in the pipeline that may have also caused several other issues reported. I will update once confirmed.

Thanks. Waiting for your good news.

DerKleineLi commented 1 year ago

The lego sequence from the tutorial is generated by NeRF, so it's not surprising to get a blurry result: (image)

If you train with the original nerf-synthetic data, it looks much better: (image)

DiamondGlassDrill commented 1 year ago

@DerKleineLi

What was your training time for each model? And what machine did you use (1 GPU?)

DerKleineLi commented 1 year ago

@DerKleineLi

What was your training time for each model? And what machine did you use (1 GPU?)

Both trained for 400K iterations locally, which takes 1 day on 1 rtx_a6000.

To reproduce the first result, follow the Colab tutorial on your local machine and train with:

torchrun --nproc_per_node=1 train.py \
    --logdir=logs/{GROUP}/{NAME} \
    --show_pbar \
    --config=projects/neuralangelo/configs/custom/lego.yaml \
    --data.readjust.scale=0.5 \
    --validation_iter=99999999

To reproduce the second result, download the official NeRF-synthetic dataset and modify the transforms.json as:

{
    "camera_angle_x": 0.6911112070083618,
    "camera_angle_y": 0.6911112070083618,
    "fl_x": 1111.1110311937682,
    "fl_y": 1111.1110311937682,
    "sk_x": 0.0,
    "sk_y": 0.0,
    "k1": 0.0,
    "k2": 0.0,
    "k3": 0.0,
    "k4": 0.0,
    "p1": 0.0,
    "p2": 0.0,
    "is_fisheye": false,
    "cx": 400,
    "cy": 400,
    "w": 800,
    "h": 800,
    "aabb_scale": 4.0,
    "aabb_range": [
        [
            -1.8305749018288313,
            1.8305749018288313
        ],
        [
            -1.8305749018288313,
            1.8305749018288313
        ],
        [
            -1.8305749018288313,
            1.8305749018288313
        ]
    ],
    "sphere_center": [
        0.0,
        0.0,
        0.0
    ],
    "sphere_radius": 1.8305749018288313, (--data.readjust.scale=0.5 not needed for training)
    "frames": [
        {
            "file_path": "./train/r_0.png", (add ".png" for all frames)
...
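
A minimal Python sketch that applies these edits automatically (a sketch assuming the stock Blender/NeRF transforms_train.json layout; the dataset paths are placeholders, and the sphere/aabb values are simply the ones listed above):

import json
import math

# Patch the stock Blender/NeRF transforms_train.json into the annotated
# transforms.json shown above. Paths are placeholders for your local copy.
with open("lego/transforms_train.json") as f:
    meta = json.load(f)

w = h = 800                                              # nerf-synthetic renders are 800x800
fl = 0.5 * w / math.tan(0.5 * meta["camera_angle_x"])    # ~1111.111 px

meta.update({
    "camera_angle_y": meta["camera_angle_x"],
    "fl_x": fl, "fl_y": fl,
    "sk_x": 0.0, "sk_y": 0.0,
    "k1": 0.0, "k2": 0.0, "k3": 0.0, "k4": 0.0,
    "p1": 0.0, "p2": 0.0,
    "is_fisheye": False,
    "cx": w / 2, "cy": h / 2, "w": w, "h": h,
    "aabb_scale": 4.0,
    "aabb_range": [[-1.8305749018288313, 1.8305749018288313]] * 3,
    "sphere_center": [0.0, 0.0, 0.0],
    "sphere_radius": 1.8305749018288313,
})

for frame in meta["frames"]:
    if not frame["file_path"].endswith(".png"):
        frame["file_path"] += ".png"                     # add the extension for all frames

with open("lego/transforms.json", "w") as f:
    json.dump(meta, f, indent=4)

The frames' transform_matrix entries are left untouched; only the intrinsics and scene bounds are added.
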
mli0603 commented 1 year ago

That's very good insight @DerKleineLi!! Thanks for sharing!

I would like to also add that the tutorial uses COLMAP to obtain the poses instead of using the GT poses. The goal is to get people familiar with our suggested pipeline. COLMAP cannot fully recover the GT focal length and camera poses. Enabling optimization for these parameters could alleviate the problem, which is a useful feature to add to our repo :) Any PR is welcome!
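
For anyone considering such a PR: a hypothetical, BARF-style sketch of what per-camera pose refinement could look like (plain PyTorch, not part of the current codebase; the hook into ray generation is omitted):

import torch
import torch.nn as nn

def so3_exp(w: torch.Tensor) -> torch.Tensor:
    # Rodrigues' formula: axis-angle vectors (N, 3) -> rotation matrices (N, 3, 3).
    theta = w.norm(dim=-1, keepdim=True).clamp(min=1e-8)     # (N, 1)
    k = w / theta                                            # unit rotation axes
    zero = torch.zeros_like(k[:, 0])
    K = torch.stack([
        torch.stack([zero, -k[:, 2], k[:, 1]], dim=-1),
        torch.stack([k[:, 2], zero, -k[:, 0]], dim=-1),
        torch.stack([-k[:, 1], k[:, 0], zero], dim=-1),
    ], dim=-2)                                               # skew-symmetric matrices
    sin, cos = torch.sin(theta)[..., None], torch.cos(theta)[..., None]
    eye = torch.eye(3, dtype=w.dtype, device=w.device)
    return eye + sin * K + (1.0 - cos) * (K @ K)

class PoseRefiner(nn.Module):
    # Learnable SE(3) residuals applied on top of the (noisy) COLMAP poses.
    def __init__(self, num_cameras: int):
        super().__init__()
        self.dr = nn.Parameter(torch.zeros(num_cameras, 3))  # axis-angle residuals
        self.dt = nn.Parameter(torch.zeros(num_cameras, 3))  # translation residuals

    def forward(self, R: torch.Tensor, t: torch.Tensor, idx: torch.Tensor):
        # R: (B, 3, 3) rotations, t: (B, 3) translations, idx: (B,) camera indices.
        return so3_exp(self.dr[idx]) @ R, t + self.dt[idx]

The residuals would be optimized jointly with the SDF/color networks at a small learning rate, and the focal length could be refined the same way.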

maomaocun commented 1 year ago

I want to know how to make the model colorful (textured).

mli0603 commented 1 year ago

Hi @maomaocun

You can follow the README and use --textured to extract a textured mesh.
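
Roughly like this (placeholders for paths; please check the README for the exact flags):

torchrun --nproc_per_node=1 projects/neuralangelo/scripts/extract_mesh.py \
    --config=${CONFIG} \
    --checkpoint=${CHECKPOINT} \
    --output_file=${OUTPUT_MESH} \
    --resolution=2048 \
    --block_res=128 \
    --textured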

chenhsuanlin commented 1 year ago

In our toy example and the Colab, we use the test set of the Lego sequence instead of the training set. This is to simulate a smooth camera trajectory that is amenable to the COLMAP sequential matcher for video sequences. The test-set sequence does not cover as wide a range of viewpoints as the training set, so the results from @DerKleineLi are expected.

qhdqhd commented 1 year ago

(quoting @mli0603's comment above)

Does this mean that neuralangelo is very sensitive to calibration noise?

zhj1013 commented 11 months ago

(quoting @DerKleineLi's instructions above)

@DerKleineLi Thank you very much for your valuable insights; they have been very helpful. Regarding the use of the original nerf-synthetic dataset, I would also like to ask how to replicate the same process on custom data.

DerKleineLi commented 11 months ago

(quoting the instructions above)

@DerKleineLi Thank you very much for your valuable insights; they have been very helpful. Regarding the use of the original nerf-synthetic dataset, I would also like to ask how to replicate the same process on custom data.

No idea what your custom data is, but if it's an object-orbiting video you should process it following the Colab tutorial.

rebecca-lay3rs commented 10 months ago

(quoting @DerKleineLi's instructions above)

Could you explain how you obtained the good .json parameters (camera poses)? The original nerf-synthetic .json does not contain all of this info, and I would like to replicate this pipeline on custom data (using images as input).