MarvinChung / Orbeez-SLAM


Demo gives bad output #3

Open saimouli opened 1 year ago

saimouli commented 1 year ago

Dense reconstruction results are not good. Is this expected for the monocular case?

./build/mono_tum Vocabulary/ORBvoc.txt configs/Monocular/TUM/freiburg3_office.yaml ../../datasets/rgbd_dataset_freiburg3_long_office_household
Loading ORB Vocabulary. This could take a while...
Vocabulary loaded!

{
    "aabb_scale": 4,
    "cx": 320.1000061035156,
    "cy": 247.60000610351563,
    "fl_x": 535.4000244140625,
    "fl_y": 539.2000122070313,
    "h": 480,
    "k1": 0.0,
    "k2": 0.0,
    "offset": [
        0.5,
        0.6000000238418579,
        -0.5
    ],
    "p1": 0.0,
    "p2": 0.0,
    "scale": 0.33000001311302185,
    "w": 640
}
11:12:28 INFO     Loading NeRF dataset from
11:12:28 WARNING    Orbeez-SLAM.json does not contain any frames. Skipping.
11:12:28 WARNING  No training images were found for NeRF training (Should be using SLAM mode)!
11:12:28 INFO     Loading network config from: ./Thirdparty/instant-ngp-kf/configs/nerf/base.json
11:12:28 INFO     GridEncoding:  Nmin=16 b=1.51572 F=2 T=2^19 L=16
11:12:28 INFO     Density model: 3--[HashGrid]-->32--[FullyFusedMLP(neurons=64,layers=3)]-->1
11:12:28 INFO     Color model:   3--[Composite]-->16+16--[FullyFusedMLP(neurons=64,layers=4)]-->3
11:12:28 INFO       total_encoding_params=13074912 total_network_params=10240

Camera Parameters: 
- fx: 535.4
- fy: 539.2
- cx: 320.1
- cy: 247.6
- k1: 0
- k2: 0
- p1: 0
- p2: 0
- fps: 30
- color order: RGB (ignored if grayscale)

ORB Extractor Parameters: 
- Number of Features: 1000
- Scale Levels: 8
- Scale Factor: 1.2
- Initial Fast Threshold: 20
- Minimum Fast Threshold: 7

-------
Start processing sequence ...
Images in the sequence: 2585

Initialize failed
(the line above repeats 30 times before initialization succeeds)
[Tracking] initialize F and H success!
11:12:32 INFO     iteration=1 loss=0.0294167
[Tracking] New Map created with 150 points
[Optimizer] BundleAdjustment

demo

saimouli commented 1 year ago

MONO_TUM_rgbd_dataset_freiburg3_long_office_household_KeyFrameTrajectory

jeff999955 commented 1 year ago

Hi @saimouli, thank you for your issue.

We have tried running the program and found that the results can vary a lot between machines: the result is fair on a 2080 Ti but of high quality on a 3070 Ti. Could you provide the model of your GPU? Thanks.

jeff999955 commented 1 year ago

FYI, here's the rendered output on a 3070 Ti machine after the training phase of SLAM has completed.

image

MarvinChung commented 1 year ago

I didn't expect that in the monocular case. Both the monocular and RGB-D cases should be fine on an RTX 3090 (the GPU used in the paper). I tested the monocular case on an RTX 2080 Ti and the result looks similar to yours.

image

However, RGB-D works well on the RTX 2080 Ti.

image

You can see that the monocular loss stays at around 0.017, while the RGB-D loss converges to 0.0035 on my RTX 2080 Ti machine.

@jeff999955 told me that the monocular loss converges to 0.006 on his RTX 3070. It seems that monocular requires more training resources, but I did not expect the GPU to matter that much.

MarvinChung commented 1 year ago

MONO_TUM_rgbd_dataset_freiburg3_long_office_household_KeyFrameTrajectory

This should be normal in the monocular case, because we don't know the scale between the SLAM map and the real world. This is called "scale ambiguity" in monocular SLAM. Given two RGB images, you can estimate the rotation, but you can't recover the real translation between the two views. You can define the translation between the first two images as 1 unit and use it for the rest of the SLAM system, but you don't know whether that unit is 1 cm or 1 m in the real world. This is because the essential matrix is solved from the epipolar constraint: with $x_{1} = [u_1, v_1, 1]^{T}$ and $x_{2} = [u_2, v_2, 1]^{T}$, $x_{2}^{T} E x_{1} = 0$. No matter how the essential matrix is scaled, it still satisfies the epipolar constraint. You need depth to recover the correct scale.
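Spelling that step out (a standard epipolar-geometry argument, not anything specific to Orbeez-SLAM): the constraint is homogeneous in $E$, so any nonzero rescaling of the essential matrix satisfies it equally well, and since $E = [t]_{\times} R$, the magnitude of the translation $t$ is left undetermined:

$$
\begin{aligned}
x_{2}^{T} E x_{1} &= 0 \\
x_{2}^{T} (\lambda E)\, x_{1} &= \lambda \left( x_{2}^{T} E x_{1} \right) = 0 \qquad \text{for any } \lambda \neq 0 .
\end{aligned}
$$

A depth sensor (RGB-D) or a known baseline fixes $\lambda$ and hence the metric scale.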

saimouli commented 1 year ago

(two screenshots attached: oie_1doeKxVlN6Dk, oie_KfyzNh870njO)

Tried the RGB-D and mono demos on an NVIDIA RTX A4500 and could not get the scene to render in instant-ngp. I wonder why the GPU would matter that much!

However, instant-ngp and NeRF-SLAM both run normally on this GPU.

MarvinChung commented 1 year ago

I would have thought an RTX A4500 is good enough. Did you change your GPU without rebuilding the project? The compilation is GPU dependent, so you need to remove the build directory and run cmake again (see the sketch below). If that is not the case, I will check the code in my forked instant-ngp; it might be that my fork is not up to date.
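For reference, a minimal rebuild sketch after swapping GPUs; it assumes a standard out-of-source CMake build in `./build` (matching the `./build/mono_tum` path in the log above), not the project's exact build script:

```bash
# Assumes a plain CMake + CUDA build in ./build; adjust to the project's own build steps.
rm -rf build                      # drop binaries compiled for the old GPU architecture
cmake -S . -B build               # reconfigure so the CUDA kernels target the new GPU
cmake --build build -j"$(nproc)"  # recompile everything from scratch
```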

MarvinChung commented 1 year ago

Hi @saimouli. Did you solve the problem? Was it related to rebuilding the project?

saimouli commented 1 year ago

I am still facing the same issue, and I don't think rebuilding is the cause. I have built the entire project and still cannot get proper output when I run the ./run_fox script on your instant-ngp branch.

xmlyqing00 commented 11 months ago

Hi,

I got similarly bad results on a 4090. I tried both a local installation and Docker, and I also tried on a 3090 machine; I got similarly blurry results in the demo video. The loss gets stuck at 0.15. Do the authors have any idea how to set the parameters?

image

Thanks, Yongqing

chenxu-tang commented 10 months ago

Hi, I also encountered similarly bad results when running on an RTX A6000.

image

I noticed the following warning during runtime and am not sure whether this kind of problem is caused by it. Do the authors also see this warning? How can I eliminate it?

image

Thanks

MarvinChung commented 10 months ago

The warning is normal at the beginning. I have updated Docker.md, and you may need to modify the Dockerfile to make it compatible with your CUDA version and GPU architecture.
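As a rough illustration (the query field requires a recent NVIDIA driver, and the image tag below is a placeholder, not the repo's actual name): check which compute capability your GPU reports, make sure the Dockerfile's CUDA base image and architecture list cover it, then rebuild the image.

```bash
# Illustration only: confirm the GPU's compute capability (e.g. 8.6 for an RTX A6000),
# edit the Dockerfile's CUDA base image / architecture list to match, then rebuild.
nvidia-smi --query-gpu=name,compute_cap --format=csv
docker build -t orbeez-slam .     # "orbeez-slam" is a placeholder image tag
```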