InternLandMark / LandMark

Data loader for mill19 dataset. #8

Open wzpscott opened 1 year ago

wzpscott commented 1 year ago

Thank you for your excellent work. I would like to know whether LandMark is compatible with existing large-scale NeRF datasets, such as Mill19. I noticed that the original GridNeRF work was evaluated on the Mill19 dataset. Would it be possible to release the dataloader so that we can use the provided camera poses instead of having to recompute them with COLMAP?

kam1107 commented 1 year ago

Hi, we used ContextCapture to extract poses directly, which differs from MegaNeRF, which uses PixSFM (with key points and bundle adjustment). We would still recommend using their provided poses so that your experiment results are aligned with theirs. Nevertheless, here's a link to our extracted poses of the rubble scene: https://drive.google.com/file/d/15aQpVGDu7Lun9j4HJvpUZtnlTzEkUvQq/view?usp=sharing Alternatively, you can extract poses on your own with other software (not limited to ContextCapture).

Liumouliu commented 1 year ago

Thank you for sharing. Could you please also share the conf.txt and dataloader.py? Thank you very much!

Liumouliu commented 1 year ago

Hi, I did a quick experiment on the Mill19 dataset using the uploaded ContextCapture pose.

The most important variables in the configuration file are set as follows:

lb = [-34, -50, -0.8]
ub = [33, 49, 0.5]
downsample_train = 8
train_near_far = [1e-1, 1000]
render_near_far = [2e-1, 1000]

Training goes smoothly. However, I noticed something strange:

[rubble-pixsfm-all][rubble] Iter 005210: psnr=19.78 test=7.11 mse=0.010: 10%|███████

The validation PSNR is much lower than the training PSNR.
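
For reference, the psnr and mse values in the log line are roughly consistent with the usual definition PSNR = 10 * log10(max_val^2 / MSE) for pixel values in [0, 1]. The sketch below assumes that convention; how LandMark itself computes the metric is not confirmed in this thread, and `psnr_from_mse` and `mse_from_psnr` are hypothetical helper names.

```python
import math


def psnr_from_mse(mse: float, max_val: float = 1.0) -> float:
    """PSNR in dB for a mean squared error, assuming pixels lie in [0, max_val]."""
    return 10.0 * math.log10(max_val ** 2 / mse)


def mse_from_psnr(psnr_db: float, max_val: float = 1.0) -> float:
    """Inverse of psnr_from_mse: the MSE that yields the given PSNR."""
    return max_val ** 2 / (10.0 ** (psnr_db / 10.0))


# An MSE of 0.010 (as printed in the log, after rounding) gives 20 dB,
# close to the logged training psnr of 19.78.
print(round(psnr_from_mse(0.010), 2))

# A test PSNR of 7.11 dB corresponds to an MSE of roughly 0.19,
# i.e. an error about twenty times larger than on the training views.
print(round(mse_from_psnr(7.11), 3))
```

Under this assumption, the gap between 19.78 and 7.11 is a difference in error of more than an order of magnitude, not a small fluctuation.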

However, when I added the validation images back to the training set, the validation PSNR became comparable to the training PSNR.

I reserved the official validation images for validation:

000000.jpg 000083.jpg 000166.jpg 000249.jpg 000332.jpg 000415.jpg 000498.jpg 000581.jpg 000664.jpg 000747.jpg 000830.jpg 000913.jpg 000996.jpg 001079.jpg 001162.jpg 001245.jpg 001328.jpg 001411.jpg 001494.jpg 001577.jpg 001660.jpg
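
The file names listed above are evenly spaced: every 83rd image from 000000.jpg to 001660.jpg. A small sketch of how such a split could be reproduced follows; `held_out_names` and `split_by_name` are hypothetical helpers, not part of LandMark or the Mill19 tooling, and the paths in the usage lines are made up for illustration.

```python
import os


def held_out_names(step=83, last=1660):
    """Six-digit .jpg names for indices 0, step, 2*step, ..., last."""
    return [f"{i:06d}.jpg" for i in range(0, last + 1, step)]


def split_by_name(paths, held_out):
    """Split image paths into (train, val) by matching on the base file name."""
    held = set(held_out)
    train = [p for p in paths if os.path.basename(p) not in held]
    val = [p for p in paths if os.path.basename(p) in held]
    return train, val


names = held_out_names()
print(len(names), names[0], names[-1])

# Illustrative paths only.
train, val = split_by_name(["rgbs/000000.jpg", "rgbs/000001.jpg"], names)
print(train, val)
```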

Would you please give me some suggestions? Thank you very much!

kam1107 commented 1 year ago

Hi, can you explain how the lower and upper bounds of the height were decided? The cameras seem to be scattered around 0 in the height dimension, so [-0.8, 0.5] might be too wide. Also, since the scene is always lower than all the cameras, you can try setting the upper bound of the scene height to be lower than the lowest camera.
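
One way to check this suggestion is to compare the height bounds against the camera heights. The sketch below is only an illustration: it assumes the third coordinate is the height axis (as the lb/ub values suggest), `height_report` is a hypothetical helper, and the six positions are copied (rounded) from the pose excerpt quoted later in this thread; a real check would load every camera.

```python
import numpy as np

# Camera positions (x, y, z) for cameras 1314-1319, rounded from the pose excerpt.
positions = np.array([
    [-12.6341, -7.3641, 0.0657],
    [-14.3330, -7.4398, 0.0649],
    [-15.9740, -7.5064, 0.0773],
    [-17.6773, -7.5831, 0.0821],
    [-19.3724, -7.6806, 0.0795],
    [-21.0365, -7.7375, 0.0683],
])

lb = [-34, -50, -0.8]
ub = [33, 49, 0.5]


def height_report(positions, lb, ub, axis=2):
    """Summarize camera heights against the scene bounds along one axis."""
    h = positions[:, axis]
    return {
        "camera_min": float(h.min()),
        "camera_max": float(h.max()),
        "scene_range": (lb[axis], ub[axis]),
        # True when the scene's upper bound sits below every camera.
        "ub_below_lowest_camera": bool(ub[axis] < h.min()),
    }


report = height_report(positions, lb, ub)
print(report)
```

For these six cameras the upper bound of 0.5 lies above the lowest camera height, so the flag is False for the configuration in question.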

wzpscott commented 1 year ago

Hi, I was able to successfully load the original poses from the mill19 dataset and conduct some tests. Unfortunately, the outcomes were not as good as those reported in GridNeRF. When using solely the grid branch on the Rubble dataset, the PSNR score was 20.06, and adding the NeRF branch only resulted in a minor improvement (21.19 PSNR). Could you please provide me with guidance on how to replicate the GridNeRF results?

wzpscott commented 1 year ago

It is possible that the height dimension bounds you have set are too narrow, resulting in overfitting on the training set as the ground is not included. To address this, you could widen the z-axis range (e.g., [-30, 30]).

Liumouliu commented 1 year ago

Yes, thank you very much!

wecodeyes commented 1 year ago

It seems that the position columns of the rotation matrices in the shared pose file are the same. Is this normal?

"1314": {
    "path": "E:/rubble/train/rgbs/001331.jpg",
    "rot_mat": [
        [
            0.045136084254280166,
            -0.7099405584673029,
            0.7028137287655427,
            -12.6340647376608,
            3456.0
        ],
        [
            0.9989712379690988,
            0.028990488536018152,
            -0.03487143938710834,
            -7.36406876444701,
            4608.0
        ],
        [
            0.004381735806308757,
            0.703664660912837,
            0.7105187157096889,
            0.0657335308333105,
            2984.95154849617
        ]
    ]
},
"1315": {
    "path": "E:/rubble/train/rgbs/001332.jpg",
    "rot_mat": [
        [
            0.04663319548118687,
            -0.709640953119179,
            0.7030185365517165,
            -14.3329837574279,
            3456.0
        ],
        [
            0.9989073441420915,
            0.030961383288863525,
            -0.03500729300921302,
            -7.43981419185956,
            4608.0
        ],
        [
            0.003076182407826644,
            0.7038828811677013,
            0.7103093880140181,
            0.0648738259997898,
            2984.95154849617
        ]
    ]
},
"1316": {
    "path": "E:/rubble/train/rgbs/001333.jpg",
    "rot_mat": [
        [
            0.04929312637002642,
            -0.7095264731325632,
            0.702952609794384,
            -15.9739779301752,
            3456.0
        ],
        [
            0.998783917932857,
            0.03435894241843096,
            -0.035357437047669894,
            -7.50638625935151,
            4608.0
        ],
        [
            0.0009343293646288436,
            0.7038406403440718,
            0.7103572904029954,
            0.0772805686825144,
            2984.95154849617
        ]
    ]
},
"1317": {
    "path": "E:/rubble/train/rgbs/001334.jpg",
    "rot_mat": [
        [
            0.04842044678243421,
            -0.7100559533251308,
            0.7024784718985548,
            -17.6773318552157,
            3456.0
        ],
        [
            0.9988220323020153,
            0.03219387247824789,
            -0.03630567948581279,
            -7.58306946748191,
            4608.0
        ],
        [
            0.003163561515398933,
            0.7034089121715712,
            0.7107783720374052,
            0.0821205802823067,
            2984.95154849617
        ]
    ]
},
"1318": {
    "path": "E:/rubble/train/rgbs/001335.jpg",
    "rot_mat": [
        [
            0.049653677871439095,
            -0.7101714759485133,
            0.702275577692226,
            -19.3724312825489,
            3456.0
        ],
        [
            0.9987652602056648,
            0.03420030109531353,
            -0.036031852732010206,
            -7.68064562290361,
            4608.0
        ],
        [
            0.0015707578268917151,
            0.7031975640985256,
            0.7109927697000501,
            0.0795181434346097,
            2984.95154849617
        ]
    ]
},
"1319": {
    "path": "E:/rubble/train/rgbs/001336.jpg",
    "rot_mat": [
        [
            0.052259862181325424,
            -0.7098704530403884,
            0.7023908076740635,
            -21.0364818665215,
            3456.0
        ],
        [
            0.9986325381122817,
            0.03616226010281046,
            -0.037753738459649605,
            -7.73754315766592,
            4608.0
        ],
        [
            0.0014002243433872375,
            0.7034033201830162,
            0.7107895669797761,
            0.068339247130213,
            2984.95154849617
        ]
    ]
},

eveneveno commented 11 months ago

Hi, in the 3x5 matrix shown here, the camera position is the 4th column, which differs from camera to camera. Are you perhaps looking at the last column, which holds the hwf (height, width, focal length) info?
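
The layout described above can be sketched as follows. This is a minimal illustration that assumes the first three columns are the rotation, the 4th is the camera position, and the 5th is (height, width, focal) in that order; `split_pose` is a hypothetical helper, and the two matrices are the "1314" and "1315" entries from the excerpt in this thread.

```python
import numpy as np

rot_mat_1314 = [
    [0.045136084254280166, -0.7099405584673029, 0.7028137287655427, -12.6340647376608, 3456.0],
    [0.9989712379690988, 0.028990488536018152, -0.03487143938710834, -7.36406876444701, 4608.0],
    [0.004381735806308757, 0.703664660912837, 0.7105187157096889, 0.0657335308333105, 2984.95154849617],
]
rot_mat_1315 = [
    [0.04663319548118687, -0.709640953119179, 0.7030185365517165, -14.3329837574279, 3456.0],
    [0.9989073441420915, 0.030961383288863525, -0.03500729300921302, -7.43981419185956, 4608.0],
    [0.003076182407826644, 0.7038828811677013, 0.7103093880140181, 0.0648738259997898, 2984.95154849617],
]


def split_pose(rot_mat):
    """Split a 3x5 pose into (3x3 rotation, position, (height, width, focal))."""
    m = np.asarray(rot_mat, dtype=float)
    if m.shape != (3, 5):
        raise ValueError(f"expected a 3x5 matrix, got shape {m.shape}")
    rotation = m[:, :3]   # columns 1-3
    position = m[:, 3]    # column 4: differs per camera
    height, width, focal = m[:, 4]  # column 5: shared intrinsics (assumed order)
    return rotation, position, (float(height), float(width), float(focal))


R0, t0, hwf0 = split_pose(rot_mat_1314)
R1, t1, hwf1 = split_pose(rot_mat_1315)

print("positions differ:", not np.allclose(t0, t1))
print("hwf identical:", hwf0 == hwf1)
print("rotation orthonormal:", np.allclose(R0 @ R0.T, np.eye(3), atol=1e-5))
```

Only the last column is identical between the two cameras, which matches the explanation that it stores image size and focal length, not position.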

zhywanna commented 7 months ago

@Liumouliu Could you please send me a copy of the transforms_train.json, transforms_test.json, and confs.txt you used on the rubble dataset so that I can reproduce your results? It would be really helpful. Thank you very much.

zhywanna commented 7 months ago

How should appropriate dimension bounds (lb and ub) be set?