Hi, should the camera parameters of different scenes be preprocessed in advance so that each scene is centered in a normalized coordinate system, then trained to obtain the desired mesh, and finally multiplied by the corresponding scale to recover the original scale? Is that right?
@DongyangHuLi Yes, that's correct.
Hi, could you tell me where in the code the mesh is multiplied by the corresponding scale to restore the original scale?
And how should the camera intrinsics be adjusted for images with unequal height and width?
When I modified the camera poses following https://github.com/autonomousvision/monosdf/blob/main/preprocess/scannet_to_monosdf.py#L76, I couldn't reconstruct the correct result. I was wondering why changing the camera poses didn't affect the results in your work.
@DongyangHuLi
You can use this line to convert back to the original scale: https://github.com/autonomousvision/monosdf/blob/main/dtu_eval/evaluate_single_scene.py#L100-L101 (a small sketch of that step follows below).
You can refer to https://github.com/autonomousvision/monosdf/blob/main/code/datasets/scene_dataset.py#L181-L201 if you crop an image; the current code supports unequal height and width.
Which scenes are you testing? How did you modify them?
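For the scale conversion, a minimal sketch of what that step boils down to (file names and the `scale_mat_0` key are placeholders following the usual cameras.npz convention; the evaluation script in the repo may differ in details):

```python
import numpy as np
import trimesh

# Placeholder paths; the "scale_mat_0" key follows the usual IDR/MonoSDF cameras.npz convention.
mesh = trimesh.load("mesh_in_normalized_space.ply")
scale_mat = np.load("cameras.npz")["scale_mat_0"]   # 4x4: normalized space -> original space

mesh.apply_transform(scale_mat)                     # move the vertices back to the original coordinates
mesh.export("mesh_in_original_space.ply")
```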
@niujinshuchong Hi, thank you for your patience. I think I understand. If I convert the coordinate system to NDC, I think it will have the same effect, right?
@DongyangHuLi We haven't tried it in NDC space. You may need to convert it back to Euclidean space if you want to use the loss.
@niujinshuchong Thanks for your patient reply! I have converted it back to Euclidean space to use the loss. I used your code in https://github.com/autonomousvision/monosdf/blob/f4393ef3a2639a580890c88bc1e6959cfca84803/code/utils/plots.py#L48 to generate my mesh, but the result looks terrible. I also have some basic questions: the manipulation of images, camera poses and intrinsics in the code converts the scene into a normalized coordinate system before generating the mesh. If I want to generate a mesh for a scene, do I have to place the scene in a normalized coordinate system? I'm a little confused about this data preprocessing. Can we skip the normalization, or use some other normalization method? In my understanding, each image should correspond to its own camera pose, so why does 'world_mat @ scale_mat' still generate the correct rays? After I do the same with my data, it no longer seems to work. Thank you again for your reply!
@DongyangHuLi
The workflow is: your original data -> normalised camera poses -> training and mesh extraction in the normalised space -> convert the mesh back to the original space.
The scale_mat transforms points from the normalised space to your original data space, and the world_mat is the projection matrix K @ w2c_transform, where w2c_transform is in your original data space (see the sketch below).
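Put differently, the dataset code does not use world_mat alone: it forms the projection matrix in the normalised space and decomposes it back into intrinsics and pose. A rough sketch of that decomposition (the file layout and keys follow the IDR-style cameras.npz convention and are assumptions here):

```python
import numpy as np
import cv2

def load_K_Rt_from_P(P):
    """Decompose a 3x4 projection matrix into intrinsics and a camera-to-world pose."""
    K, R, t = cv2.decomposeProjectionMatrix(P)[:3]
    K = K / K[2, 2]
    intrinsics = np.eye(4)
    intrinsics[:3, :3] = K
    pose = np.eye(4)
    pose[:3, :3] = R.T                  # world-to-camera rotation -> camera-to-world
    pose[:3, 3] = (t[:3] / t[3])[:, 0]  # camera centre (homogeneous -> Euclidean)
    return intrinsics, pose

# world_mat = K @ w2c in the ORIGINAL space; scale_mat maps normalised -> original space,
# so P = world_mat @ scale_mat projects points that live in the normalised space.
cameras = np.load("cameras.npz")        # assumed IDR/MonoSDF-style camera file
P = (cameras["world_mat_0"] @ cameras["scale_mat_0"])[:3, :4]
intrinsics, pose = load_K_Rt_from_P(P)  # pose (and hence the rays) is now in the normalised space
```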
@niujinshuchong I see. Thank you very much! By the way, doesn't the depth need to be converted accordingly?
@DongyangHuLi We use monocular depths, which are only defined up to scale, and we compute the scale with a least-squares fit on the fly during training. But if you want to use metric depth, you need to convert the depth accordingly.
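Roughly, the per-image alignment solves a 2x2 least-squares problem for a scale and shift in closed form. A minimal sketch of the idea (the repo's own loss code may handle batching, masking and degenerate cases differently):

```python
import torch

def align_monocular_depth(mono_depth, rendered_depth, mask):
    """Closed-form scale s and shift t minimising sum(mask * (s * mono + t - rendered)**2)."""
    a00 = torch.sum(mask * mono_depth * mono_depth)
    a01 = torch.sum(mask * mono_depth)
    a11 = torch.sum(mask)
    b0 = torch.sum(mask * mono_depth * rendered_depth)
    b1 = torch.sum(mask * rendered_depth)

    det = a00 * a11 - a01 * a01
    scale = (a11 * b0 - a01 * b1) / det
    shift = (-a01 * b0 + a00 * b1) / det
    return scale, shift

# Quick sanity check with synthetic data: the true scale/shift should be recovered.
mono = torch.rand(64, 64)
rendered = 2.0 * mono + 0.3
scale, shift = align_monocular_depth(mono, rendered, torch.ones_like(mono))
print(scale.item(), shift.item())  # ~2.0, ~0.3
```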
OK, thank you very much!
@niujinshuchong Sorry, excuse me for asking one more question. If I want to use metric depth, how do I convert the depth accordingly? :)
@DongyangHuLi You just need to multiply by the scale.
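In case it helps: assuming the scale_mat maps the normalised space to the original space, "the scale" is the isotropic factor on its diagonal, and metric depth (in original units) is brought into the normalised training space by multiplying with its inverse, i.e. dividing by it. A tiny sketch with placeholder paths:

```python
import numpy as np

cameras = np.load("cameras.npz")           # assumed IDR/MonoSDF-style camera file
scale = cameras["scale_mat_0"][0, 0]       # assumes an isotropic scale on the diagonal

metric_depth = np.load("depth_0000.npy")   # metric depth map in the original units (placeholder path)
depth_for_training = metric_depth / scale  # depth consistent with the normalised space
```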
@niujinshuchong Sorry, I seem to have made a silly mistake. : ) Thank you very much.
Hi, if you normalise the scene with something like this: https://github.com/autonomousvision/monosdf/blob/main/preprocess/scannet_to_monosdf.py#L76, you can simply set scene_bounding_sphere to 1.0 or 1.1.
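For anyone following along, a rough sketch of that kind of normalisation (placeholder camera centres; the exact centring and margin factor in the preprocessing script may differ):

```python
import numpy as np

# Placeholder camera centres; in practice take pose[:3, 3] from each camera-to-world matrix.
camera_centers = np.random.default_rng(0).uniform(-2.0, 2.0, size=(10, 3))

# Centre the cameras and choose a scale so they all sit inside a sphere of radius < 1.
center = (camera_centers.min(axis=0) + camera_centers.max(axis=0)) / 2.0
radius = np.linalg.norm(camera_centers - center, axis=1).max()
scale = radius / 0.8   # margin factor is an assumption; the preprocessing script may use another value

scale_mat = np.eye(4)
scale_mat[:3, :3] *= scale
scale_mat[:3, 3] = center   # normalised -> original: x_orig = scale * x_norm + center

normalized_centers = (camera_centers - center) / scale
print(np.linalg.norm(normalized_centers, axis=1).max())  # <= 0.8 by construction

# With such a normalisation, scene_bounding_sphere can be set to 1.0 or 1.1 in the config.
```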