Hi, should the camera parameters of different scenes be preprocessed in advance so that each scene is centered in a normalized coordinate system, then trained to obtain the desired mesh, and finally multiplied by the corresponding scale to recover the original scale? Is that right?
@DongyangHuLi Yes, that's correct.
Hi, could you tell me where in the code the mesh is multiplied by the corresponding scale to restore the original scale?
And how should the camera intrinsics be adjusted for images with unequal height and width?
When I modified the camera poses following https://github.com/autonomousvision/monosdf/blob/main/preprocess/scannet_to_monosdf.py#L76, I couldn't reconstruct the correct result. I was wondering why changing the camera poses didn't affect the results in your work.
@DongyangHuLi
You can use this line to convert back to the original scale: https://github.com/autonomousvision/monosdf/blob/main/dtu_eval/evaluate_single_scene.py#L100-L101 (a small sketch of that step follows below).
You can refer to https://github.com/autonomousvision/monosdf/blob/main/code/datasets/scene_dataset.py#L181-L201 if you crop an image; the current code supports unequal height and width.
Which scenes are you testing? How did you modify them?
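For the scale conversion, a minimal sketch of what that step boils down to (file names and the `scale_mat_0` key are placeholders following the usual cameras.npz convention; the evaluation script in the repo may differ in details):

```python
import numpy as np
import trimesh

# Placeholder paths; the "scale_mat_0" key follows the usual IDR/MonoSDF cameras.npz convention.
mesh = trimesh.load("mesh_in_normalized_space.ply")
scale_mat = np.load("cameras.npz")["scale_mat_0"]   # 4x4: normalized space -> original space

mesh.apply_transform(scale_mat)                     # move the vertices back to the original coordinates
mesh.export("mesh_in_original_space.ply")
```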
@niujinshuchong Hi, thank you for your patience. I think I understand. If I convert the coordinate system to NDC, I think it will have the same effect, right?
@DongyangHuLi We haven't tried it in NDC space. You may need to convert it back to Euclidean space if you want to use the loss.
@niujinshuchong Thanks for your patient reply! I have converted it back to Euclidean space to use the loss. I used your code in https://github.com/autonomousvision/monosdf/blob/f4393ef3a2639a580890c88bc1e6959cfca84803/code/utils/plots.py#L48 to generate my mesh, but the result looks terrible. I also have some basic questions: the manipulation of images, camera poses and intrinsics in the code converts the scene into a normalized coordinate system before generating the mesh. If I want to generate a mesh for a scene, do I have to place the scene in a normalized coordinate system? I'm a little confused about this data preprocessing. Can we skip the normalization, or use some other normalization method? In my understanding, each image should correspond to its own camera pose, so why does 'world_mat @ scale_mat' still generate the correct rays? After I do the same with my data, it no longer seems to work. Thank you again for your reply!
@DongyangHuLi
The workflow is: your original data -> normalised camera poses -> training and mesh extraction in the normalised space -> convert the mesh back to the original space.
The scale_mat transforms points from the normalised space to your original data space, and the world_mat is the projection matrix K @ w2c_transform, where w2c_transform is in your original data space (see the sketch below).
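Put differently, the dataset code does not use world_mat alone: it forms the projection matrix in the normalised space and decomposes it back into intrinsics and pose. A rough sketch of that decomposition (the file layout and keys follow the IDR-style cameras.npz convention and are assumptions here):

```python
import numpy as np
import cv2

def load_K_Rt_from_P(P):
    """Decompose a 3x4 projection matrix into intrinsics and a camera-to-world pose."""
    K, R, t = cv2.decomposeProjectionMatrix(P)[:3]
    K = K / K[2, 2]
    intrinsics = np.eye(4)
    intrinsics[:3, :3] = K
    pose = np.eye(4)
    pose[:3, :3] = R.T                  # world-to-camera rotation -> camera-to-world
    pose[:3, 3] = (t[:3] / t[3])[:, 0]  # camera centre (homogeneous -> Euclidean)
    return intrinsics, pose

# world_mat = K @ w2c in the ORIGINAL space; scale_mat maps normalised -> original space,
# so P = world_mat @ scale_mat projects points that live in the normalised space.
cameras = np.load("cameras.npz")        # assumed IDR/MonoSDF-style camera file
P = (cameras["world_mat_0"] @ cameras["scale_mat_0"])[:3, :4]
intrinsics, pose = load_K_Rt_from_P(P)  # pose (and hence the rays) is now in the normalised space
```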
@niujinshuchong I see. Thank you very much! By the way, doesn't the depth need to be converted accordingly?
@DongyangHuLi We use monocular depths, which are only defined up to scale, and we compute the scale with a least-squares fit on the fly during training. But if you want to use metric depth, you need to convert the depth accordingly.
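Roughly, the per-image alignment solves a 2x2 least-squares problem for a scale and shift in closed form. A minimal sketch of the idea (the repo's own loss code may handle batching, masking and degenerate cases differently):

```python
import torch

def align_monocular_depth(mono_depth, rendered_depth, mask):
    """Closed-form scale s and shift t minimising sum(mask * (s * mono + t - rendered)**2)."""
    a00 = torch.sum(mask * mono_depth * mono_depth)
    a01 = torch.sum(mask * mono_depth)
    a11 = torch.sum(mask)
    b0 = torch.sum(mask * mono_depth * rendered_depth)
    b1 = torch.sum(mask * rendered_depth)

    det = a00 * a11 - a01 * a01
    scale = (a11 * b0 - a01 * b1) / det
    shift = (-a01 * b0 + a00 * b1) / det
    return scale, shift

# Quick sanity check with synthetic data: the true scale/shift should be recovered.
mono = torch.rand(64, 64)
rendered = 2.0 * mono + 0.3
scale, shift = align_monocular_depth(mono, rendered, torch.ones_like(mono))
print(scale.item(), shift.item())  # ~2.0, ~0.3
```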
OK, thank you very much!
@niujinshuchong Sorry, excuse me for asking one more question. If I want to use metric depth, how do I convert the depth accordingly? :)
@DongyangHuLi You just need to multiply by the scale.
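In case it helps: assuming the scale_mat maps the normalised space to the original space, "the scale" is the isotropic factor on its diagonal, and metric depth (in original units) is brought into the normalised training space by multiplying with its inverse, i.e. dividing by it. A tiny sketch with placeholder paths:

```python
import numpy as np

cameras = np.load("cameras.npz")           # assumed IDR/MonoSDF-style camera file
scale = cameras["scale_mat_0"][0, 0]       # assumes an isotropic scale on the diagonal

metric_depth = np.load("depth_0000.npy")   # metric depth map in the original units (placeholder path)
depth_for_training = metric_depth / scale  # depth consistent with the normalised space
```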
@niujinshuchong Sorry, I seem to have made a silly mistake. : ) Thank you very much.
Hi, if you normalise the scene with something like this: https://github.com/autonomousvision/monosdf/blob/main/preprocess/scannet_to_monosdf.py#L76, you can simply set scene_bounding_sphere to 1.0 or 1.1.
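For anyone following along, a rough sketch of that kind of normalisation (placeholder camera centres; the exact centring and margin factor in the preprocessing script may differ):

```python
import numpy as np

# Placeholder camera centres; in practice take pose[:3, 3] from each camera-to-world matrix.
camera_centers = np.random.default_rng(0).uniform(-2.0, 2.0, size=(10, 3))

# Centre the cameras and choose a scale so they all sit inside a sphere of radius < 1.
center = (camera_centers.min(axis=0) + camera_centers.max(axis=0)) / 2.0
radius = np.linalg.norm(camera_centers - center, axis=1).max()
scale = radius / 0.8   # margin factor is an assumption; the preprocessing script may use another value

scale_mat = np.eye(4)
scale_mat[:3, :3] *= scale
scale_mat[:3, 3] = center   # normalised -> original: x_orig = scale * x_norm + center

normalized_centers = (camera_centers - center) / scale
print(np.linalg.norm(normalized_centers, axis=1).max())  # <= 0.8 by construction

# With such a normalisation, scene_bounding_sphere can be set to 1.0 or 1.1 in the config.
```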