facebookresearch / MCC

Multiview Compressive Coding for 3D Reconstruction

Camera intrinsics #12

Open stalkerrush opened 1 year ago

stalkerrush commented 1 year ago

Hi,

Since the paper unprojects RGB-D images to point clouds, I assume camera intrinsics are required during inference. But it seems this project does not rely on that information. Is the underlying assumption an orthographic camera?

Thank you!

chaoyuaw commented 1 year ago

Hi @stalkerrush, that's a great question! In cases where camera intrinsics are available, e.g. the iPhone capture examples, we can just use them for unprojection. In cases where we don't have the intrinsics, e.g. the DALL·E 2 and ImageNet examples, we just assume some default camera, whose intrinsics are likely not the true ones, so the unprojection is possibly suboptimal. However, we see that MCC still reconstructs the overall shapes reasonably well. Knowing the intrinsics could potentially improve results further.
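
For readers following along, here is a minimal sketch of pinhole unprojection with a fallback camera. It is not taken from the MCC code: the function names `unproject_depth` and `default_intrinsics`, the 60-degree field of view, and the centered principal point are all illustrative assumptions, and the thread does not state which default camera MCC actually uses.

```python
import numpy as np


def unproject_depth(depth, fx, fy, cx, cy):
    """Lift an (H, W) depth map to an (H, W, 3) point map with a pinhole model.

    Pixel (u, v) with depth z maps to ((u - cx) * z / fx, (v - cy) * z / fy, z).
    """
    h, w = depth.shape
    # u indexes columns (x direction), v indexes rows (y direction).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)


def default_intrinsics(h, w, fov_deg=60.0):
    """Hypothetical fallback camera for images without known intrinsics.

    Assumes square pixels, a principal point at the image center, and a
    horizontal field of view of `fov_deg` degrees (an arbitrary guess).
    """
    f = 0.5 * w / np.tan(np.radians(fov_deg) / 2.0)
    return f, f, w / 2.0, h / 2.0


# Usage: a flat wall 2 units away, seen through the assumed default camera.
depth = np.full((4, 4), 2.0)
fx, fy, cx, cy = default_intrinsics(*depth.shape)
points = unproject_depth(depth, fx, fy, cx, cy)
```

If the assumed focal length is off, the recovered points are stretched or squeezed in x and y relative to depth, which is one way to picture why a default camera can make the unprojection suboptimal while leaving the overall shape recognizable.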

stalkerrush commented 1 year ago

Thanks for the clarification, @chaoyuaw! Another question: it seems the decoder can be treated as a conditional implicit occupancy function. If so, could one directly run marching cubes to obtain a mesh reconstruction? In another thread, I saw that the visualizations are done by rendering the point cloud.
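
The idea in this question could be sketched roughly as follows: query an occupancy function on a dense grid, then extract the 0.5 level set with marching cubes. This is only an illustration under stated assumptions. `toy_occupancy` is a hypothetical stand-in (a soft sphere), not the MCC decoder, and the helper names, grid resolution, and bounds are made up for the example. The mesh step relies on `skimage.measure.marching_cubes` from scikit-image, which must be installed separately.

```python
import numpy as np


def toy_occupancy(points):
    """Hypothetical stand-in for a decoder: soft occupancy of a sphere of radius 0.5."""
    dist = np.linalg.norm(points, axis=-1)
    return 1.0 / (1.0 + np.exp((dist - 0.5) * 40.0))


def occupancy_grid(occupancy_fn, resolution=32, bound=1.0):
    """Evaluate occupancy_fn on a regular grid covering [-bound, bound]^3."""
    axis = np.linspace(-bound, bound, resolution)
    xs, ys, zs = np.meshgrid(axis, axis, axis, indexing="ij")
    queries = np.stack([xs, ys, zs], axis=-1).reshape(-1, 3)
    occ = occupancy_fn(queries).reshape(resolution, resolution, resolution)
    voxel_size = axis[1] - axis[0]
    return occ, voxel_size


def extract_mesh(occ, voxel_size, bound=1.0, level=0.5):
    """Run marching cubes on the grid and map vertices back to world coordinates."""
    from skimage import measure  # requires scikit-image

    verts, faces, _normals, _values = measure.marching_cubes(
        occ, level=level, spacing=(voxel_size,) * 3
    )
    # marching_cubes returns vertices measured from the grid origin (index 0),
    # so shift by -bound to recover world coordinates.
    return verts - bound, faces


occ, voxel_size = occupancy_grid(toy_occupancy, resolution=16)
# verts, faces = extract_mesh(occ, voxel_size)  # uncomment if scikit-image is available
```

Whether this works well with a real decoder would depend on how sharp and consistent its predicted occupancy is at the chosen threshold and on the grid resolution, neither of which this thread settles.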