facebookresearch / co3d

Tooling for the Common Objects In 3D dataset.

Camera focal length for the test set different from dev set #52

Closed zhizdev closed 2 years ago

zhizdev commented 2 years ago

Thanks for the great work on CO3Dv2 and also for answering so many questions on GitHub!

I recently noticed that, within the fewview_test subset, the focal length of the 1st camera (the novel view we want to render) is drastically different from that of the context cameras (e.g. 1.8 vs. 3.7).

Meanwhile, within the fewview_dev subset, the focal lengths of the 1st camera and of the context cameras are roughly in the same range (e.g. 2.6 vs. 2.8).

What is the underlying mechanism behind this phenomenon?

Looking at the sample submission code with DBIR, I see that we render a cropped image and then paste the crop onto the original image with paste_render_to_original_image. Therefore, when I run python example_co3d_challenge_submission.py on fewview_dev, it first renders the cropped 800x800 image. This behavior appears to be in line with the focal lengths for the fewview_dev subset, but not for the fewview_test subset.

Thanks!

davnov134 commented 2 years ago

Hi, the focal lengths within a scene are different because we let COLMAP extract an image-specific (as opposed to a camera-specific) focal length. The main reason is that the consumer smartphones used to capture the videos often automatically change focus during the capture, which effectively leads to a varying focal length across the video frames.

As for example_co3d_challenge_submission.py, we wrote several unit tests that ensure the focal length is correctly adjusted when cropping the image around a segmentation mask. The algorithm does not handle samples from different sets (fewview_dev/fewview_test) differently, so there should not be an issue.

Please note that, for the test subsets (i.e. fewview_test/manyview_test), all depth maps and all test images are redacted (i.e. these are blank black images). Hence, example_co3d_challenge_submission.py produces nonsensical results for fewview_test/manyview_test, since the DBIR renderer generates an invalid point cloud from the redacted depth maps of the source views.
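As a rough illustration of the point above, a guard like the following could be used to detect redacted inputs before attempting to build a point cloud from them. This is only a sketch: is_redacted is a hypothetical helper, not part of the co3d tooling, and it assumes that a redacted image or depth map is loaded as an array containing only zeros.

```python
import numpy as np


def is_redacted(array: np.ndarray) -> bool:
    """Return True if the array is entirely zero.

    Assumption: redacted test images and depth maps load as all-black
    (all-zero) arrays. This helper is hypothetical, not a co3d function.
    """
    return not np.any(array)
```

A caller could skip depth-based rendering for any source view whose depth map satisfies is_redacted, instead of passing it on to the renderer.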

zhizdev commented 2 years ago

Hi David, thanks for the response! I really appreciate it!

It took a while for me to look at the example_co3d_challenge_submission.py and see how the query image is rendered and pasted onto the original image.

Some notes in case they are helpful for future readers:

1) When we load PyTorch3D cameras and cropped images with the training data loader, the cameras' intrinsic parameters are adjusted according to the object bounding box detected from the foreground mask.

2) When we load the fewview_dev/fewview_test sets, the query image is not given; thus, we are given unadjusted intrinsic parameters. This may result in a different focal length from the one we are used to seeing for views with a box crop.
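The size of this effect can be sketched with a little arithmetic. The snippet below is not the co3d implementation; it assumes that the focal length is expressed in isotropic NDC units, where the shorter image side spans [-1, 1], and that the crop is taken without any change of pixel scale. The function name and arguments are hypothetical.

```python
def ndc_focal_after_crop(focal_ndc, image_size_hw, crop_size_hw):
    """Re-express an NDC focal length for a cropped image.

    Assumption: isotropic NDC, i.e. focal_ndc = 2 * focal_px / min(H, W).
    Cropping keeps the focal length in pixels unchanged, so only the
    normalising side length changes. Hypothetical helper, not a co3d API.
    """
    full_min_side = min(image_size_hw)
    crop_min_side = min(crop_size_hw)
    # Convert to pixels using the full image, then back to NDC using the crop.
    focal_px = focal_ndc * full_min_side / 2.0
    return 2.0 * focal_px / crop_min_side
```

For example, with made-up sizes, a focal length of 1.8 on a 1000x800 frame becomes roughly 3.69 after a 400x390 crop, which is the same order of difference as the 1.8 vs. 3.7 gap reported at the top of this thread. A uniform resize of the crop afterwards would not change the NDC value under this assumption.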