Closed iperper closed 1 year ago
If an image is constrained by only a few correspondences and your camera model is over-parametrized (e.g. the OPENCV model for an image without any distortion), then the pose refinement (a simple least-squares optimization of the reprojection errors) can overfit the model to explain noise in the correspondences (keypoint noise or mild outliers). This is less likely with a less expressive camera model (e.g. SIMPLE_RADIAL). COLMAP actually refines the extra params only if no image of the same camera was registered before (link). However, COLMAP always refines the extra params in the bundle adjustment (BA) (link), so the estimate improves as it triangulates more points or adds more images of the same camera.
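To make the over-parametrization argument concrete, here is a rough sketch that counts the unknowns optimized during pose refinement for a few COLMAP camera models and compares them to the number of residuals. The helper names (`num_refined_unknowns`, `redundancy`) are made up for illustration and are not part of COLMAP or pycolmap; the sketch also assumes the principal point is held fixed.

```python
# (number of focal params, number of extra/distortion params) per COLMAP model
MODEL_LAYOUT = {
    "SIMPLE_PINHOLE": (1, 0),  # f, cx, cy
    "PINHOLE": (2, 0),         # fx, fy, cx, cy
    "SIMPLE_RADIAL": (1, 1),   # f, cx, cy, k
    "RADIAL": (1, 2),          # f, cx, cy, k1, k2
    "OPENCV": (2, 4),          # fx, fy, cx, cy, k1, k2, p1, p2
}

POSE_DOF = 6  # 3 for rotation + 3 for translation


def num_refined_unknowns(model, refine_focal_length, refine_extra_params):
    """Hypothetical helper: unknowns optimized in the refinement."""
    n_focal, n_extra = MODEL_LAYOUT[model]
    unknowns = POSE_DOF
    if refine_focal_length:
        unknowns += n_focal
    if refine_extra_params:
        unknowns += n_extra
    return unknowns


def redundancy(num_correspondences, unknowns):
    """Residuals minus unknowns; each 2D-3D match gives two residuals."""
    return 2 * num_correspondences - unknowns


if __name__ == "__main__":
    for model in ("OPENCV", "SIMPLE_RADIAL", "PINHOLE"):
        n = num_refined_unknowns(model, refine_focal_length=False,
                                 refine_extra_params=True)
        print(model, "unknowns:", n, "redundancy with 8 matches:",
              redundancy(8, n))
```

The smaller the redundancy, the more freedom the optimizer has to absorb keypoint noise into the extra parameters instead of the pose.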
These are all very helpful insights, thanks!
My goal in using the OPENCV model was to express the different fx and fy (rather than to model any specific distortion), but since they are close (<1% apart), I can go for a simpler camera model.
I will be registering multiple images, and can try both of your recommendations.
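A minimal sketch of that decision, assuming the standard COLMAP parameter orders (SIMPLE_RADIAL: `f, cx, cy, k`; PINHOLE: `fx, fy, cx, cy`). The function name `choose_camera_model` and the 1% default tolerance are illustrative choices, not something COLMAP provides. PINHOLE is used as the fallback because it keeps separate `fx` and `fy` without adding any distortion terms.

```python
def choose_camera_model(fx, fy, cx, cy, rel_tol=0.01):
    """Hypothetical helper: pick a less expressive model than OPENCV.

    Returns (model_name, params) in COLMAP's parameter order.
    """
    rel_diff = abs(fx - fy) / max(fx, fy)
    if rel_diff <= rel_tol:
        # fx and fy are close enough to share one focal length;
        # the single radial coefficient starts at zero.
        f = 0.5 * (fx + fy)
        return "SIMPLE_RADIAL", [f, cx, cy, 0.0]
    # Keep separate focal lengths but carry no distortion parameters.
    return "PINHOLE", [fx, fy, cx, cy]


if __name__ == "__main__":
    print(choose_camera_model(500.0, 502.0, 320.0, 240.0))
    print(choose_camera_model(500.0, 600.0, 320.0, 240.0))
```

The returned name and parameter list could then be passed to the camera constructor in the same way as the `OPENCV` call in the original post.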
I am trying to better understand the impact of `refine_extra_params` during pose estimation.

I have set up a pipeline to localize images against a custom dataset. During localization, I create an `OPENCV` camera model because I have different `fx` and `fy` parameters in my camera intrinsics. However, I do not know the distortion parameters. Thus, I create a camera as follows:

`camera = pycolmap.Camera("OPENCV", 640, 480, [fx, fy, cx, cy, 0, 0, 0, 0])`
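For reference, the four trailing zeros are the `k1, k2, p1, p2` coefficients of the OpenCV distortion model. The pure-Python sketch below (the function `project_opencv` is written here for illustration and is not a pycolmap call) shows that with all four at zero the projection reduces to a plain pinhole projection, and that any non-zero value the refinement assigns to them shifts the projected pixel.

```python
def project_opencv(point, params):
    """Project a 3D point in camera coordinates with the OpenCV model.

    params follows COLMAP's OPENCV order: fx, fy, cx, cy, k1, k2, p1, p2.
    """
    x, y, z = point
    fx, fy, cx, cy, k1, k2, p1, p2 = params
    u, v = x / z, y / z                       # normalized image coordinates
    r2 = u * u + v * v
    radial = 1.0 + k1 * r2 + k2 * r2 * r2     # radial distortion factor
    # Radial plus tangential distortion
    du = u * radial + 2.0 * p1 * u * v + p2 * (r2 + 2.0 * u * u)
    dv = v * radial + p1 * (r2 + 2.0 * v * v) + 2.0 * p2 * u * v
    return fx * du + cx, fy * dv + cy


if __name__ == "__main__":
    point = (0.1, -0.2, 1.0)
    no_dist = [500.0, 510.0, 320.0, 240.0, 0.0, 0.0, 0.0, 0.0]
    with_k1 = [500.0, 510.0, 320.0, 240.0, 0.1, 0.0, 0.0, 0.0]
    print(project_opencv(point, no_dist))   # same as a pinhole projection
    print(project_opencv(point, with_k1))   # shifted by the radial term
```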
For localization, I create a query localizer and perform localization.

I set `refine_focal_length=False` because I'm fairly confident in the focal length, but `refine_extra_params=True` since I don't have any prior on the distortion coefficients (although I do have a prior on cx, cy).

However, this leads to very poor localization results (they can be off by hundreds of meters). If I set `refine_extra_params=False`, the localization results become acceptable.

My questions are:
`refine_extra_params`? Under what scenarios would the refinement cause issues, and when should it work better (e.g. diverse viewpoints in the retrieval images, etc.)?