mprib / caliscope

Multicamera Calibration + Pose Estimation --> Open Source Motion Capture
https://mprib.github.io/caliscope/
BSD 2-Clause "Simplified" License

Extract calibration points from video automatically #584

Closed rgov closed 6 months ago

rgov commented 6 months ago

I have to press "Add Grid" repeatedly on frames where my CharuCo grid is in the image. Can't this be done automatically?

mprib commented 6 months ago

I see your point. One possible, though more computationally intensive, approach is to use RANSAC. Still, everything is going to be limited by the quality of the input data, by which I mean the breadth of board positions relative to the camera and how much foreshortening the board shows.

To keep things simpler and more tractable, the user could autopopulate a sample of grids based on a target grid count and a threshold for how many corners have to be tracked (say, 40 boards need to be selected, but a board only qualifies if 80% of it is in view). These boards could be selected evenly in time across the recording. The functionality for adding a single board would then become a way to backfill this automated approach with select board views.
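The sampling strategy described above could be sketched roughly as follows. This is a minimal illustration, not the actual caliscope implementation; the function name and inputs (per-frame corner counts) are assumptions for the example:

```python
def autopopulate_grids(corner_counts, total_corners, target_count=40, min_fraction=0.8):
    """Pick up to `target_count` frame indices, spread evenly in time,
    keeping only frames where at least `min_fraction` of the board's
    corners were tracked.

    corner_counts: list of detected-corner counts, one entry per video frame.
    total_corners: number of corners on the full board.
    """
    # Frames that pass the corner-coverage threshold.
    eligible = [i for i, n in enumerate(corner_counts)
                if n / total_corners >= min_fraction]
    if len(eligible) <= target_count:
        return eligible
    if target_count == 1:
        return [eligible[0]]
    # Sample evenly across the eligible frames so boards span the recording.
    step = (len(eligible) - 1) / (target_count - 1)
    return [eligible[round(k * step)] for k in range(target_count)]
```

Manually added boards would then simply be appended to whatever this automated pass returns.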

Does this address the concern? Do you have other thoughts on approaches to take?

Thank you for the feedback!

mprib commented 6 months ago

First Commit:

Initial autopopulation capabilities implemented; Checklist of changes still to make:

mprib commented 6 months ago

@rgov ,

I'm about to merge back in this autopopulate capability:

https://youtu.be/b-NgQqZDyjA

The GUI is getting a bit cluttered at this point, but I'm holding off on trying to make things pretty until they stabilize.

I appreciate the feedback. Things that seem obvious in retrospect are hard to see without an extra set of eyes. If you have any other thoughts (or find some bug/crashing with the new behavior), please just let me know.

rgov commented 6 months ago

Thanks for the quick turnaround! I ended up writing my own stereo calibration script for my data set in which I did automatic CharuCo corner extraction from all the frames. I did note a few things:

1) If all the corners are collinear, there is an inscrutable assertion failure in OpenCV. I filed the bug at https://github.com/opencv/opencv/issues/24676, and the comments there include a workaround for rejecting collinear corners.

2) The intrinsic calibration for one of my cameras appeared to work but then always led to a very bad extrinsic calibration. I am not sure the root cause ... I ended up working around it by simply copying the other camera's intrinsics (since they ought to be very similar).

The `stereoCalibrateExtended()` function, if you're using it, can return the per-view error, so it could possibly be used to reject frames that are outliers relative to the others... Or possibly you would do the calibration yourself using more sophisticated methods that work for multicamera setups.
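The outlier-rejection idea could look something like this. The sketch below assumes you already have the `perViewErrors` array that `cv2.stereoCalibrateExtended` returns (shape `(nViews, 2)`, one column per camera); the robust median-based threshold is my own choice for the example, not something the thread prescribes:

```python
import numpy as np

def keep_inlier_views(per_view_errors, thresh=3.5):
    """Return indices of calibration views whose reprojection error is not
    a robust (modified z-score) outlier; drop the rest and re-calibrate.

    per_view_errors: 1-D array of per-view RMS errors, or the (nViews, 2)
    `perViewErrors` output of cv2.stereoCalibrateExtended.
    """
    errs = np.asarray(per_view_errors, dtype=float)
    if errs.ndim == 2:
        errs = errs.mean(axis=1)  # average the two cameras' errors per view
    med = np.median(errs)
    mad = np.median(np.abs(errs - med))  # median absolute deviation
    if mad == 0:
        return list(range(len(errs)))
    modified_z = 0.6745 * (errs - med) / mad
    return [i for i, z in enumerate(modified_z) if abs(z) <= thresh]
```

A median-based score is used because a single very bad frame inflates the mean and standard deviation enough that a plain z-score test can fail to flag it.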

mprib commented 6 months ago

@rgov ,

I hope that pyxy3d can still be of some use for your project even after rolling your own solution. If there is some specific workflow that would be valuable, just let me know and I can reflect on how straightforward it might be to implement.

Regarding (1), does that happen if you have multiple frames going into the calibration and only one is collinear? It makes sense to me that changing the IDs but not the corners would impact it. I'm guessing that under the hood it is looking for collinearity of the corners in a board frame of reference, so if you can grab an ID from one row up, then it passes some internal check based on the board definition.

Regarding (2), I'm curious: were you verifying your intrinsic calibration visually? Early on when I started messing with this stuff, I realized how easy it was to get a low RMSE on an intrinsic calibration that was still horribly overfit to wherever the board samples happened to be drawn. If the stereo calibration tries to build off of that, then it can get painted into a corner.
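One crude numeric proxy for that visual check is how much of the image the board detections actually covered: low coverage is a warning sign that the intrinsics may be fit only to part of the lens. This heuristic is an assumption of my own for illustration, not something either participant proposes:

```python
import numpy as np

def corner_coverage(all_corners, image_size, grid=(8, 8)):
    """Fraction of image-grid cells containing at least one detected corner.

    A value well below ~0.5 suggests the calibration boards clustered in
    one region, so the intrinsics may be overfit there.

    all_corners: iterable of (N, 2) arrays of pixel coordinates, one per frame.
    image_size: (width, height) in pixels.
    grid: (rows, cols) of the coverage grid.
    """
    w, h = image_size
    hit = np.zeros(grid, dtype=bool)
    for corners in all_corners:
        pts = np.asarray(corners, dtype=float)
        cols = np.clip((pts[:, 0] / w * grid[1]).astype(int), 0, grid[1] - 1)
        rows = np.clip((pts[:, 1] / h * grid[0]).astype(int), 0, grid[0] - 1)
        hit[rows, cols] = True
    return hit.mean()
```

It complements, rather than replaces, eyeballing an undistorted image for straight lines.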

Just for context, pyxy3d uses `stereoCalibrate` only to construct an initial estimate of where the cameras are relative to each other, so it doesn't need to be super precise. This initial estimate feeds into the bundle adjustment process, where it is "tightened up" by a SciPy optimization that minimizes the reprojection error across all cameras. If you're curious, this is the core of it:

https://github.com/mprib/pyxy3d/blob/62ba20308e54e8c2b23668854a3db46201947fb1/pyxy3d/calibration/capture_volume/capture_volume.py#L90C11-L90C11
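To make the bundle adjustment idea concrete: the quantity being minimized is a stacked vector of reprojection residuals across all cameras, which an optimizer such as `scipy.optimize.least_squares` drives toward zero. The sketch below shows only the residual computation, with made-up data structures for the example; it is not the pyxy3d code linked above:

```python
import numpy as np

def reprojection_residuals(points_3d, observations, camera_matrices):
    """Stack reprojection residuals across all cameras.

    points_3d: (N, 3) estimated world points.
    observations: dict camera_index -> (N, 2) observed pixel coordinates.
    camera_matrices: dict camera_index -> (3, 4) projection matrix.
    """
    residuals = []
    # Homogeneous coordinates: (N, 4)
    homog = np.hstack([points_3d, np.ones((len(points_3d), 1))])
    for cam, obs in observations.items():
        proj = homog @ camera_matrices[cam].T   # (N, 3) projected points
        pixels = proj[:, :2] / proj[:, 2:3]     # perspective divide
        residuals.append((pixels - obs).ravel())
    return np.concatenate(residuals)
```

In the full problem, the optimizer perturbs both the camera poses and the 3-D points until this residual vector is as small as possible across every camera simultaneously.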

Apologies if I'm explaining things that you already know!