dkogan / mrcal

Next-generation camera-modeling toolkit
http://mrcal.secretsauce.net
Apache License 2.0

Camera-projector calibration #25

Open zumpchke opened 2 months ago

zumpchke commented 2 months ago

I am trying to accomplish a fairly niche task, but one that overlaps with computer vision/camera calibration. I have an IR camera (Kinect) (cam1) mounted on top of a projector (cam2). The idea is to be able to re-project the scene through the projector and have it accurately align for the viewer. The applications for this are augmented reality/tracking/etc.

I have made good progress, but I am likely missing a piece of the puzzle, which is why I am asking for help.

My steps are:

  1. Calibrate the IR camera independently using a Charuco board. Results in ~0.3 RMS error.
  2. Calibrate the projector independently by aligning projected points to the Charuco board. Results in ~0.5 RMS error.
  3. Capture a separate dataset that records the same board with both the IR camera and the projector. Solve for the extrinsics with --skip-intrinsics-solve, using the previous cameramodels as input via --seed. Results in ~0.5 RMS error.

I then transform an input (distorted) IR image of the current scene to the PINHOLE variant of the projector/cam2 cameramodel, NOT the actual one (because the projector will then "distort" it again).

When I transform cam1 to cam2-pinhole and then reproject the 1920x1080 image to the projector (its native resolution), it doesn't line up. What gives me hope, though, is that the "shapes" look correct: I can manually scale x,y and offset x,y and get them to align pretty well, so the distortion does seem to be removed. This is apparent in the image below, where the projected cardboard at least seems to be the same shape as the real cardboard.

[image: IMG_7708]
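
My transform step looks roughly like this (a sketch, not my exact code; the filenames are placeholders, and I'm assuming mrcal's pinhole_model_for_reprojection() / image_transformation_map() here):

import cv2
import mrcal

# Placeholder filenames: the models produced by the calibration steps above
model_ir   = mrcal.cameramodel("ircam.cameramodel")
model_proj = mrcal.cameramodel("projector.cameramodel")

# Pinhole variant of the projector model; the projector's own optics then
# re-apply the distortion at display time
model_proj_pinhole = mrcal.pinhole_model_for_reprojection(model_proj)

# Pixel map from the IR camera to the pinhole projector. Without a distance
# (or a plane) the points are assumed infinitely far away, so the relative
# translation between the two cameras is ignored
mapxy = mrcal.image_transformation_map(model_ir, model_proj_pinhole,
                                       intrinsics_only = False)

image_ir          = cv2.imread("scene-ir.png", cv2.IMREAD_GRAYSCALE)
image_for_display = mrcal.transform_image(image_ir, mapxy)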

dkogan commented 2 months ago

I'm not following the full, complex sequence, but at a high level it seems like it should mostly work. If you are able, I would simplify: do you really need the 3 separate steps? Why can't you do it in one step?

If the opencv chessboard detector sucks (it does!), then don't use it. mrgingham works great. Or boofcv.

Past that, you just need to debug. After each step, pick an arbitrary point, and manually transform and project it through your calibration to see if it makes sense. If there's a bug somewhere (there probably is), this will find it.

zumpchke commented 2 months ago

Thanks. I will simplify the sequence to try to find any bugs.

I think the corner detections are okay given the poor resolution of the IR camera:

[image: kinect]

> Past that, you just need to debug. After each step, pick an arbitrary point, and manually transform and project it through your calibration to see if it makes sense. If there's a bug somewhere (there probably is), this will find it.

Just to confirm, can I use mrcal-reproject-points to do this? My plan would be to take a 2D IR-camera pixel (i.e., a corner of the cardboard) and reproject it as a circle positioned at the corresponding projector pixel coordinates, using the pinhole projector model. Ideally it should align with the same spot. Does this take into account all the intrinsics and extrinsics?

zumpchke commented 2 months ago

I tried mrcal-reproject-points, but got this warning:

## WARNING: /usr/bin/mrcal-reproject-points ignores relative translations, which were non-zero here. t_to_from = [-0.02537009  0.16908851 -0.05502302]

How can I compensate for this? Would multiplying the X,Y by the width/height of cam2 suffice?

dkogan commented 2 months ago

If you have a low-res camera, bring the object it's looking at closer to make things appear bigger. The way you're doing it won't make anything fail catastrophically, but your accuracy will suffer.

When debugging, you want to trust nothing (since you don't know what's broken), and to use the simplest, most basic tools you can. In this case, don't use mrcal-reproject-points; do the work manually. You can probably trust the mrcal project/unproject/transform functions, so stick to those, and do the reprojection by hand. For instance, if you have two representations of the same camera (same intrinsics) and a rotation between them, but no translation, you can reproject:

q1 = mrcal.project( mrcal.rotate_point_R(R10, mrcal.unproject(q0, *model0.intrinsics())), *model1.intrinsics())

And then you can compare the q1 you get this way with a q1 you know (from looking at your image, or something). At each step, find some correspondence you can validate like this. If you have a full transform, then you also need a distance, and the rotate_point_R becomes transform_point_Rt.
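
Putting that together for the case with a translation, a sketch (the filenames, the pixel, and the distance are all placeholders):

import numpy as np
import mrcal

# Placeholder filenames
model0 = mrcal.cameramodel("cam0.cameramodel")
model1 = mrcal.cameramodel("cam1.cameramodel")

# cam0 -> cam1 transform, composed from each model's extrinsics
Rt10 = mrcal.compose_Rt(model1.extrinsics_Rt_fromref(),
                        model0.extrinsics_Rt_toref())

q0       = np.array((320., 240.))  # arbitrary pixel in cam0
distance = 2.0                     # assumed range to the point, in meters

# Scale the unit observation vector to a 3D point in cam0 coords
p0 = distance * mrcal.unproject(q0, *model0.intrinsics(), normalize = True)

# Transform into cam1 coords, and project to cam1 pixels
q1 = mrcal.project(mrcal.transform_point_Rt(Rt10, p0), *model1.intrinsics())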

zumpchke commented 2 months ago

With the manual point re-projection, I can get pretty close. I selected this point:

[image]

And then reprojected it:

[image: IMG_7710]

I'm aiming for more accuracy, though; maybe a better calibration will help.

Regarding image transform maps, I noticed that if you don't specify a distance, they don't take the translation into account, which could be my problem. How do I set the distance correctly so that this works?

dkogan commented 2 months ago

I don't know what I'm supposed to do with those images. But don't show them to me; this is for your own sanity checking, since you have a logic bug somewhere.

As for the distance, think about what you are trying to do. Draw a diagram of your two cameras, and draw a ray in space that represents one of your pixels, and the point in space that corresponds to it and the reprojection to the other camera. Does the reprojection depend on distance? What if there's no translation?

If you are trying to reproject a known-geometry object from one camera to another, you can use that object geometry to compute the distances. In the special case of a plane in space, whose pose you know, you can use mrcal-reproject-image --plane-n ... --plane-d ....
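
The same plane-based reprojection is available from the Python API; a sketch (the filenames and the plane pose are placeholders; check the docs for the coordinate system the plane is defined in):

import cv2
import numpy as np
import mrcal

# Placeholder filenames
model_from = mrcal.cameramodel("ircam.cameramodel")
model_to   = mrcal.cameramodel("projector-pinhole.cameramodel")

# The observed plane: all points p with inner(plane_n, p) = plane_d.
# Placeholder values here
plane_n = np.array((0., 0., 1.))
plane_d = 2.0

mapxy = mrcal.image_transformation_map(model_from, model_to,
                                       plane_n = plane_n,
                                       plane_d = plane_d)

image_ir  = cv2.imread("scene-ir.png", cv2.IMREAD_GRAYSCALE)
image_out = mrcal.transform_image(image_ir, mapxy)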

zumpchke commented 2 months ago

Ah, I think I see where I am getting confused wrt distance.

If I want to accurately reproject a given camera pixel coordinate and hit it with the projector, the corresponding projector pixel coordinate will depend on the distance/depth of the point, and I'm not supplying that. I only have estimates of it from the checkerboards at calibration time.

However, the IR cam is a depth sensor, and I could sample THAT depth coordinate; that's what most solutions do. I was hoping not to, because it is noisy, but it seems like it's the only solution here.
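
I.e. something like this sketch (assuming a z-depth map, in meters, registered to the IR image; all the values are placeholders):

import numpy as np
import mrcal

# Placeholder filenames
model_ir   = mrcal.cameramodel("ircam.cameramodel")
model_proj = mrcal.cameramodel("projector.cameramodel")

Rt_proj_ir = mrcal.compose_Rt(model_proj.extrinsics_Rt_fromref(),
                              model_ir.extrinsics_Rt_toref())

# Placeholder z-depth map; in practice this comes from the kinect
depth = np.full((480, 640), 2.0)

q_ir = np.array((320., 240.))               # pixel of interest
z    = depth[int(q_ir[1]), int(q_ir[0])]    # sampled depth at that pixel

# Scale the observation vector so its z component equals the sampled depth
v    = mrcal.unproject(q_ir, *model_ir.intrinsics())
p_ir = v * (z / v[2])

q_proj = mrcal.project(mrcal.transform_point_Rt(Rt_proj_ir, p_ir),
                       *model_proj.intrinsics())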

A simple test of this theory would be to get the world coordinate of a chessboard corner using estimatePoseCharucoBoard(), and then use the intrinsics and extrinsics to get the projector pixel coordinate from there.
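
In code, that test would look something like this sketch (rvec/tvec stand in for the estimatePoseCharucoBoard() output; the corner position and filenames are placeholders):

import numpy as np
import cv2
import mrcal

# Placeholder filenames
model_ir   = mrcal.cameramodel("ircam.cameramodel")
model_proj = mrcal.cameramodel("projector.cameramodel")

# Board pose in IR-camera coords; in practice rvec,tvec come from
# cv2.aruco.estimatePoseCharucoBoard(). Placeholder values here
rvec = np.zeros(3)
tvec = np.array((0., 0., 2.))

R_cam_board, _ = cv2.Rodrigues(rvec)
p_board = np.array((0.03, 0.06, 0.))   # a known corner, in board coords

# The corner in IR-camera coords: this supplies the depth that the
# pixel-to-pixel reprojection was missing
p_ir = R_cam_board @ p_board + tvec

# IR camera -> projector, then project to projector pixels
Rt_proj_ir = mrcal.compose_Rt(model_proj.extrinsics_Rt_fromref(),
                              model_ir.extrinsics_Rt_toref())
q_proj = mrcal.project(mrcal.transform_point_Rt(Rt_proj_ir, p_ir),
                       *model_proj.intrinsics())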

zumpchke commented 2 months ago

I think my original plan would've worked if I were just working on a flat plane.

dkogan commented 2 months ago

Hi. Sorry, I don't have the cycles right now to really think about this. But it sounds like you're on the right track, and hopefully will figure it out. Good luck!