3D calibration - Githubissues

osnr commented 1 month ago

aka real-world calibration, aka high-accuracy calibration.

As a reminder, the point of this 3D calibration project is

to track programs in 3D, so you can lift them off the table and they'll projection-map properly, but also
to build much more accurate tracking in general – millimeter precision – to enable new applications where you can highlight individual words or facets of objects, and
to track coordinates in real-world meters instead of arbitrary projector or camera pixels, so we can start to integrate multiple projectors, cameras, and other real-world sensors and actuators (phone localization, RFID, CNC machine bed and material, spatial audio…), which will all benefit from having a shared ground coordinate system

This pull request replaces the old calibrate.tcl with a new virtual-programs/calibrate/ subsystem. You will need to recalibrate your system with the 3D calibration process for it to continue working under this branch.

3D calibration uses a simplified version of the technique in Audet (2009) / ProCamCalib to estimate the camera intrinsics (intrinsic matrix and k1 and k2 distortion), projector intrinsics, and extrinsic translation (in meters) and rotation between the camera and projector.
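
As a rough illustration of what those camera intrinsics describe, here is a minimal Python sketch of a pinhole projection with k1/k2 radial distortion. It assumes the common OpenCV-style radial model; the Intrinsics class, the project function, and every numeric value are hypothetical and are not taken from Folk's code.

```python
from dataclasses import dataclass


@dataclass
class Intrinsics:
    fx: float        # focal length in pixels along x
    fy: float        # focal length in pixels along y
    cx: float        # principal point x, in pixels
    cy: float        # principal point y, in pixels
    k1: float = 0.0  # first radial distortion coefficient
    k2: float = 0.0  # second radial distortion coefficient


def project(point_m, intr):
    """Project a 3D point (meters, camera frame, z > 0) to pixel coordinates."""
    x, y, z = point_m
    xn, yn = x / z, y / z                    # normalized image coordinates
    r2 = xn * xn + yn * yn                   # squared radius from the optical axis
    d = 1.0 + intr.k1 * r2 + intr.k2 * r2 * r2   # radial distortion factor
    return (intr.fx * xn * d + intr.cx, intr.fy * yn * d + intr.cy)


# Made-up intrinsics, for illustration only.
cam = Intrinsics(fx=900.0, fy=900.0, cx=640.0, cy=360.0, k1=-0.1, k2=0.01)
print(project((0.0, 0.0, 1.0), cam))    # a point on the optical axis lands on (cx, cy)
print(project((0.1, 0.05, 1.0), cam))   # an off-axis point is pulled slightly inward
```

The projector can be modeled the same way (as an "inverse camera"), and the extrinsics are the rotation and translation that carry a point from one device's frame into the other's.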

You print out a checkerboard of AprilTags and put it under your camera & projector. We assume the checkerboard is locally linear and fill the gaps in the checkerboard with projected AprilTags, then treat that grid of printed+projected AprilTags as a pose. You wave the board around and we accumulate a collection of those poses to do the calibration (mostly according to Zhang (1998), where you take a collection of model->pose homographies and turn that into a full 3D calibration for an individual camera or projector).
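
The reason a collection of model->pose homographies is enough comes from a standard fact that Zhang's method builds on: for a planar model at Z = 0, the homography to the image is H = K [r1 r2 t], where K is the intrinsic matrix, r1 and r2 are the first two columns of the rotation, and t is the translation. The Python sketch below only checks that relationship numerically; it is not the calibration code in this PR, and all names and values are hypothetical.

```python
import math


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def plane_homography(k, r, t):
    """H = K [r1 r2 t]: maps model-plane points (X, Y, 1) to pixels."""
    r1_r2_t = [[r[i][0], r[i][1], t[i]] for i in range(3)]
    return matmul3(k, r1_r2_t)


def apply_homography(h, x, y):
    """Apply a 3x3 homography to the 2D point (x, y)."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)


def project_directly(k, r, t, x, y):
    """Reference path: move the 3D point (x, y, 0) into the camera frame, then apply K."""
    p = [r[i][0] * x + r[i][1] * y + t[i] for i in range(3)]
    q = [sum(k[i][j] * p[j] for j in range(3)) for i in range(3)]
    return (q[0] / q[2], q[1] / q[2])


# Made-up intrinsics and a board tilted 30 degrees about the x axis, 1 m away.
a = math.radians(30.0)
K = [[900.0, 0.0, 640.0], [0.0, 900.0, 360.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0],
     [0.0, math.cos(a), -math.sin(a)],
     [0.0, math.sin(a), math.cos(a)]]
T = [0.0, 0.0, 1.0]

H = plane_homography(K, R, T)
print(apply_homography(H, 0.05, 0.02))        # board point via the homography
print(project_directly(K, R, T, 0.05, 0.02))  # same point via full 3D projection
```

Each board pose contributes one such H; Zhang's method uses the orthonormality of r1 and r2 to turn several of them into constraints on K.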

Quads

With those intrinsic and extrinsic parameters calibrated, each detected AprilTag will now produce a quad (Claim 37 has quad $q)

A quad is like the 3D equivalent of a region, but more constrained: it is always a quad, has clearly specified top/left/right/bottom, and is tagged with an explicit coordinate space, to reflect the way we've actually been using regions in practice. The quad design is still tentative.

A quad consists of four assumed-coplanar points. A quad may be in tag space, camera space, or projector space (and can be converted between these). There are some operations like quad scale and quad offset -- you can offset a quad in tag space to go from tag quad -> entire-page quad, for example (since you're going along the tag plane).
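
To make the idea concrete, here is a small Python sketch of a quad as four coplanar points tagged with a coordinate space, with scale, offset, and change-of-space operations. This is an illustration only: the Quad class, the corner ordering, the axis conventions, and the exact semantics of scaling and offsetting are all assumptions, not the actual Folk quad library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quad:
    space: str      # name of the coordinate space, e.g. "tag 37" or a camera name
    corners: tuple  # (top_left, top_right, bottom_right, bottom_left), in meters


def tag_quad(tag_id, side_m):
    """Quad for a square tag centered on its own origin, lying in the z = 0 plane."""
    h = side_m / 2.0
    return Quad(f"tag {tag_id}",
                ((-h, -h, 0.0), (h, -h, 0.0), (h, h, 0.0), (-h, h, 0.0)))


def scale_quad(q, sx, sy):
    """Scale about the centroid along x/y (meaningful in tag space, where z is constant)."""
    cx = sum(p[0] for p in q.corners) / 4.0
    cy = sum(p[1] for p in q.corners) / 4.0
    return Quad(q.space, tuple((cx + (x - cx) * sx, cy + (y - cy) * sy, z)
                               for x, y, z in q.corners))


def offset_quad(q, dx, dy):
    """Slide the quad along the x/y plane of its own space."""
    return Quad(q.space, tuple((x + dx, y + dy, z) for x, y, z in q.corners))


def change_space(q, r, t, new_space):
    """Apply the rigid transform p' = R p + t to move the quad into another space."""
    def transform(p):
        return tuple(sum(r[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
    return Quad(new_space, tuple(transform(p) for p in q.corners))


tag = tag_quad(37, 0.030)                                 # 30 mm tag
page = offset_quad(scale_quad(tag, 4.0, 4.0), 0.03, 0.0)  # grow and slide within the tag plane
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
in_camera = change_space(page, identity, [0.0, 0.0, 1.0], "camera")  # placeholder pose
print(page)
print(in_camera)
```

Doing the scale and offset in tag space keeps the result on the tag's plane, which is why tag quad -> entire-page quad is natural there and awkward in camera or projector space.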

This displays the distance between programs 13 and 19 on my table (notice how I have to transform into shared camera-space before getting the distance; annoying but it works for now):

When 19 has quad /a/ & 13 has quad /b/ {
  set a [quad change $a "/dev/video4"]
  set b [quad change $b "/dev/video4"]
  Wish to draw text with x 800 y 400 text [format {%.2f} [* 100 [norm [sub [quad right $a] [quad left $b]]]]]cm radians 3.14
}
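
For readers who don't know Folk, the arithmetic in that snippet can be sketched in plain Python. This assumes that quad right and quad left return the midpoints of the right and left edges, and that both quads are already expressed in the same camera space in meters; those are guesses about the semantics, and the coordinates below are made up.

```python
import math


def midpoint(p, q):
    """Midpoint of two 3D points."""
    return tuple((a + b) / 2.0 for a, b in zip(p, q))


def right_of(corners):
    """Midpoint of the right edge; corners are ordered TL, TR, BR, BL."""
    return midpoint(corners[1], corners[2])


def left_of(corners):
    """Midpoint of the left edge; corners are ordered TL, TR, BR, BL."""
    return midpoint(corners[0], corners[3])


def gap_cm(a, b):
    """Distance from a's right edge to b's left edge, converted from meters to cm."""
    return 100.0 * math.dist(right_of(a), left_of(b))


# Two hypothetical 10 cm pages lying side by side, 1 m from the camera.
a = [(0.00, 0.00, 1.0), (0.10, 0.00, 1.0), (0.10, 0.10, 1.0), (0.00, 0.10, 1.0)]
b = [(0.25, 0.00, 1.0), (0.35, 0.00, 1.0), (0.35, 0.10, 1.0), (0.25, 0.10, 1.0)]
print(f"{gap_cm(a, b):.2f}cm")  # prints 15.00cm
```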

Right now, quads are automatically transformed to old-style 2D regions (in projector space) as well, so we should have backward compatibility with most or all programs. But we should break that soon (as we build more quad-native, 3D-capable libraries for stuff like pointing, labels, outlines, to replace the old region libraries).

Other changes

setup.folk.default has been added.

You can copy this to ~/folk-live/setup.folk and manually set the camera and projector to be used by the system here (so Folk can now run on, e.g., laptops, without being locked into the built-in laptop screen and webcam, or can use HDMI1 instead of HDMI0 on the Pi gadget). You can also set desired camera resolution here (so you can opt into 720p, 1080p, 4K on a system-local basis).

(All Folk systems will either have their own setup.folk or will fall back to this default one, which should have the same behavior that was hard-coded into Folk before -- display 0 and camera 0.)
- fbset dependency has been removed; we get the display default resolution directly from Vulkan
- Camera statements have all been changed to mention which camera, hopefully laying groundwork for future multi-camera: /someone/ claims the camera frame is /grayFrame/ at /timestamp/ -> /someone/ claims camera /camera/ has frame /grayFrame/ at timestamp /timestamp/
- AprilTag statements have changed from the tag $id has center and tag $id has corners statements to a single tag $id has detection /det/ on /camera/ at /timestamp/ statement
- You can draw AprilTags with Wish to draw an AprilTag with id ... corners ... (virtual-programs/display/display-apriltags.folk). Probably useful in a lot of random areas where you want a feedback loop (latency testing, calibration refinement, viruses)
- The Web endpoint /frame-image/ is now included in virtual-programs and renamed to /camera-frame
- Hold! has been added. It's a simplified version of Commit that is less magic: it only updates a single statement and doesn't take anything from lexical scope. It's used to speed up some internal evaluator/sharing operations (Commit itself is now implemented in terms of Hold!), which should provide a moderate global performance boost. It's unclear whether users should be using it as well, but they could (#52)
- C code can return 2D arrays (and use them in structs; this is useful for transmitting homographies, points, and so on between C and Tcl) and can use arguments called r
- Timing info simplified (may be sort of broken)
- /any/ is now a wildcard (like /something/ and /anything/)

Could do

could do:

slider/default projection in middle instead of corner of table
video/images on /calibrate of how to calibrate (what does it mean to fit projected into printed, how do you move around, etc)
refine each pose homography to try to reduce error further (OpenCV and Audet both do this)
~better instructions? tell people that they can recalibrate if bad, tell people that they should look at how program outlines actually look to judge calibration goodness~
improve the quad/region interface, figure out the right operations, make it ergonomic to do stuff with
~explicit guidance (display warning to user?) for uncalibrated/migrating-from-old-calibration systems~
~reintroduce mask-tags~
~allow user to specify printed program tag mm~
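
For the homography-refinement item above, the quantity such a refinement would normally try to reduce is the reprojection error. Below is a minimal Python sketch of measuring it for a single pose; the function names are hypothetical, and the refinement step itself (for example, a nonlinear least-squares solve) is not shown.

```python
import math


def apply_homography(h, x, y):
    """Apply a 3x3 homography (nested lists) to the 2D point (x, y)."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)


def rms_reprojection_error(h, model_pts, observed_px):
    """RMS pixel distance between observed points and model points mapped through h."""
    total = 0.0
    for (x, y), (u_obs, v_obs) in zip(model_pts, observed_px):
        u, v = apply_homography(h, x, y)
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return math.sqrt(total / len(model_pts))


# Toy check: an identity homography against detections that are all off by (3, 4) pixels.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
model = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
observed = [(x + 3.0, y + 4.0) for x, y in model]
print(rms_reprojection_error(identity, model, observed))  # 5.0
```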

Feedback and contributions welcome here.

What I need

I've tested this on folk0, folk-convivial, & folk-live in my apartment.

Please test this on your system if you have time. Let me know what quality of calibration you get (the most legitimate end-to-end test is the visible accuracy of program outlines, I think). To calibrate, go to http://folk-WHATEVER.local:4273/calibrate and follow the instructions.

It's not perfect -- you may also want to re-try calibrating a few times until you get a satisfactory calibration -- but it should be at least comparable in accuracy to the old 2D calibration and in some cases much better (so between 1mm and 1cm in most cases; also moderately faster). I'd like to merge this if we can consistently meet that standard & people are consistently able to calibrate their systems.

l3gacyb3ta commented 1 month ago

aka probably the biggest cool new thing in folk in a while!

l3gacyb3ta commented 1 month ago

I can confirm that calibrating is pretty easy. Hardest part was finding a thing to tape my page to. I'll look through all the code and stuff in a bit :)

osnr commented 1 month ago

There's the very weird issue where different tags actually have different levels of accuracy (like 1047 will work a lot better than 642). Need to hunt down.

osnr commented 1 month ago

> There's the very weird issue where different tags actually have different levels of accuracy (like 1047 will work a lot better than 642). Need to hunt down.

OK, I figured out that tags that are actually 32mm (instead of 30mm) get way thrown off in pose detection. So you really need to be millimeter-accurate when you specify your printed tag size in setup.folk (just added this today, default inner tag size is 30mm).
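
A back-of-the-envelope Python sketch of why 2mm matters: for a tag facing the camera, its apparent width in pixels is roughly focal length x side / distance, so a pose estimate that assumes the wrong side length scales the estimated distance by assumed/true. The focal length and distance below are made-up values, and the fronto-parallel pinhole model is a simplification of real pose estimation.

```python
def apparent_width_px(focal_px, side_m, distance_m):
    """Width in pixels of a tag of the given side length facing the camera."""
    return focal_px * side_m / distance_m


def distance_from_width_m(focal_px, assumed_side_m, width_px):
    """Distance implied by an observed pixel width, given an assumed side length."""
    return focal_px * assumed_side_m / width_px


focal_px = 900.0        # hypothetical focal length in pixels
true_distance_m = 1.0   # hypothetical camera-to-tag distance

# The tag is really 32 mm, but the system assumes the 30 mm default.
width = apparent_width_px(focal_px, 0.032, true_distance_m)
estimate = distance_from_width_m(focal_px, 0.030, width)
error_cm = 100.0 * (true_distance_m - estimate)
print(f"estimated {estimate:.4f} m, off by {error_cm:.2f} cm")
```

Under these assumptions a 2mm error in tag size becomes roughly a 6cm error in depth at 1m, far outside the millimeter-to-centimeter accuracy the calibration is aiming for.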

And we probably need a way of specifying per-program geometry overrides (if you print some program at larger or smaller size than your default printout).

osnr commented 1 month ago

Just a reminder that it would be great if people could test/review this :-) I'm hoping to merge it in the next couple days.

I think with these most recent commits, all the major issues are resolved.

(it's a breaking change, so you'll probably end up needing to go through it anyway at some point if not now)

cwervo commented 1 month ago

I just tried this at home on folk-cwe. The flow for previewing the camera, printing, and then calibrating is great! One note:

ps2pdf wasn't installed, which blocked /calibrate from loading. I didn't see it in this PR as an explicit dependency (it comes from ghostscript). We should add it to this PR before merging, yeah?

Notes:

The preview process is great! Can be iterated on but I think this is a really great set of tools for helping make this process as easy as possible.
I'm an even bigger advocate for using clipboards now; it made making the board a snap (+ some washi tape):

osnr commented 1 month ago

@cwervo and others -- how are we feeling about this?

I would like to do some more passes to attempt to improve accuracy and UX on folk0, folk-convivial, and folk-recurse this week. Some things we could do:

force the projected tags to the corners of the display for each pose to try to enforce coverage (maybe you can manually offset size/location if the corners are off-table)
use some more specific formulation of visual servoing (whole square, instead of point-by-point?) to improve tag estimation stability
change default projected tag size to make fit quicker
fix animation example (use quads, not regions)
add demonstration video/gif to instructions page to show people to lift it up in 3D and that they can 'skip' whole integer tag-sizes to move the tags
something with autoexposure?

and maybe wait for some more test results from other people to come in also.

What else is holding this up?

osnr commented 1 month ago

also: ~fix mask-tags~, ~fix camera slices which seem glitchy~, fix animation

cwervo commented 1 month ago

Feeling generally good about this! I do think fixing animation (e.g. making it so that regions automatically have quads associated with them) would be good so that older programs don't break / people can "think in 2D" if they're only doing transforms on the planes around programs.

> force the projected tags to the corners of the display for each pose to try to enforce coverage (maybe you can manually offset size/location if the corners are off-table)

I like this idea. I think it's worth trying out on folk0 to see how it feels, but I agree with your hunch that this would get people to cover more of the volume.

> use some more specific formulation of visual servoing (whole square, instead of point-by-point?) to improve tag estimation stability

This would improve stability during the calibration phase, you mean?

> change default projected tag size to make fit quicker

My comment in Discord on this was that something like 1/4 as big as they are right now feels good. I still agree this would be good.

> add demonstration video/gif to instructions page to show people to lift it up in 3D and that they can 'skip' whole integer tag-sizes to move the tags

I've been feeling sick today so haven't been at Hex House, but I will be in tomorrow and can use my filming stuff to record and edit a proper video illustrating these steps! I'll put this up on the wiki. It will give me a good opportunity to revisit/archive our auto-calibration and manual calibration pages.

In general more test data would be good but based on my experience:

calibrating here at home (folk-cwe)
seeing Jessie 3D calibrate the folk-recurse system

I think we know the process works at least as well as the current calibration process on systems of most sizes.

l3gacyb3ta commented 1 month ago

It's more of an API thing, but a nicer way to get physical measurements would be nice :)

FolkComputer / folk

3D calibration #162
