Ambiguity flipping - Githubissues

antithing commented 4 years ago

Hi, and thank you for making this code available. I am using it with a live camera, and am seeing very heavy ambiguity pose flipping, so much so that it is unusable.

I am looking at implementing temporal smoothing to fix this, but before I go down that route, is this normal? Or am I possibly doing something wrong?

ArUco markers with the same camera and calibration data are solid, while AprilTags flip axes almost every frame.

Any tips greatly appreciated! Thanks.

antithing commented 4 years ago

This issue is related to using the tagStandard41h12 dictionary. Switching to the 36h11 markers has improved my results across the board. They are faster, more robust and less prone to the flipping.

Thanks!

suraj2596 commented 4 years ago

Hey @antithing, I faced the same issue with 36h11 today. It turns out the issue occurs when the tags are parallel to the image plane. Can you please try this out and let me know whether you see the same thing?

Also, do you think decimate will affect this flipping issue?

Thank you!

lzyplayer commented 4 years ago

The same issue happens in a Gazebo environment, where I'm testing AprilTag alongside others. Whenever the tags are parallel to the image plane, the estimated pose keeps flipping. Here are the errors compared with the ground truth provided by Gazebo's get_model_state_srv, with a 2 m x 2 m tag. The x-axis shows distance in meters, and the y-axis shows the error in degrees, for each of the three axes, between the estimated pose and the ground truth. The error grows to above 30 degrees when flipping occurs.

mkrogius commented 4 years ago

This sounds like it might be a bug; there shouldn't be any noticeable difference in pose accuracy between the 36h11 and Standard41h12 families. I will take a look at this. Can anyone on this thread provide images where this bug happens?

Also, is the problem that the detected corner locations are moving around or that the pose that is inferred from the corner locations is unreliable?

lzyplayer commented 4 years ago

Thanks for your reply! Actually, I'm not sure whether it is a bug or not. It's more like an ambiguity problem. I placed the model at each of the flipping poses using the set_model_state srv in Gazebo; here is what I got:

It seems really hard to tell which pose is correct from the camera view (left). The Gazebo client view is on the right.

Harsharma2308 commented 4 years ago

Any updates on this? Facing the same issue with the Standard41h12 family.

mkrogius commented 4 years ago

There are some conditions in which ambiguity is expected, basically if the object is far enough away from the camera and the resolution of the camera is low enough. It looks like the example posted on Jun 22 is low enough resolution that it will be fundamentally ambiguous and there is nothing our algorithm can do about it.

@lzyplayer Are the examples you gave rendered at the same resolution your camera is seeing them at? If so, these should not be ambiguous and I will investigate.

@Harsharma2308 Please post examples of your issue for me to investigate.

lzyplayer commented 4 years ago

Hi @mkrogius, thanks for the reply! Both examples provided are from a Gazebo simulation environment, in which a 2 m x 2 m tag is wandering around about 70 meters from the camera. The camera has exactly the same parameters as the Azure Kinect RGB camera, that is:

resolution: 1280x720
FOV: 75°x65°
FPS: 30
tag_size: 2 meters
distance from tag to camera: about 70 meters

The example given on 22 Jun shows the error function's output, and the example given on 27 Jun shows the point at which it flips. The image on the left is exactly the image processed by AprilTag, while the images on the right are screenshots from Gazebo.
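
As a rough check of the numbers listed above, the following sketch estimates how many pixels the tag spans in the image. It assumes an ideal pinhole camera with no lens distortion and a tag that roughly faces the camera; the function name `apparent_tag_width_px` is made up for this illustration and is not part of the apriltag library.

```python
import math


def apparent_tag_width_px(image_width_px, hfov_deg, tag_size_m, distance_m):
    """Approximate width in pixels of a camera-facing tag (ideal pinhole model)."""
    # Focal length in pixels, derived from the horizontal field of view.
    fx = (image_width_px / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    # Similar triangles: image size = focal length * object size / distance.
    return fx * tag_size_m / distance_m


# Values taken from the parameter list above.
width = apparent_tag_width_px(1280, 75.0, 2.0, 70.0)
print(f"tag spans about {width:.1f} px")
```

Under these assumptions the tag is only a couple of dozen pixels wide, which is the small-apparent-size regime where pose ambiguity is expected.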

maxschommer commented 3 years ago

I notice the same issue with the 16h5 family. It is an ambiguity issue, but I believe the ambiguity should be resolvable by the assumption that you should never be able to see the "back" of an AprilTag, and thus always get z-values which are towards the camera.

maxschommer commented 3 years ago

Actually, after looking into the problem more, I think the ambiguity is fundamental. Both solutions you see are valid (when it flips, it's another solution of the Perspective-n-Point problem), and there's no way to resolve it other than more points, or 4 non-planar points.
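
To illustrate why both solutions can look equally valid, here is a small sketch that projects the four tag corners for two mirror-image tilts (+20 and -20 degrees about the tag's vertical axis) and measures how far the corners move between them. It assumes an ideal pinhole camera with a focal length of roughly 834 px (1280 px across a 75 degree horizontal FOV) and approximates the two candidate poses as mirror tilts, which only holds in the weak-perspective limit. All names are hypothetical.

```python
import math


def project_corners(tilt_deg, distance_m, tag_size_m=2.0,
                    fx=834.0, cx=640.0, cy=360.0):
    """Project the 4 corners of a tag rotated about its vertical (y) axis."""
    half = tag_size_m / 2.0
    c = math.cos(math.radians(tilt_deg))
    s = math.sin(math.radians(tilt_deg))
    pixels = []
    for x, y in [(-half, -half), (half, -half), (half, half), (-half, half)]:
        # Rotate the planar point (x, y, 0) about y, then push it along the optical axis.
        X = c * x
        Y = y
        Z = -s * x + distance_m
        pixels.append((fx * X / Z + cx, fx * Y / Z + cy))
    return pixels


def max_corner_shift(tilt_deg, distance_m):
    """Largest pixel distance between matching corners of the +tilt and -tilt poses."""
    a = project_corners(+tilt_deg, distance_m)
    b = project_corners(-tilt_deg, distance_m)
    return max(math.hypot(pa[0] - pb[0], pa[1] - pb[1]) for pa, pb in zip(a, b))


for distance in (5.0, 70.0):
    print(f"{distance:5.1f} m: max corner shift {max_corner_shift(20.0, distance):.2f} px")
```

At 5 m the two poses differ by tens of pixels and are easy to tell apart; at 70 m the difference is a fraction of a pixel, which is below typical corner localization noise.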

s-trinh commented 3 years ago

Some information I have gathered about the planar pose ambiguity:

Iterative Pose Estimation Using Coplanar Feature Points, Denis Oberkampf, Daniel F. DeMenthon, Larry S. Davis, 1995: this (quite old) paper presents a method that, I think, returns the two plausible solutions
Infinitesimal Plane-based Pose Estimation, Toby Collins, Adrien Bartoli, 2014: also computes the two possible solutions and is suitable for tag pose estimation
Sensor Fusion for Fiducial Tags: Highly Robust Pose Estimation from Single Frame RGBD, Pengju Jin, Pyry Matikainen, Siddhartha S. Srinivasa, 2017: an image from the paper shows the ambiguity problem at certain viewpoints and with imprecise corner coordinate extraction
the problem is also documented in the ArUco FAQ, where they recommend using MarkerPoseTracker to exploit temporal consistency
Resolving Marker Pose Ambiguity by Robust Rotation Averaging with Clique Constraints, Shin-Fang Ch'ng, Naoya Sogi, Pulak Purkait, Tat-Jun Chin, Kazuhiro Fukui, 2019: a paper I just found
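
The temporal-consistency idea mentioned above can be sketched as follows: given the two candidate rotations for the current frame, keep the one closest to the rotation accepted in the previous frame. This is a deliberately simple stand-in, not ArUco's MarkerPoseTracker or any method from the papers listed, and the helper names are hypothetical.

```python
import math


def rotation_angle_between(Ra, Rb):
    """Geodesic angle in radians between two 3x3 rotation matrices (nested lists)."""
    # trace(Ra^T Rb) equals the sum of elementwise products of Ra and Rb.
    trace = sum(Ra[i][j] * Rb[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.acos(cos_angle)


def pick_consistent(prev_R, candidates):
    """Return the candidate rotation closest to the previous frame's rotation."""
    if prev_R is None:
        # No history yet: fall back to the first candidate.
        return candidates[0]
    return min(candidates, key=lambda R: rotation_angle_between(prev_R, R))


def rot_y(deg):
    """Rotation about the y axis, used only to build the toy example."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


# Toy example: previous frame tilted +18 degrees, candidates at -20 and +20 degrees.
previous = rot_y(18.0)
chosen = pick_consistent(previous, [rot_y(-20.0), rot_y(20.0)])
print("kept the +20 degree candidate:", chosen[0][2] > 0)
```

One caveat of this design: if the first accepted pose is the wrong branch, the filter will keep following it, so some form of initialization or reset logic is still needed.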

mkrogius commented 3 years ago

Yes, it is a fundamentally ambiguous problem if the tag's apparent size in the image is small enough. As you can see, the apriltag pose estimation code attempts to calculate both solutions and then return whichever solution is better. The pose estimation code is an implementation of Lu et al. (2000) and the ambiguity code uses the method of Schweighofer and Pinz (2006). I chose this combination of methods because the Infinitesimal Plane-based Pose Estimation paper you linked states that this combination has the best performance when you have only 4 matched feature points, which is the case for fiducial detection (FYI, that paper refers to this combination of methods as RPP-SP).
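
Since the code computes both solutions, a caller can also look at how close their errors are before trusting the winner. In the C library the two candidates and their errors appear to be exposed by `estimate_tag_pose_orthogonal_iteration` in `apriltag_pose.h` (worth verifying against the header in your checkout). The sketch below assumes you already have the two error values; the function name and the 0.5 threshold are arbitrary assumptions for illustration, not values used by the library.

```python
def is_ambiguous(err1, err2, ratio_threshold=0.5):
    """Flag a detection when the two candidate errors are too similar to rank reliably."""
    best, worst = min(err1, err2), max(err1, err2)
    if worst <= 0.0:
        # Both errors are zero, so neither candidate can be preferred.
        return True
    # A ratio near 1 means both poses explain the corners about equally well.
    return best / worst > ratio_threshold


print(is_ambiguous(1.0e-4, 1.1e-4))        # similar errors
print(is_ambiguous(1.0e-6, 1.0e-3))        # one clear winner
print(is_ambiguous(1.0e-6, float("inf")))  # no second solution found
```

Frames flagged this way could be skipped or passed to a temporal filter rather than used directly.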

AprilRobotics / apriltag

Ambiguity flipping #71