This issue was related to using the tagStandard41h12 dictionary. Switching to the 36h11 markers has improved my results across the board: they are faster, more robust, and less prone to flipping.
Thanks!
Hey @antithing, I faced the same issue with 36h11 today. It turns out that if the tags are parallel to the image plane, this issue occurs. Can you please try this out and let me know if you are having the same issue?
Also, do you think decimate will affect this flipping issue?
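For reference, decimation is the quad_decimate setting on the detector. Since it lowers the effective resolution used for quad detection, and lower resolution tends to make the ambiguity worse rather than better, I would not expect raising it to help with the flipping. A minimal sketch of where the setting lives, assuming the standard apriltag C API:

```cpp
#include "apriltag.h"
#include "tag36h11.h"

int main() {
    apriltag_family_t *tf = tag36h11_create();
    apriltag_detector_t *td = apriltag_detector_create();
    apriltag_detector_add_family(td, tf);

    // quad_decimate downsamples the image before quad detection.
    // Values > 1 speed up detection but may reduce corner precision.
    td->quad_decimate = 2.0f;

    // ... call apriltag_detector_detect(td, &im) on each frame ...

    apriltag_detector_destroy(td);
    tag36h11_destroy(tf);
    return 0;
}
```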
Thank you!
The same issue happens in a Gazebo environment. I'm testing AprilTag alongside other systems, and whenever the tags are parallel to the image plane, the estimated pose keeps flipping. Here are the errors compared with the ground truth provided by Gazebo's get_model_state srv, using a 2 m x 2 m tag: the x-axis indicates distance in meters, and the y-axis indicates the error, in degrees, of the three axes of the estimated pose against ground truth. The error grows above 30 degrees when flipping occurs.
This sounds like it might be a bug; there shouldn't be any noticeable difference in pose accuracy between the 36h11 and Standard41h12 families. I will take a look at this. Can anyone on this thread provide images where this bug happens?
Also, is the problem that the detected corner locations are moving around or that the pose that is inferred from the corner locations is unreliable?
Thanks for your reply! Actually, I'm not sure whether it is a bug or not; it seems more like an ambiguity problem. I placed the model at those flipped poses with the set_model_state srv in Gazebo; here is what I got.
It seems really hard to tell which pose is correct from the camera view (left) versus the Gazebo client (right).
Any updates on this? Facing the same issue with the Standard41h12 family.
There are some conditions in which ambiguity is expected: basically, if the object is far enough away from the camera and the resolution of the camera is low enough. It looks like the example posted on Jun 22 is at a low enough resolution that it is fundamentally ambiguous, and there is nothing our algorithm can do about it.
@lzyplayer Are the examples you gave rendered at the same resolution your camera is seeing them at? If so, these should not be ambiguous and I will investigate.
@Harsharma2308 Please post examples of your issue for me to investigate.
Hi @mkrogius, thanks for the reply! Both examples were produced in the Gazebo simulation environment, in which a 2 m x 2 m tag wanders around roughly 70 meters from the camera. The camera has exactly the same parameters as the Azure Kinect RGB camera.
The example given on Jun 22 shows the error function's output, and the example given on Jun 27 shows at which point it flips. The image on the left is exactly the image processed by AprilTag, while the images on the right are screenshots from Gazebo.
I notice the same issue with the 16h5 family. It is an ambiguity issue, but I believe the ambiguity should be resolvable with the assumption that you can never see the back of an AprilTag, and thus the tag's z-axis should always point toward the camera.
Actually, after looking into the problem more, I think the ambiguity is fundamental. Both solutions you see (when it flips, it's the other solution of the perspective-n-point problem) are valid, and there's no way to resolve it other than with more points, or 4 non-planar points.
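To make the two solutions concrete: for a square planar target you can recover both candidate poses and their reprojection errors with OpenCV's solvePnPGeneric and the SOLVEPNP_IPPE_SQUARE flag. A sketch, where the tag size, corner pixels, and intrinsics below are placeholder values:

```cpp
#include <opencv2/calib3d.hpp>
#include <cstdio>
#include <vector>

int main() {
    const double s = 0.08;  // tag edge length in meters (placeholder)

    // Corner order required by SOLVEPNP_IPPE_SQUARE:
    // top-left, top-right, bottom-right, bottom-left in the tag plane.
    std::vector<cv::Point3d> obj = {
        {-s / 2,  s / 2, 0}, { s / 2,  s / 2, 0},
        { s / 2, -s / 2, 0}, {-s / 2, -s / 2, 0}};

    // Detected corner pixels for one frame (placeholder values).
    std::vector<cv::Point2d> img = {
        {310.2, 240.1}, {390.7, 238.9}, {392.3, 320.4}, {308.8, 318.6}};

    cv::Mat K = (cv::Mat_<double>(3, 3) << 600, 0, 320, 0, 600, 240, 0, 0, 1);
    cv::Mat dist;  // assume an undistorted image

    std::vector<cv::Mat> rvecs, tvecs;
    std::vector<double> reprojErr;
    cv::solvePnPGeneric(obj, img, K, dist, rvecs, tvecs, false,
                        cv::SOLVEPNP_IPPE_SQUARE, cv::noArray(), cv::noArray(),
                        reprojErr);

    for (size_t i = 0; i < rvecs.size(); ++i)
        std::printf("solution %zu: reprojection error %f\n", i, reprojErr[i]);
    return 0;
}
```

When the tag is near fronto-parallel and small in the image, the two reported errors come out nearly equal; that near-tie is exactly the flipping discussed in this thread.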
Some information I have gathered about the planar pose ambiguity: the ArUco library's MarkerPoseTracker uses temporal consistency to resolve it.

Yes, it is a fundamentally ambiguous problem if the tag's apparent size in the image is small enough. As you can see, the AprilTag pose estimation code attempts to calculate both solutions and then returns whichever solution is better. The pose estimation code is an implementation of Lu et al. (2000), and the ambiguity code uses the method of Schweighofer and Pinz (2006). I chose this combination of methods because the Infinitesimal Plane-based Pose Estimation paper you linked states that this combination has the best performance when you have only 4 matched feature points, which is the case for fiducial detection (FYI, that paper refers to this combination of methods as RPP-SP).
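For anyone who wants to inspect both candidates rather than just the winner, the library exposes them directly. This sketch assumes the apriltag_pose.h API described above; detection and intrinsics setup are omitted:

```cpp
#include "apriltag.h"
#include "apriltag_pose.h"

// det: an apriltag_detection_t* from apriltag_detector_detect();
// fx, fy, cx, cy: camera intrinsics in pixels; tagsize: edge length in meters.
void report_both_poses(apriltag_detection_t *det, double tagsize,
                       double fx, double fy, double cx, double cy) {
    apriltag_detection_info_t info = {det, tagsize, fx, fy, cx, cy};

    apriltag_pose_t pose1, pose2;
    double err1, err2;
    // Runs the Lu et al. orthogonal iteration from both local minima,
    // per the Schweighofer-Pinz ambiguity handling described above.
    estimate_tag_pose_orthogonal_iteration(&info, &err1, &pose1,
                                           &err2, &pose2, 50);

    // estimate_tag_pose() simply returns whichever of these has the
    // lower error. When err1 and err2 are close, the two poses are
    // effectively indistinguishable from this single view.
    // (If no second local minimum exists, err2 comes back very large;
    // the caller owns the matd_t matrices in each returned pose.)
}
```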
Hi, and thank you for making this code available. I am using it with a live camera and am seeing very heavy ambiguity pose flipping, so much so that it is unusable.
I am looking at implementing temporal smoothing to fix this, but before I go down that route: is this normal, or am I possibly doing something wrong?
ArUco markers with the same camera and calibration data are solid, while AprilTags flip axes almost every frame.
Any tips greatly appreciated! Thanks.
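A minimal sketch of the temporal-smoothing idea, building on the two-solution output of estimate_tag_pose_orthogonal_iteration shown above (this is not part of the library, and the 0.5 error ratio is an arbitrary placeholder): keep the previous frame's rotation and, when the two reprojection errors are comparable, pick the candidate whose rotation is closer to it.

```cpp
#include "apriltag_pose.h"
#include <cmath>

// Geodesic distance between two rotations: the angle of R1^T * R2.
static double rotation_angle(const matd_t *R1, const matd_t *R2) {
    matd_t *Rt = matd_transpose(R1);
    matd_t *Rrel = matd_multiply(Rt, R2);
    double tr = MATD_EL(Rrel, 0, 0) + MATD_EL(Rrel, 1, 1) + MATD_EL(Rrel, 2, 2);
    matd_destroy(Rt);
    matd_destroy(Rrel);
    double c = (tr - 1.0) / 2.0;
    if (c > 1.0) c = 1.0;
    if (c < -1.0) c = -1.0;
    return std::acos(c);
}

// Choose between the two ambiguous candidates, using the previous
// frame's rotation as a tie-breaker when the errors are similar.
apriltag_pose_t pick_pose(apriltag_pose_t p1, double err1,
                          apriltag_pose_t p2, double err2,
                          const matd_t *prev_R) {
    if (p2.R == NULL || err1 < 0.5 * err2)  // solution 1 clearly better
        return p1;
    if (err2 < 0.5 * err1)                  // solution 2 clearly better
        return p2;
    if (prev_R == NULL)                     // no history yet
        return err1 <= err2 ? p1 : p2;
    return rotation_angle(prev_R, p1.R) <= rotation_angle(prev_R, p2.R)
               ? p1 : p2;
}
```

This trades a possible persistent wrong lock-on for frame-to-frame stability, so it is a mitigation rather than a fix; the ambiguity itself only goes away with more resolution, closer tags, or non-coplanar points, as discussed above.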