QuantuMope / handeye-4dof

Handeye calibration for 4DOF manipulators using dual quaternions.
MIT License

Hand on base update #2

Closed · sim-lab-dev closed this 7 months ago

sim-lab-dev commented 2 years ago

Hi, thanks for the great work. Is there any update for hand on base?

QuantuMope commented 2 years ago

Unfortunately, I do not have the free time currently to implement hand on base. I may get to it sometime in the future, but for now it is highly unlikely.

Implementing this should be relatively straightforward though. Much of the code for eye on hand can be used as a starting point and the only real things that change are the provided transforms. If you decide to implement this, please feel free to submit a pull request.

QuantuMope commented 2 years ago

Simply changing the input transforms won't be enough. The calibration code itself will have to be modified to account for hand on base.

kinoies commented 2 years ago

Hello, thank you for the code for 4DOF hand-eye calibration. I am working on eye-to-hand calibration based on it. However, I have run into an accuracy problem. I recorded some eye-to-hand data and converted the format. Could you find time to look into the code, please?

QuantuMope commented 2 years ago

Hi @kinoies,

Just to make sure: when you say eye-to-hand, do you mean eye-on-base calibration? I ask since you commented on an issue about eye-on-base.

kinoies commented 2 years ago

Yes, it is eye-on-base, with a stationary camera.

I simply input my eye-to-hand transforms and rewrote the non-linear optimization function based on the current code. The transform order is base-to-hand and marker-to-camera. I want to solve for the camera-to-base transformation. Is this approach suitable for eye-on-base? Could we complete this work together?

QuantuMope commented 1 year ago

Hi @kinoies,

I wouldn't even worry about the nonlinear optimization for now. Also, it's been a long time since I've looked at the code, but if I remember correctly, eye-on-base calibration cannot be done by just replacing the input transforms and using the same code (otherwise I would most likely have implemented it already). Some other major updates are required to get it working. You could take a stab at it and submit a pull request, but first I'd highly recommend reading the original paper to get a better idea of what to do.

zaksel commented 12 months ago

Hi @QuantuMope thanks for this implementation on this complex subject.

I have read the paper and think I've got a basic understanding. In parallel I studied the code in calibrator.py and followed it along. To me it seems like there aren't any major changes needed for stationary cameras (eye-on-base). The derivation in the paper (which states it assumes the stationary case) seems pretty similar to your implementation (which you say is only suited for eye-on-hand). I know it has been a while, but maybe you can explain the differences further?

I have tried to simply change the inputs and skip the nonlinear optimization, but the calibration is always a bit off.

I would really appreciate any help, because I tried to make some changes but am stuck and lost :) Kind regards

QuantuMope commented 12 months ago

Hi @zaksel, it's been quite a while since I've looked at this code, but from glancing at the paper again, I think you are correct. My code should (hopefully haha) work by just changing the transform inputs.

Also, @kinoies, this may answer your question as well. Sorry about the very late reply.

Currently for eye-in-hand, the TF chain looks like this:

         base_H_marker = base_H_tool * tool_H_cam * cam_H_marker
 unknown (don't care)      known      unknown (want)      known

The program expects a list of N base_H_tool and cam_H_marker transforms.

Now if we wanted to solve for eye-on-base, the TF chain should look like this (this is what is listed in the paper):

         tool_H_marker = tool_H_base * base_H_cam * cam_H_marker
 unknown (don't care)      known      unknown (want)      known

So then, to obtain base_H_cam, you would need to input a list of N tool_H_base and cam_H_marker transforms. Note that this is pretty much the same as the eye-in-hand inputs, except that we need to invert one of the transforms.
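In code, that would look roughly like the sketch below (a hypothetical helper, not part of the repo; it assumes the transforms are stored as 4x4 numpy arrays and reuses the same API as the eye-in-hand example, and the later comments in this thread describe the corrections that were ultimately needed):

import numpy as np
from handeye_4dof import Calibrator4DOF, robot_pose_selector

def calibrate_eye_on_base_sketch(base_to_tool, cam_to_marker):
    """Rough sketch: run the existing pipeline with base_H_tool inverted to tool_H_base.

    base_to_tool:  list of N 4x4 base_H_tool transforms
    cam_to_marker: list of N 4x4 cam_H_marker transforms recorded at the same poses
    """
    tool_to_base = [np.linalg.inv(T) for T in base_to_tool]   # the inverted transform mentioned above
    motions = robot_pose_selector(tool_to_base, cam_to_marker)
    cb = Calibrator4DOF(motions)
    dq_x = cb.calibrate(antiparallel_screw_axes=True)         # setup-dependent flag from the eye-in-hand example
    return dq_x.as_transform()                                 # candidate base_H_cam; see the caveats below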

Hope this helps. Let me know if this works or if this is something you've already tried.

zaksel commented 12 months ago

Hi, thank you very much for your fast and dedicated response!

That's exactly what I thought should work. So I tried to feed the inverse of base_to_hand (base_H_tool⁻¹ = tool_H_base) into the robot_pose_selector. The resulting transformation base_H_cam looks okay but is off by up to 2 cm in reality. So I'm not entirely sure whether it is an issue with my pose pairs or with the calculation in this case.

I have to check again on Monday and will keep you posted.

zaksel commented 11 months ago

Hi, it's me again...

I have run a few experiments and had to change a few things:

  • Like above, we take the inverse of the base_to_hand transformation
  • I switched the inputs to the pose_selector (this seems to be important, but I cannot really tell why)
  • I do not invert the resulting transformation
  • I skip nonlinear refinement

As part of my problem I noticed we may have a very different understanding of naming conventions: when you are inputting camera_to_marker for cam_H_marker, I would instead call that variable marker_to_camera, because it is the transformation that converts a point from marker coordinates (also the origin [0,0,0]) into the camera frame.
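A tiny example of what I mean by that convention (the numbers here are made up):

import numpy as np

# Hypothetical detection: the marker sits 0.5 m in front of the camera.
marker_to_camera = np.eye(4)
marker_to_camera[:3, 3] = [0.0, 0.0, 0.5]

p_marker = np.array([0.0, 0.0, 0.0, 1.0])   # marker origin in marker coordinates (homogeneous)
p_camera = marker_to_camera @ p_marker       # the same point expressed in the camera frame -> [0, 0, 0.5, 1]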

I will attach my pose pairs here for anyone who's interested: base2gripper.txt target2cam.txt and also some images of the marker poses (Pose 1, Pose 10 and Pose 20) as an example:

This results in the following transformation:

[[ -0.023620296269655228,  0.9987973570823669,  -0.04296357184648514, -0.2916420102119446  ],
 [  0.9990708231925964,    0.02203344739973545, -0.03704070299863815,  0.16479189693927765 ],
 [ -0.03604951873421669,  -0.04379856213927269, -0.9983897805213928,   1.049223780632019   ],
 [  0,                     0,                    0,                    1                   ]]

@QuantuMope It is a great implementation of really hard math. I really appreciate it 👍

Edited: I renamed the file cam2target.txt to target2cam.txt because it contains transformations from marker to camera so the name was misleading.

QuantuMope commented 11 months ago

Awesome, thank you @zaksel

From your rough estimates, does the resultant transform seem reasonable to you?

The changes you had to make are interesting... I will try to investigate this issue soon so I can provide clear instructions in the README.

Thanks again!

zaksel commented 11 months ago

The transform seems reasonable, and today we had the chance to verify our calibration. It still has errors in the range of ɛ < 2 mm, but it is a starting point. So maybe I should try to get the nonlinear refinement working; I tried to simply enable it with flipped inputs and the error shot up.

Do you have any other ideas on how to minimize errors?

AliBuildsAI commented 7 months ago

> Hi, it's me again...
>
> I have run a few experiments and had to change a few things:
>
>   • Like above, we take the inverse of the base_to_hand transformation
>   • I switched the inputs to the pose_selector (this seems to be important, but I cannot really tell why)
>   • I do not invert the resulting transformation
>   • I skip nonlinear refinement
>
> As part of my problem I noticed we may have a very different understanding of naming conventions: when you are inputting camera_to_marker for cam_H_marker, I would instead call that variable marker_to_camera, because it is the transformation that converts a point from marker coordinates (also the origin [0,0,0]) into the camera frame.
>
> I will attach my pose pairs here for anyone who's interested: base2gripper.txt cam2target.txt and also some images of the marker poses (Pose 1, Pose 10 and Pose 20) as an example. This results in the following transformation:
>
> [[ -0.023620296269655228,  0.9987973570823669,  -0.04296357184648514, -0.2916420102119446  ],
>  [  0.9990708231925964,    0.02203344739973545, -0.03704070299863815,  0.16479189693927765 ],
>  [ -0.03604951873421669,  -0.04379856213927269, -0.9983897805213928,   1.049223780632019   ],
>  [  0,                     0,                    0,                    1                   ]]
>
> @QuantuMope It is a great implementation of really hard math. I really appreciate it 👍

Hi,

Could you please share your code? I tried to run your txt files with the following code, which I wrote based on your explanations, and my final result is different from yours.

Here is the code:

import pickle
import numpy as np
from scipy.spatial.transform import Rotation as R
from handeye_4dof import Calibrator4DOF, robot_pose_selector

np.set_printoptions(suppress=True)

def read_internet_data(file_path):
    # Read the provided file and convert each 4x4 matrix to a numpy array.
    matrices = []  # List to store each 4x4 matrix

    with open(file_path, 'r') as file:
        # Read file line by line
        lines = file.readlines()

        # Temporary list to hold the current 4x4 matrix
        current_matrix = []

        for line in lines:
            if line.strip().startswith('#') or not line.strip():
                # If we encounter a comment or empty line, and current_matrix has data, save the matrix
                if current_matrix:
                    matrices.append(np.array(current_matrix))
                    current_matrix = []  # Reset for the next matrix
            else:
                # Convert the current line to a list of floats and append to the current matrix
                row = list(map(float, line.split(',')))
                current_matrix.append(row)

        # Add the last matrix if the file doesn't end with a comment
        if current_matrix:
            matrices.append(np.array(current_matrix))

    return matrices

def main():

    base_to_hand = read_internet_data('/home/ali/Downloads/base2gripper.txt')
    #base_to_hand = [np.linalg.pinv(b2h) for b2h in base_to_hand]
    camera_to_marker = read_internet_data('/home/ali/Downloads/cam2target.txt')
    # Obtain optimal motions as dual quaternions.
    motions = robot_pose_selector(base_to_hand, camera_to_marker)

    # Initialize calibrator with precomputed motions.
    cb = Calibrator4DOF(motions)

    # Our camera and end effector z-axes are antiparallel so we apply a 180deg x-axis rotation.
    dq_x = cb.calibrate(antiparallel_screw_axes=True)

    # Hand to Camera TF obtained from handeye calibration.
    #ca_hand_to_camera = np.linalg.inv(dq_x.as_transform())
    ca_hand_to_camera = dq_x.as_transform()

    ca_rotation = np.rad2deg(R.from_matrix(ca_hand_to_camera[:3, :3]).as_euler('xyz'))

    np.set_printoptions(precision=5)
    print("Hand to Camera Transform Comparisons")
    print("Translations: Calibration  {}".format(ca_hand_to_camera[:3, -1]))
    print("Rotations:    Calibration  {}".format(ca_rotation))
    print(ca_hand_to_camera)

if __name__ == '__main__':
    main()

And here is the final printed pose:

Rotations:    Calibration  [179.8117    0.26353  69.99275]
[[ 0.34214  0.93965  0.00151 -0.63202]
 [ 0.93964 -0.34212 -0.00545  0.16605]
 [-0.0046   0.00329 -0.99998  0.37867]
 [ 0.       0.       0.       1.     ]]

Note: I have checked that the result of read_internet_data() has a length of 31 and that it reads the poses correctly.

zaksel commented 7 months ago

Oh, you are right. I made a mistake in the files I uploaded. cam2target.txt does not contain transformations from the camera to the marker but vice versa. So it should be named target2cam.txt (I did rename the file in my comment above), and you have to invert those transformations before feeding them to the Calibrator4DOF.

In my case I also had to increase the sv_limit to 0.6: cb = Calibrator4DOF(motions, sv_limit=0.6)

And the translation in the z-direction is calculated from depth values and overwritten, so it is different from the result you will get.

AliBuildsAI commented 7 months ago

Thank you @zaksel. After adding this line: camera_to_marker = [np.linalg.pinv(t2c) for t2c in camera_to_marker] I get the following result, which is aligned with yours except for T_z. (I also had to increase sv_limit.)

Rotations:    Calibration  [-177.48809    2.06593   91.35435]
[[-0.02362  0.9988  -0.04296 -0.29164]
 [ 0.99907  0.02203 -0.03704  0.16479]
 [-0.03605 -0.0438  -0.99839 -1.11151]
 [ 0.       0.       0.       1.     ]]
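For reference, the main() above with both changes applied looks roughly like this (a sketch only; read_internet_data and the file paths are from my script above, the sv_limit value is the one @zaksel suggested, and np.linalg.inv is interchangeable with pinv for these invertible 4x4 transforms):

def main():
    base_to_hand = read_internet_data('/home/ali/Downloads/base2gripper.txt')   # used as-is, not inverted
    marker_to_cam = read_internet_data('/home/ali/Downloads/cam2target.txt')    # file actually holds marker->camera
    camera_to_marker = [np.linalg.inv(t2c) for t2c in marker_to_cam]            # change 1: invert the detections

    motions = robot_pose_selector(base_to_hand, camera_to_marker)
    cb = Calibrator4DOF(motions, sv_limit=0.6)                                   # change 2: raise sv_limit
    dq_x = cb.calibrate(antiparallel_screw_axes=True)

    base_to_cam = dq_x.as_transform()   # result used directly, not inverted
    print(base_to_cam)                  # T_z still needs the depth-based correction discussed below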

Could you please elaborate on what you did for T_z? I am also using an RGBD camera, so I can try to replicate it for myself.

zaksel commented 7 months ago

@AliBuildsAI if you are using an RGBD camera, or if you get the distance between the camera and the markers from your detection (e.g. cv2.aruco.estimatePoseCharucoBoard), you can save that information and calculate the distance between the robot base frame and the camera frame with h = marker_t[2] - TCP_LENGTH - np.abs(robot_t[2]). I do this for every pose, average those values, and then override T_z with the result. TCP_LENGTH is the fixed offset between the robot flange and the calibration plate. Whether you have this offset depends on your setup; simply measure it.

Hope that was understandable ;)

This is working for me, BUT the authors propose a different approach in the paper; see Section III.E, Determination of the Real tz.
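A rough sketch of that depth-based override (the variable names are mine: marker_t and robot_t are the translation parts of each camera detection and robot pose, and base_to_cam is the calibration result whose z translation gets replaced):

import numpy as np

def override_tz(base_to_cam, base_to_hand, camera_to_marker, tcp_length):
    """Replace the calibrated z translation with the averaged depth-based estimate described above."""
    heights = []
    for cam2marker, base2hand in zip(camera_to_marker, base_to_hand):
        marker_t = cam2marker[:3, 3]   # marker position in the camera frame (z is the depth)
        robot_t = base2hand[:3, 3]     # tool position in the robot base frame
        heights.append(marker_t[2] - tcp_length - np.abs(robot_t[2]))

    corrected = base_to_cam.copy()
    corrected[2, 3] = np.mean(heights)  # sign may need flipping depending on how your frames are oriented
    return corrected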

QuantuMope commented 7 months ago

My apologies for letting this issue go dead again back in November @zaksel. I was finishing up my dissertation that month so it was quite a hectic time for me.

Thank you to both @AliBuildsAI and @zaksel for this discussion ^_^.

The fact that you both are getting a reasonable calibration result means that the framework in its current state is suitable for eye-on-base calibration, but the current need for switching the inputs to the pose selector made me a bit wary. So I decided to get to the bottom of why that is and this is what I found.

@zaksel is correct that the naming convention in my explanation above was incorrect. That explanation became a bit unclear because I had forgotten the notation that the original paper uses. In that paper, they denote a transformation b_H_a as one that converts a point from coordinate system a to coordinate system b. When I wrote the code, I personally disliked this convention, so I wrote my transforms in the opposite order (i.e., a_T_b). Maybe not the greatest idea, considering it led to this confusion 😅.

Now, using the left-to-right notation instead, I will attempt to explain what the current calibrator example is doing, as well as what you both have done for eye-on-base.

Currently, for eye-in-hand, the code requires a set of base_T_hand and camera_T_marker transforms. This allows us to compute the TF chain below:

camera_T_marker * hand_T_camera * base_T_hand = base_T_marker
   (known)       (unknown, want)     (known)   (unknown, don't care)

and through this, we obtain the desired TF hand_T_camera.

Now, we can go back to eye-on-base. For this, using the left-to-right notation, the TF chain is shown below:

base_T_hand * camera_T_base * marker_T_camera = marker_T_hand
   (known)   (unknown, want)     (known)   (unknown, don't care)

and through this, we obtain the desired TF camera_T_base.

This now explains 1) why switching the poses is necessary, and 2) why we need marker_T_camera rather than camera_T_marker.

Anyways, let me know if this checks out with the both of you. I appreciate both of your efforts in getting to the bottom of this. I incorrectly assumed that more work would be needed in achieving eye-on-base functionality, but fortunately it seems that functionality can be achieved with a simple change in inputs! 🥳🥳🥳

@zaksel if you are willing, a PR would be greatly appreciated to explain eye-on-base functionality in the README.md, as well as perhaps to refactor variable names throughout the code that are specific to eye-in-hand to something more general. It would be a way for your contributions to also be reflected in this repo. If you're too busy though, I can handle this when I have some free time, just lmk ^_^