lardemua / atom

Calibration tools for multi-sensor, multi-modal robotic systems
GNU General Public License v3.0

Implement OpenCV's Hand-Eye Calibration for Comparison #912

Open Kazadhum opened 3 months ago

Kazadhum commented 3 months ago

Hello! I'm trying to "ATOM-ize" OpenCV's Hand-Eye calibration so we can compare our results with theirs. This issue is related to #891.

I've seen people implement OpenCV's calibrateHandEye method [1] [2], but I don't think it solves the same problem as our method and, as such, the results aren't comparable.

In this method, the problem is formulated as $AX = XB$, and the only transformation estimated is the transformation between the camera and the gripper.

I think what we want to use is the calibrateRobotWorldHandEye() method, where the $AX=ZB$ problem is addressed.

I'm currently trying to get from ATOM's dataset the necessary transformations to perform the calibrations (right now I'm going to try getting the transform from the camera to the calibration pattern).
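
For reference, a minimal sketch of how I think the call looks (argument and return names follow the OpenCV docs; the list variables are placeholders, with one entry per collection, and at least three collections are required):

import cv2

# Each input is a list of 3x3 rotation matrices / 3x1 translation vectors.
# "world" is the calibration pattern; the gripper poses come from forward kinematics.
R_base2world, t_base2world, R_gripper2cam, t_gripper2cam = \
    cv2.calibrateRobotWorldHandEye(
        R_world2cam_list, t_world2cam_list,
        R_base2gripper_list, t_base2gripper_list,
        method=cv2.CALIB_ROBOT_WORLD_HAND_EYE_SHAH)  # SHAH is the default method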

Kazadhum commented 3 months ago

Hi @miguelriemoliveira and @manuelgitgomes! I want to get the transformation from the camera to the pattern based on the detections in a given collection. Before I dive into how to do this, I wanted to ask: is this already done somewhere in ATOM?

Kazadhum commented 3 months ago

I think I found an example of it in the patterns.py module.

https://github.com/lardemua/atom/blob/e740af957dbb9dff580e08e1da1cacf3579a813b/atom_calibration/src/atom_calibration/collect/patterns.py#L74-L77

I think this is done by the solvePnP() function.
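
For my own reference, a rough sketch of how the camera-to-pattern transform can be obtained from a detection (objp, corners, K and D are placeholders for the pattern object points, the detected corners, the intrinsics and the distortion coefficients):

import cv2
import numpy as np

# Estimate the pattern pose for one detection: rvec/tvec map points expressed
# in the pattern frame into the camera frame.
retval, rvec, tvec = cv2.solvePnP(objp, corners, K, D)

# Assemble the 4x4 homogeneous matrix from the Rodrigues vector and translation.
R, _ = cv2.Rodrigues(rvec)
cam_T_pattern = np.eye(4)
cam_T_pattern[0:3, 0:3] = R
cam_T_pattern[0:3, 3] = tvec.flatten()

(atom_core.geometry seems to have traslationRodriguesToTransform for that last step.)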

Kazadhum commented 3 months ago

Hello @miguelriemoliveira and @manuelgitgomes ! I think I got some progress done, but not a lot.

I'm still only considering one collection instead of the whole dataset, to make testing easier. I can now calculate the transformation from the pattern to the camera here:

https://github.com/lardemua/atom/blob/61c4845c26178ba161c043e0b12a611cd4d51f48/atom_evaluation/scripts/other_calibrations/cv_handeye_calib.py#L70-L107

Admittedly, this code is a work in progress and the variable names aren't the best they could be. Some things are also hardcoded for now for testing reasons. I'm unsure how to check whether the values this calculates are correct. How should I test this?

Additionally, assuming the values in the matrix are correct, calling cv2.calibrateRobotWorldHandEye() raises an error:

Traceback (most recent call last):
  File "./cv_handeye_calib.py", line 209, in <module>
    main()
  File "./cv_handeye_calib.py", line 205, in main
    cvHandEyeCalibrate(objp=objp, dataset=dataset, camera=camera, pattern=pattern, number_of_corners=number_of_corners)
  File "./cv_handeye_calib.py", line 126, in cvHandEyeCalibrate
    cv2.calibrateRobotWorldHandEye(pattern_R_cam, pattern_t_cam, base_R_gripper, base_t_gripper)
cv2.error: OpenCV(4.6.0) /io/opencv/modules/core/src/matmul.dispatch.cpp:363: error: (-215:Assertion failed) a_size.width == len in function 'gemm'

To replicate, run the command:

clear && ./cv_handeye_calib.py -json ~/atom_datasets/riwmpbot_real/merged/dataset_train.json -c rgb_world -p hand_pattern

Maybe we should meet sometime next week, since I have some doubts about how to go about coding this. When are you available?

miguelriemoliveira commented 3 months ago

Hi @Kazadhum ,

I am looking into this. Using this opencv page as reference:

https://docs.opencv.org/4.x/d9/d0c/group__calib3d.html#ga41b1a8dd70eae371eba707d101729c36

So the transformations are:

cTw (world to camera) -> this is given by solvePnP with a pattern detector. This is the B in equation (1) of our IROS paper.

NOTE: in the IROS paper we use camera to world; I am not sure if this is just semantics or if the inverse transform is actually used. For example, we read cTw from left to right, as in world to camera, but in OpenCV it is read the other way around, i.e., camera to world.

gTb (base to gripper) -> this is given by the robot's direct kinematics. This is the A in our IROS paper.

So I guess what we need to do in a script is translate our ATOM datasets into these lists of matrices (vector<Mat>) and call the OpenCV function.
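
Something along these lines, I think (sketch only; the frame names and the direction conventions are exactly the part we need to double check, and pose_from_detection is a hypothetical helper standing in for the solvePnP step):

import numpy as np
from atom_core.atom import getTransform

# Build the per-collection input lists for OpenCV from an ATOM dataset.
R_world2cam, t_world2cam = [], []
R_base2gripper, t_base2gripper = [], []

for collection_key, collection in dataset['collections'].items():
    # B side: pattern ("world") points -> camera frame, from the detection + solvePnP
    cam_T_pattern = pose_from_detection(collection)  # hypothetical helper, returns a 4x4
    R_world2cam.append(cam_T_pattern[0:3, 0:3])
    t_world2cam.append(cam_T_pattern[0:3, 3].reshape(3, 1))

    # A side: base points -> gripper frame, from the transforms stored in the collection.
    # Assuming getTransform('base_link', 'flange', ...) maps flange points into the
    # base frame, OpenCV wants the opposite direction, hence the inverse.
    base_T_gripper = getTransform('base_link', 'flange', collection['transforms'])
    gripper_T_base = np.linalg.inv(base_T_gripper)
    R_base2gripper.append(gripper_T_base[0:3, 0:3])
    t_base2gripper.append(gripper_T_base[0:3, 3].reshape(3, 1))

# ... and then pass the four lists to cv2.calibrateRobotWorldHandEye().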

miguelriemoliveira commented 3 months ago

I am using this (may be helpful to replicate)

rosrun atom_evaluation cv_handeye_calib.py -json $ATOM_DATASETS/riwbot/train/dataset.json -c rgb_world -p pattern_1

This is using the riwbot atom example.

miguelriemoliveira commented 3 months ago

I confirm the error above...

miguelriemoliveira commented 3 months ago

So the error is because you are giving the function numpy arrays, whereas it needs lists of numpy arrays (at least 3 poses, according to the documentation).

I will try to refactor the function to create these lists ...

miguelriemoliveira commented 3 months ago

Fixed it. Now it does not crash. Not sure if it's returning correct results, though. @Kazadhum let me know if you need more help ...
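
One way to sanity-check the wiring, independently of the dataset, is a synthetic round trip: build noise-free poses from known ground-truth transforms and check that the function recovers them (a self-contained sketch):

import cv2
import numpy as np

rng = np.random.default_rng(0)

def random_pose():
    # Random rotation (from a Rodrigues vector) and translation, as a 4x4 matrix
    R, _ = cv2.Rodrigues(rng.uniform(-1.0, 1.0, size=(3, 1)))
    T = np.eye(4)
    T[0:3, 0:3] = R
    T[0:3, 3] = rng.uniform(-0.5, 0.5, size=3)
    return T

# Ground truth to recover: X maps base-frame points into the pattern ("world") frame,
# Z maps gripper-frame points into the camera frame.
X_world_from_base = random_pose()
Z_cam_from_gripper = random_pose()

R_world2cam, t_world2cam, R_base2gripper, t_base2gripper = [], [], [], []
for _ in range(5):
    B_gripper_from_base = random_pose()  # simulated forward kinematics
    # Chain consistency: cam_from_world * world_from_base = cam_from_gripper * gripper_from_base
    A_cam_from_world = Z_cam_from_gripper @ B_gripper_from_base @ np.linalg.inv(X_world_from_base)
    R_world2cam.append(A_cam_from_world[0:3, 0:3])
    t_world2cam.append(A_cam_from_world[0:3, 3].reshape(3, 1))
    R_base2gripper.append(B_gripper_from_base[0:3, 0:3])
    t_base2gripper.append(B_gripper_from_base[0:3, 3].reshape(3, 1))

R_bw, t_bw, R_gc, t_gc = cv2.calibrateRobotWorldHandEye(
    R_world2cam, t_world2cam, R_base2gripper, t_base2gripper)

# With perfect data both errors should be ~0
print(np.linalg.norm(R_bw - X_world_from_base[0:3, 0:3]),
      np.linalg.norm(t_bw.flatten() - X_world_from_base[0:3, 3]))
print(np.linalg.norm(R_gc - Z_cam_from_gripper[0:3, 0:3]),
      np.linalg.norm(t_gc.flatten() - Z_cam_from_gripper[0:3, 3]))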

Kazadhum commented 3 months ago

I'm picking this back up...

Kazadhum commented 3 months ago

Now that this is running with no errors, one thing I don't understand: we use 3 collections as input and get two transformations as output (base to world and gripper to camera).

The thing is, which collection does this output refer to? Base to World should be static, I think, but the Gripper to Camera transform should change across collections, no?

Maybe I'm missing something obvious...

miguelriemoliveira commented 3 months ago

Hi @Kazadhum ,

use riwbot as the example.

Here those transformations are both static, check https://github.com/lardemua/atom/tree/noetic-devel/atom_examples/riwbot#configuring-the-calibration

The camera does not move w.r.t. the gripper. It's also static.

miguelriemoliveira commented 3 months ago

If you want we can talk a bit later ...

Kazadhum commented 3 months ago

Hello @miguelriemoliveira! I understand now, but maybe we should still talk... What time is good for you?

miguelriemoliveira commented 3 months ago

18h?

Kazadhum commented 3 months ago

Sounds good!

Kazadhum commented 3 months ago

Now adding a function to save the calibrated dataset, based on what was done for the ICP Calibration:

https://github.com/lardemua/atom/blob/73066893937065e2d79832197b18366924c45049/atom_evaluation/scripts/other_calibrations/icp_calibration.py#L83-L99
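
Roughly what I have in mind (a sketch; the 'trans'/'quat' field names, the quaternion ordering and the saveAtomDataset signature are assumptions on my part, the ICP script above being the actual reference):

from tf import transformations
from atom_core.dataset_io import saveAtomDataset

# calibrated_T: estimated 4x4 transform for the calibrated parent -> child pair.
quat = transformations.quaternion_from_matrix(calibrated_T)  # assumed (x, y, z, w) order
trans = list(calibrated_T[0:3, 3])

# Write the calibrated transform into every collection of the dataset...
transform_key = calibration_parent + '-' + calibration_child  # e.g. 'flange-rgb_hand_link'
for collection_key, collection in dataset['collections'].items():
    collection['transforms'][transform_key]['quat'] = list(quat)
    collection['transforms'][transform_key]['trans'] = trans

# ... and save it (assumed signature: output file name, then the dataset dict).
saveAtomDataset('hand_eye_calibrated_dataset.json', dataset)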

Kazadhum commented 2 months ago

I created a new dataset for rihbot, which I named train_diogo. I had to do this for testing because the original dataset for rihbot only had one collection with a complete detection (a minimum of 3 are needed). This new one has 12 collections and all of them have every corner detected.

@miguelriemoliveira I can't sync the rihbot dataset folder with your NAS right now since I'm at the lab, but once I'm home I'll sync it.

miguelriemoliveira commented 2 months ago

> @miguelriemoliveira I can't sync the rihbot dataset folder with your NAS right now since I'm at the lab, but once I'm home I'll sync it.

Why not?

Kazadhum commented 2 months ago

I think it's something to do with eduroam but I'm not sure

miguelriemoliveira commented 2 months ago

Hum ... @manuelgitgomes do you have the same problem?

manuelgitgomes commented 2 months ago

Yes I do @miguelriemoliveira

miguelriemoliveira commented 2 months ago

Damn. STIC doing their usual policing ... not sure how to work around this ...

Kazadhum commented 2 months ago

This implementation is almost done, but I ran into a problem and need to do some debugging (I think I'm calculating the TFs wrong).

The coordinates of the points obtained through calibration, when projected onto the sensor image, fall outside the image. So the TFs seem to be badly estimated. I'll do some more thorough testing after lunch.
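
For reference, this is the kind of check I mean (a sketch; objp, cam_T_pattern, K, D and the image size are placeholders):

import cv2
import numpy as np

# Reproject the pattern corners with the estimated pose and count how many land
# inside the image. cam_T_pattern maps pattern-frame points into the camera frame.
rvec, _ = cv2.Rodrigues(cam_T_pattern[0:3, 0:3])
tvec = cam_T_pattern[0:3, 3].reshape(3, 1)
projected, _ = cv2.projectPoints(objp, rvec, tvec, K, D)
projected = projected.reshape(-1, 2)

w, h = 1280, 720  # example image size
inside = (projected[:, 0] >= 0) & (projected[:, 0] < w) & \
         (projected[:, 1] >= 0) & (projected[:, 1] < h)
print(f'{inside.sum()} of {len(projected)} corners project inside the image')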

Kazadhum commented 2 months ago

Starting tests now...

Currently checking if the issue is in the OpenCV calibration by comparing the resulting TFs to ATOM's... (I doubt this is where the error is, but it's a start)

Kazadhum commented 2 months ago

Ok so the error is definitely here...

So what I did was write a quick script to get the aggregate transformations from ATOM's calibrated dataset, to compare with the transformations calibrated by OpenCV's method. Here's the script:

import json

from atom_core.atom import getTransform


def main():
    # Read the ATOM-calibrated dataset
    json_file = "/home/diogo/atom_datasets/rihbot/train_diogo/atom_calibration.json"
    with open(json_file, 'r') as f:
        dataset = json.load(f)

    # Compose the aggregate base_link -> pattern_link transform for collection 000
    for collection_key, collection in dataset['collections'].items():
        if collection_key == '000':
            print('Collection 000')
            tfs = collection['transforms']
            calib_tf_base2pattern = getTransform('base_link', 'pattern_link', tfs)
            print(calib_tf_base2pattern)


if __name__ == '__main__':
    main()

BTW, I'm using the rihbot system for tests, specifically, the train_diogo dataset.

Now for the comparison...

From the OpenCV calibration procedure, we get the transformation from the base_link to the pattern:

[[-4.18732837e-03 -8.47085912e-01  5.31439483e-01 -1.12597089e+02]
 [-3.52986306e-02  5.31238153e-01  8.46486876e-01  1.50165419e+02]
 [-9.99368037e-01 -1.52145675e-02 -3.21254413e-02  2.94955834e+02]
 [ 0.00000000e+00  0.00000000e+00  0.00000000e+00  1.00000000e+00]]

From the ATOM calibration...

[[ 0.02278512 -0.99953713 -0.02015854  0.29099556]
 [-0.99972229 -0.02290139  0.00555597 -0.17599925]
 [-0.00601506  0.02002634 -0.99978136  0.01296016]
 [ 0.          0.          0.          1.        ]]

I'll try to figure out what's wrong... Probably an error in the implementation of the OpenCV calibration?

Kazadhum commented 2 months ago

Found my first problem! Fixing it doesn't solve everything, but cv2.solvePnP() should receive the distorted points, since it handles the distortion on its own (via the distortion coefficients). No prior image undistortion is needed.
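
In other words, the two consistent options are (sketch, placeholder names):

import cv2

# Option A: raw (distorted) detections + the distortion coefficients; solvePnP
# handles the distortion itself.
retval, rvec, tvec = cv2.solvePnP(objp, corners, K, D)

# Option B: undistort the detections first and then pass zero distortion.
corners_undist = cv2.undistortPoints(corners, K, D, P=K)
retval, rvec, tvec = cv2.solvePnP(objp, corners_undist, K, None)

# Mixing the two (undistorted points together with a non-zero D) applies the
# distortion model twice and skews the estimated pose.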

Kazadhum commented 2 months ago

Just checked and the same issue happens with the transformation from the gripper to the camera's optical frame

Kazadhum commented 2 months ago

Ok, so it seems like solvePnP() is working properly. I compared the TFs that solvePnP() outputs with the TFs from the post-ATOM-calibration dataset, and they look right...

For collection 0:

From solvePnP() (optical frame to pattern):

[[ 0.98812305 -0.15354179 -0.00614442 -0.16397377]
 [ 0.15076996  0.97645823 -0.15426519 -0.18974208]
 [ 0.02968592  0.1515066   0.98801037  1.00460372]
 [ 0.          0.          0.          1.        ]]

From ATOM's calibration (optical frame to pattern):

[[ 0.98812612 -0.15350927 -0.00645574 -0.21405645]
 [ 0.15068653  0.97644963 -0.15440109 -0.25734529]
 [ 0.03000571  0.15159495  0.98798716  0.99341401]
 [ 0.          0.          0.          1.        ]]

So the difference seems rather small, which I take to mean this bit is working correctly.

Kazadhum commented 2 months ago

I've traced the problem to the cv2.calibrateRobotWorldHandEye() call.

miguelriemoliveira commented 2 months ago

Hi @Kazadhum, can you confirm how you run the hand-eye script?

miguelriemoliveira commented 2 months ago

I suppose this is it:

rosrun atom_evaluation cv_handeye_calib.py -json $ATOM_DATASETS/rihbot/train_diogo/dataset.json -p pattern_1 -c rgb_hand

you confirm?

Kazadhum commented 2 months ago

Hello @miguelriemoliveira! I just got a new laptop and I'm still setting it up so I can't confirm it, but I believe I didn't use rosrun and just ran it as a script. It should work the same though. I think the arguments are all correct. Just to check, I think I did:

./cv_handeye_calib.py -c rgb_hand -p pattern_1 -json $ATOM_DATASETS/rihbot/train_diogo/dataset.json

miguelriemoliveira commented 2 months ago

Better to run it my way, because then you don't have to be in the script's directory.

miguelriemoliveira commented 1 month ago

Hi @Kazadhum ,

Finally I was able to make this work. It was a challenge, but now it's working. Some details:

  1. It runs with all five OpenCV-supported methods, i.e. 'tsai', 'park', 'horaud', 'andreff', 'daniilidis', which means we have five alternative methods :-). Check the --method_name flag.

  2. It converts the estimated hand_T_camera transformation to the one we are actually searching for as per the ATOM configuration, i.e. calibration_parent_T_calibration_child (see the sketch after this list). Tested and seems to be working.

  3. It compares against ground truth, if the -ctgt flag is used.


  4. It saves a JSON dataset with the calibration, e.g. hand_eye_daniilidis_rgb_hand.json.
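
The idea of the conversion in point 2, roughly (a sketch; the optical-frame name and the getTransform direction are assumptions, and in rihbot the calibration parent coincides with the hand link, so only the camera side needs the extra composition):

from atom_core.atom import getTransform

# hand_T_camera: the estimate, mapping camera-optical-frame points into the flange frame.
# The fixed optical frame -> camera link transform is read from the dataset, so that the
# result is expressed for the flange -> rgb_hand_link pair of the atom configuration.
camera_T_child = getTransform('rgb_hand_optical_frame', 'rgb_hand_link',
                              collection['transforms'])
parent_T_child = hand_T_camera @ camera_T_child  # flange -> rgb_hand_link
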
miguelriemoliveira commented 1 month ago

Here's how to use

clear && rosrun atom_evaluation cv_eye_in_hand.py -json $ATOM_DATASETS/rihbot/train_test_opencv/dataset.json -p pattern_1 -c rgb_hand -hl flange -bl base_link -mn daniilidis -ctgt
miguelriemoliveira commented 1 month ago

@Kazadhum , I am done with this. Take a look and see if you agree.

Things for you:

  1. Close this issue if you confirm all is working fine.
  2. Delete your old code in cv_handeye_calib.py.
  3. Create an eye-to-base script using data from riwbot to test (probably in another issue).
Kazadhum commented 1 month ago

Ok, thank you @miguelriemoliveira! I will take a look at the code on the train to Aveiro and give you feedback.

On Monday I'll create the eye-to-base script and keep working on #939

Kazadhum commented 1 month ago

Hi @miguelriemoliveira! I took a look at the script and everything seems alright to me (much cleaner than what I had written). I also tested it and it worked just fine. So I think I can close this issue.

I already found some things that I'm going to have to change in the eye-to-hand script, so we're definitely going to have two scripts, like we discussed. Thank you for your help! I'm now going to start working on the eye-to-hand case, based on this one (in #943). I'm also going to delete my old script.

Kazadhum commented 1 month ago

Reopening to fix the issue found in #943...

In the eye-in-hand case, the estimated h_T_c is almost the same as the ground truth, but the result of the comparison seems exaggerated.

Running: rosrun atom_evaluation cv_eye_in_hand.py -c rgb_hand -p pattern_1 -hl flange -bl base_link -json $ATOM_DATASETS/rihbot/train_test_opencv/dataset.json -ctgt

returns:

Ground Truth h_T_c=
[[ 0.     0.     1.    -0.02 ]
 [-1.    -0.     0.     0.   ]
 [ 0.    -1.     0.     0.065]
 [ 0.     0.     0.     1.   ]]
estimated h_T_c=
[[ 0.    -0.     1.    -0.021]
 [-1.    -0.001  0.    -0.001]
 [ 0.001 -1.    -0.     0.065]
 [ 0.     0.     0.     1.   ]]
Etrans = 0.182 (m)
Erot = 0.021 (deg)
+----------------------+-------------+---------+----------+-------------+------------+
|      Transform       | Description | Et0 [m] |  Et [m]  | Rrot0 [rad] | Erot [rad] |
+----------------------+-------------+---------+----------+-------------+------------+
| flange-rgb_hand_link |   rgb_hand  |   0.0   | 0.000762 |     0.0     |  0.000365  |
+----------------------+-------------+---------+----------+-------------+------------+

The Etrans looks like it's a lot higher than it should be, judging from the matrices...

Kazadhum commented 1 month ago

In:

https://github.com/lardemua/atom/blob/144c6f0fba5ddaec81668ed79bb3e2a157a84e73/atom_evaluation/scripts/other_calibrations/cv_eye_in_hand.py#L341

isn't this value supposed to be in mm, since it comes from the comparison between TFs and is then multiplied by 1000?

If so, then these errors make sense

Kazadhum commented 1 month ago

Hi @miguelriemoliveira! If you agree with this change, this can be closed again, because the eye-in-hand case is working okay I think.

miguelriemoliveira commented 1 month ago

So the problem from yesterday is not visible in the eye-in-hand case ... that's good news, I think.

Kazadhum commented 1 week ago

Hello @miguelriemoliveira and @manuelgitgomes!

In my last e-mail, I said that implementing noise in the OpenCV calibration scripts was one of my next tasks. Today I tested it and found that the results with and without noise were exactly the same. This also happened for larger noise values.

Having noted this, I looked at the code and concluded that adding noise the way it is done in ATOM doesn't make sense for the OpenCV methods.

Firstly, ATOM's optimization is an iterative method, whereas OpenCV doesn't perform an optimization. The noise we add to a dataset to test ATOM's calibration is added to the initial guess. Considering this difference, I don't think there is a good way to introduce noise into the OpenCV calibration that allows a fair comparison with ATOM.

Secondly, there's how OpenCV's method computes the $A$ and $B$ matrices. One of these is calculated through the solvePnP() function, so the noise doesn't change its value:

https://github.com/lardemua/atom/blob/d95480fda0fde72ac2a1a923596bb01dfec9f31c/atom_evaluation/scripts/other_calibrations/cv_eye_to_hand.py#L232-L246

The other is calculated by forward kinematics. However, for the eye-to-hand case, the transformation chain from the hand link to the base link isn't affected by the noise:

https://github.com/lardemua/atom/blob/d95480fda0fde72ac2a1a923596bb01dfec9f31c/atom_evaluation/scripts/other_calibrations/cv_eye_to_hand.py#L282-L289

With this in mind, I don't really think it makes sense to implement this. Do you agree?