hsp-iit / project-ergocub

A demonstration of human-robot interaction on the ergoCub through vision.

Design the gaze control task #68

Closed xEnVrE closed 1 year ago

xEnVrE commented 1 year ago

We need to design a proper gaze control task.

The main requirement is to define what $dx$ should be in

$$ J \dot{q} = dx $$

First proposal

We assume to have the instantaneous z axis of the RealSense, $\hat{l}(q)$, and the desired pointing direction

$$ d(q) = p^d - o(q) $$

where $p^d$ is the desired gaze point and $o(q)$ is the instantaneous origin of the RealSense camera frame.

In this case, the displacement vector might be defined as:

$$ dx = k \hat{l}(q) \times d(q) $$

Moreover, we should use only the three bottommost rows of the Jacobian, i.e. those associated with the angular velocity part of the end-effector twist.

We are not sure whether this works: when the two directions are aligned, i.e. where the error is zero, the direction of the cross product is not well defined.
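For reference, a minimal Eigen sketch of this first proposal might look like the following (names are illustrative, not the actual API; the small damping term in the solve is an extra assumption to keep the inverse well behaved near alignment):

```cpp
#include <Eigen/Dense>

// J   : 6 x n geometric Jacobian of the camera frame (linear rows on top, angular rows on the bottom)
// l   : instantaneous z axis of the RealSense, expressed in the robot frame
// p_d : desired gaze point, o : origin of the camera frame, both in the robot frame
Eigen::VectorXd gazeVelocity(const Eigen::MatrixXd& J,
                             const Eigen::Vector3d& l,
                             const Eigen::Vector3d& p_d,
                             const Eigen::Vector3d& o,
                             double k)
{
    const Eigen::Vector3d d = p_d - o;                 // desired pointing direction
    const Eigen::Vector3d dx = k * l.cross(d);         // angular velocity command
    const Eigen::MatrixXd Jw = J.bottomRows(3);        // angular part of the Jacobian
    // damped least-squares inverse (the damping is an assumption, not part of the proposal)
    const Eigen::Matrix3d A = Jw * Jw.transpose() + 1e-6 * Eigen::Matrix3d::Identity();
    return Jw.transpose() * A.ldlt().solve(dx);        // joint velocities
}
```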

Second proposal

Note: this requires more work in terms of implementation.

We could extend the kinematic chain with an additional fake prismatic joint to be superimposed on the gaze point $p^{d}$.

In this case, given the translational part of the Jacobian of the chain, i.e. the three topmost rows $J(q)^{p}$, we should append an additional column containing the instantaneous z axis of the RealSense, $\hat{l}(q)$.

In that case, the displacement vector reduces to

$$ dx = k (p^d - p(q)) $$

where $p(q)$ is the instantaneous position of the end-effector taking into account the prismatic joint.

To implement this solution we would need to "fake" the motion of the prismatic joint in software using an "integrator".

We should understand if it is possible to add a "fake" link to the model using iDynTree APIs so that the evaluation of the augmented Jacobian and of the forward kinematics come for free.
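A rough Eigen sketch of this second proposal, assuming the prismatic joint is faked in software (all names are illustrative):

```cpp
#include <Eigen/Dense>

struct GazeStep
{
    Eigen::VectorXd dq;   // joint velocities of the real chain
    double dlambda;       // velocity of the fake prismatic joint
};

// Jp  : 3 x n positional (topmost) rows of the camera Jacobian
// l   : instantaneous z axis of the RealSense (unit vector)
// o   : origin of the camera frame, p_d : desired gaze point, all in the robot frame
// lambda : current extension of the fake prismatic joint
GazeStep prismaticGazeStep(const Eigen::MatrixXd& Jp, const Eigen::Vector3d& l,
                           const Eigen::Vector3d& o, const Eigen::Vector3d& p_d,
                           double lambda, double k)
{
    const Eigen::Index n = Jp.cols();
    Eigen::MatrixXd Ja(3, n + 1);
    Ja << Jp, l;                                      // extra column is the line of sight
    const Eigen::Vector3d p = o + lambda * l;         // "end effector" including the fake joint
    const Eigen::Vector3d dx = k * (p_d - p);
    const Eigen::Matrix3d A = Ja * Ja.transpose() + 1e-6 * Eigen::Matrix3d::Identity();
    const Eigen::VectorXd sol = Ja.transpose() * A.ldlt().solve(dx);
    return {sol.head(n), sol(n)};                     // dlambda is integrated by the caller
}
```

The caller would then integrate the fake joint itself at every control cycle, e.g. `lambda += dt * step.dlambda;`, which is the "integrator" mentioned above.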

Woolfrey commented 1 year ago

Here is the stability proof. $\mathbf{l}\in\mathbb{R}^3$ is a vector denoting the line of sight, and $\mathbf{p}\in\mathbb{R}^3$ is a point we want the robot to look at:

[image]

Define the error as $$\boldsymbol{\varepsilon} = \mathbf{l\times p}$$ Then the time derivative is:

\begin{align}
\dot{\boldsymbol{\varepsilon}} &= \mathbf{\dot{l}\times p} \\ 
&= \left(\boldsymbol{\omega}\times\mathbf{l}\right)\times\mathbf{p}
\end{align}

Define the angular velocity for the head to be: $$\boldsymbol{\omega} \triangleq k\cdot\boldsymbol{\varepsilon}$$ for some value $k\in\mathbb{R}^+$. Then the time derivative of the error becomes:

\begin{align}
\dot{\boldsymbol{\varepsilon}} &= k\cdot\left(\boldsymbol{\varepsilon}\times\mathbf{l}\right)\times\mathbf{p} \\
&= -k\cdot\boldsymbol{\varepsilon}
\end{align}

Therefore: $$\boldsymbol{\varepsilon}(t) = e^{-kt}\boldsymbol{\varepsilon}_0$$ and is asymptotically stable.
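For completeness, the intermediate step can be expanded with the vector triple product; assuming $\mathbf{l}$ and $\mathbf{p}$ are unit vectors:

$$
\dot{\boldsymbol{\varepsilon}} = k\left(\boldsymbol{\varepsilon}\times\mathbf{l}\right)\times\mathbf{p} = k\left[\left(\boldsymbol{\varepsilon}^\mathrm{T}\mathbf{p}\right)\mathbf{l} - \left(\mathbf{l}^\mathrm{T}\mathbf{p}\right)\boldsymbol{\varepsilon}\right] = -k\left(\mathbf{l}^\mathrm{T}\mathbf{p}\right)\boldsymbol{\varepsilon},
$$

since $\boldsymbol{\varepsilon} = \mathbf{l\times p}$ is orthogonal to $\mathbf{p}$. The decay rate is therefore scaled by $\mathbf{l}^\mathrm{T}\mathbf{p} = \cos\theta$, so convergence holds as long as the angle between the two vectors stays below $90^\circ$.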

andrearosasco commented 1 year ago

Alright, I've implemented the first solution and it looks like it's working. I'll commit and add more details after lunch. @Woolfrey @xEnVrE

Woolfrey commented 1 year ago

You might want to scale the task vector $\boldsymbol{\omega}$ if it is too big. Once you set a new point for the robot to look at, the velocity will jump from zero to some new value instantaneously. This will cause the head to move rapidly which could cause problems on the physical robot. A good method might be to look at the linear velocity $$\mathbf{v} = \boldsymbol{\omega}\times\mathbf{l}.$$ You can scale $\mathbf{v}$ so that the camera will move in a straight line towards the target point $\mathbf{p}$ without being too fast.
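A rough sketch of that scaling, assuming a maximum allowed speed `vMax` (illustrative names, not the project's actual code):

```cpp
#include <Eigen/Dense>

// Clamp the gaze task so the line-of-sight tip never moves faster than vMax.
// omega : angular velocity command from the gaze task
// l     : current line of sight (z axis of the camera)
Eigen::Vector3d clampGazeTask(const Eigen::Vector3d& omega,
                              const Eigen::Vector3d& l,
                              double vMax)
{
    const Eigen::Vector3d v = omega.cross(l);             // linear velocity of the line-of-sight tip
    const double speed = v.norm();
    const double scale = (speed > vMax) ? vMax / speed : 1.0;
    return scale * omega;                                  // scaled task vector
}
```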

andrearosasco commented 1 year ago

Setting point 10 cm right of the camera

https://user-images.githubusercontent.com/47559809/229800518-4212c8e7-1d84-4a0f-b889-d788440fd49a.mp4

Going back to 0 https://user-images.githubusercontent.com/47559809/229800931-daecf916-8fd3-4c27-97bc-0c5550df2328.mp4

Setting point 20 cm right of the camera

https://user-images.githubusercontent.com/47559809/229801330-6fbbc686-4c6f-4d5f-811b-be15f4c5f50d.mp4

And back to 0 again

https://user-images.githubusercontent.com/47559809/229801840-d2573baa-50e6-412d-8ea4-5ae39b591bf3.mp4

As you can see, when a point is set the head tilts correctly from the starting position. For some reason (maybe related to the problem @Woolfrey pointed out), when we ask the robot to go back by giving it a new point, something weird happens.

Not sure why the videos are not displayed inside the issue

Woolfrey commented 1 year ago

You should put spaces between the urls so that the videos load in the comment...

Woolfrey commented 1 year ago

Two thoughts here:

  1. The relationship $\delta\mathbf{x} = \mathbf{J(q)}\cdot\delta\mathbf{q}$ is only valid for infinitesimally small values.
  2. The proper relationship for our problem is to define an instantaneous velocity $\boldsymbol{\omega} = \mathbf{J}_{\omega}(\mathbf{q})\mathbf{\dot{q}}$. Then this needs to be converted to a joint position to send to the motors: $\mathbf{q}_{i+1} = \mathbf{q}_i + \Delta t\cdot\mathbf{\dot{q}}_i$.

If you assume point 1 and $\delta\mathbf{x}$ is very large, then this linear relationship is no longer valid. This would explain the behaviour in the second video, but not the first...
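Point 2 in code form, as a sketch (`dt` is the control period, `dq` the velocity solution at the current step):

```cpp
#include <Eigen/Dense>

// Integrate the joint velocity to obtain the next position reference for the motors.
Eigen::VectorXd integrateJointVelocity(const Eigen::VectorXd& q,
                                       const Eigen::VectorXd& dq,
                                       double dt)
{
    return q + dt * dq;   // q_{i+1} = q_i + dt * dq_i
}
```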

andrearosasco commented 1 year ago

Made some videos: up-down

https://user-images.githubusercontent.com/47559809/230102906-c0331c1c-0ff8-4ad8-ad12-16ec35c1eb1b.mp4

left-right

https://user-images.githubusercontent.com/47559809/230103362-a4d9269d-c6b6-41a3-b045-b429f888ce72.mp4

circle

https://user-images.githubusercontent.com/47559809/230103461-f7352bb3-57c2-4c02-9085-a069fa21e653.mp4

Right now I'm not penalizing the roll in any way and, as you can see in the last video, it ends up in that very unnatural configuration. I'll see whether I can fix that by playing with W and the redundant task, but yesterday we didn't manage.

andrearosasco commented 1 year ago

I increased the speed of oscillation and the gain to track it more tightly.

https://user-images.githubusercontent.com/47559809/230114654-ff6c403e-1236-459a-b26c-f86b3c3ed79d.mp4

I've tried setting the W diagonal value corresponding to the neck roll to 1000.0, keeping the others at 0.1 and the redundant task at 0.0 for all joints, but the results didn't change.

I've also tried setting the redundant task for the neck to redundantTask(0) = (0.0 - this->q(0));, keeping the others at 0 and all the diagonal values of W at 0.1, and got similar results.

The joint limits are not reached for any joint except for the neck roll, which goes to its upper limit and stays there.


The target is a point 20 cm in front of the robot rotating in a circle of radius 10cm

Woolfrey commented 1 year ago

Well, it's tracking the target at least?

Woolfrey commented 1 year ago

I'll take a look at the code tomorrow and see if I can figure out what's happening.

xEnVrE commented 1 year ago

Well, it's tracking the target at least?

It seems so! Also, we should clarify that the "desired" signal in the plot is a bit weird because its evaluation also uses feedback from the robot, as the origin of the RealSense frame from the forward kinematics is required to evaluate the desired line of sight.

andrearosasco commented 1 year ago

As suggested by @xEnVrE, I have tried to manually set the neck roll joint limit to constrain it more, but it loses its ability to track the point

https://user-images.githubusercontent.com/47559809/230316224-1c1ed971-19a5-4ebf-bf88-961913dfc03a.mp4

The target is a point 20 cm in front of the robot rotating in a circle of radius 10cm

xEnVrE commented 1 year ago

I have tried to manually set the neck roll joint limit to constrain it more but it loses its ability to track the point

Which limits did you set now?

andrearosasco commented 1 year ago

In the video, it was completely constrained. I've also tried with 0.4 (I'm referencing the values observed in the yarpmotorinterface) and the tracking was slightly better. But every time, the neck roll goes to its upper limit.

Woolfrey commented 1 year ago

The problem might be more fundamental. We should check the kinematics and the setup of the QP problem.

andrearosasco commented 1 year ago

I have the debugger set up so I can execute the code line by line and check the variables. If you have time can we do it together @Woolfrey?

Woolfrey commented 1 year ago

I have the debugger set up so I can execute the code line by line and check the variables. If you have time can we do it together @Woolfrey?

Whenever you're free. Just let me know. A quick test would be to solve without using the QP solver. Something like:

dq = J.transpose()*(J*J.transpose()).partialPivLu().solve(dx);

Although be warned this will not account for joint limits.
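Spelled out as a self-contained snippet (placeholder Jacobian and task vector, just to exercise the solve; not the project's code):

```cpp
#include <Eigen/Dense>
#include <iostream>

int main()
{
    // Placeholder 3 x 4 angular Jacobian and task vector.
    Eigen::MatrixXd J = Eigen::MatrixXd::Random(3, 4);
    Eigen::Vector3d dx(0.0, 0.1, 0.0);

    // Minimum-norm (right pseudo-inverse) solution; joint limits are not considered.
    Eigen::VectorXd dq = J.transpose() * (J * J.transpose()).partialPivLu().solve(dx);

    std::cout << "dq = " << dq.transpose() << std::endl;
    return 0;
}
```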

Woolfrey commented 1 year ago

Results of today's test on the ergoCub:

https://user-images.githubusercontent.com/62581255/230449725-f81c2faa-6e2f-40d5-939e-96ba4a1450db.mp4


So far we have tried:

The drift still happens, so we can conclude that the problem is not with the simulation, hardware, or QP solver. The problem is likely coming from the task definition:

\boldsymbol{\omega} = k\cdot\boldsymbol{\varepsilon} =  k\cdot\left(\mathbf{l\times p}\right)

This error definition has a null space as the head can rotate around the vector $\hat{\boldsymbol{\varepsilon}}$ freely without increasing or decreasing the error. The tilt of the head corresponds directly to the positive direction of rotation using the right hand rule.

Woolfrey commented 1 year ago

A suggestion for altering the task:

Define the axis of rotation to be:

\mathbf{\hat{a}} = \frac{\mathbf{l\times p}}{\|\mathbf{l\times p}\|}

and the angle as:

\theta = \arccos\left(\frac{\mathbf{l}^\mathrm{T}\mathbf{p}}{\|\mathbf{l}\|\cdot\|\mathbf{p}\|}\right)

Then:

\boldsymbol{\omega} = k\cdot\sin\left(\theta/2\right)\mathbf{\hat{a}}

And see what that does.
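As a sketch, the suggested task could be computed like this (illustrative; it assumes $\mathbf{l}$ and $\mathbf{p}$ are not parallel, otherwise the axis is undefined):

```cpp
#include <Eigen/Dense>
#include <algorithm>
#include <cmath>

// Axis-angle version of the gaze task: omega = k * sin(theta / 2) * a_hat.
Eigen::Vector3d axisAngleGazeTask(const Eigen::Vector3d& l,
                                  const Eigen::Vector3d& p,
                                  double k)
{
    const Eigen::Vector3d axis = l.cross(p).normalized();                // a_hat
    const double cosTheta = l.dot(p) / (l.norm() * p.norm());
    const double theta = std::acos(std::clamp(cosTheta, -1.0, 1.0));     // angle between l and p
    return k * std::sin(theta / 2.0) * axis;                             // omega
}
```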

xEnVrE commented 1 year ago

For the time being, although a very crude solution, we could also decide to force dq(0) = 0.0, i.e. $\dot{q}_{roll} = 0.0$, independently from the solution given by the QP.

This way we get this:

https://user-images.githubusercontent.com/6014499/230477695-7c70433e-bebf-4cfc-bd97-2f64e3fefe70.mp4

Of course, we should solve the problem conceptually as well :D

xEnVrE commented 1 year ago

Another solution that I tested is adding a secondary term for the desired angular velocity.

Let $k_{neck}$ be the vertical axis of the neck:

[image]

Then, we define its projection on the ergoCub $y$-$z$ plane as $k_{yz, neck}$. It can be simply obtained by setting the first component of $k_{neck}$ to zero.

Then, we define a desired value for such direction as $k^{d}_{neck} = (0, 0, 1)$ - as we want the neck to be upright as much as possible.

The additional angular velocity command is then $\omega_{neck} = k_{yz, neck} \times k^{d}_{neck}$.

The final desired angular velocity becomes:

$$ \omega = k_1 \cdot\left(\mathbf{l\times p}\right) + k_2 \, \omega_{neck} $$
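A minimal sketch of the combined command (illustrative names; `kNeck` is assumed to be expressed in the robot frame):

```cpp
#include <Eigen/Dense>

// l     : line of sight, p : desired gaze direction term (as in l x p above)
// kNeck : vertical axis of the neck, expressed in the robot frame
Eigen::Vector3d combinedGazeTask(const Eigen::Vector3d& l, const Eigen::Vector3d& p,
                                 const Eigen::Vector3d& kNeck,
                                 double k1, double k2)
{
    Eigen::Vector3d kNeckYZ = kNeck;
    kNeckYZ(0) = 0.0;                                   // projection on the y-z plane
    const Eigen::Vector3d kNeckDesired(0.0, 0.0, 1.0);  // desired upright neck direction
    const Eigen::Vector3d omegaNeck = kNeckYZ.cross(kNeckDesired);
    return k1 * l.cross(p) + k2 * omegaNeck;            // pointing term + upright term
}
```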

Although I think it would be better to solve the QP with the term $\omega_{neck}$ in the cost function, while now it is part of the constraint, which should be devoted to the pointing task solely.

This is the result that I got: the roll does not reach the limits, although it is still moving. $k_2$ might help decide how far it goes.

https://user-images.githubusercontent.com/6014499/230490776-82741367-7079-47dd-b6da-1fc59c17e391.mp4

xEnVrE commented 1 year ago

I tried to increase $k_2$ and I got:

https://user-images.githubusercontent.com/6014499/230492497-bd37234c-ccac-4e20-8a5e-92729102c830.mp4

Woolfrey commented 1 year ago

But is it tracking the target properly?

I believe the problem comes from using the cross product to define the error. I would try the solution I suggested before:

\boldsymbol{\omega} = k\cdot\sin\left(\theta/2\right)\mathbf{\hat{a}}

(although I suspect it will still have the same problem since it's derived from the cross product...).

If that fails, I think I have another solution in mind that requires only 2DoF for the task definition. The line of sight is a 3D vector: $\mathbf{l}\in\mathbb{R}^3$, but any rotation around this vector is obsolete / redundant, so we've over-constrained the task by 1DoF. I will solve the math later today and post the solution here.

andrearosasco commented 1 year ago

I've tested @xEnVrE's workaround to see whether it impacted the tracking, but it looks like it is working fine. I guess that, for now, I could stick with the workaround.

Video starts at 10s:

https://user-images.githubusercontent.com/47559809/230574706-d5eb8d4d-7995-47b4-831a-ca7c0a301f02.mp4

xEnVrE commented 1 year ago

I've tested @xEnVrE's workaround

which one? enforcing dq(0) = 0?

andrearosasco commented 1 year ago

enforcing dq(0) = 0?

Yes, exactly

Woolfrey commented 1 year ago

OK, I've gone over the theory for visual servoing. Here's a basic summary (I'm typing up some more thorough notes). We need to define a point in the image plane:

\mathbf{p}_i = 
\begin{bmatrix}
x_i \\ y_i
\end{bmatrix} \in\mathbb{R}^2

As usual, we define the error as:

\mathbf{e}_i = \mathbf{p}_d - \mathbf{p}_i

such that the time derivative is:

\mathbf{\dot{e}}_i = -\mathbf{\dot{p}}_i

And we simply define the time derivative to force convergence:

\mathbf{\dot{p}}_i \triangleq k\cdot\mathbf{e}_i \Longrightarrow \mathbf{\dot{e}}_i = -k\cdot\mathbf{e}_i

The time derivative for the image plane point is related to the camera velocity via the image Jacobian $\mathbf{J}_I$:

\mathbf{\dot{p}}_i =
\underbrace{
\begin{bmatrix}
        -\mathrm{\frac{f}{z}} & 0 & \mathrm{\frac{x_i}{z}} & \mathrm{\frac{x_i y_i}{f}} & -\mathrm{\frac{1+x_i^2}{f}} &\mathrm{y_i} \\
        0 & -\mathrm{\frac{f}{z}} & \mathrm{\frac{y_i}{z}} & \mathrm{\frac{1+y_i^2}{f}} & -\mathrm{\frac{x_i y_i}{f}} & -\mathrm{x_i}
    \end{bmatrix}
}_{\mathbf{J}_I}
\begin{bmatrix} \mathbf{v} \\ \boldsymbol{\omega} \end{bmatrix}

where $f$ is the focal length(?). But we know the camera velocity is a function of the robot Jacobian, so we can write:

\mathbf{\dot{p}}_i = \mathbf{J}_I\mathbf{J}_R\mathbf{\dot{q}}

The combined matrix $\mathbf{J}_I\mathbf{J}_R\in\mathbb{R}^{2\times 4}$ has 2DoF of redundancy, so we can solve this as usual with least squares. All we need to do is get the image point $\mathbf{p}_i\in\mathbb{R}^2$ so we can formulate the image Jacobian $\mathbf{J}_I$. Setting $\mathbf{p}_d = \mathbf{0}$ should drive the point to the center of the image frame.
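As a sketch, the image Jacobian above could be assembled like this (illustrative; it mirrors the expression above, with $f$ the focal length and $z$ the depth of the point):

```cpp
#include <Eigen/Dense>

// 2 x 6 image (interaction) Jacobian for an image point (x, y) at depth z.
Eigen::Matrix<double, 2, 6> imageJacobian(double x, double y, double z, double f)
{
    Eigen::Matrix<double, 2, 6> Ji;
    Ji << -f / z,  0.0,    x / z,  x * y / f,          -(1.0 + x * x) / f,  y,
           0.0,   -f / z,  y / z, (1.0 + y * y) / f,   -x * y / f,         -x;
    return Ji;
}
```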

xEnVrE commented 1 year ago

I tested the approach suggested by @Woolfrey, although with a slight difference, i.e. writing the Jacobian for the image point at the center of the image plane, say $(w/2, h/2)$, where $w$ is the image width and $h$ is the image height.

In this case the Jacobian simplifies to

$$ J = J_{I} M J_{R} \in \mathbb{R}^{2 \times 4}, $$

where

J_{I} =
\begin{bmatrix}
        \frac{f_{x}}{d} & 0 & 0 & 0 & f_{x} & 0\\
        0 & \frac{f_{y}}{d} & 0 & -f_{y} & 0 & 0
    \end{bmatrix},
M = 
\begin{bmatrix}
R_{camera}^{T}(q) & 0\\
0 & R_{camera}^{T}(q)
\end{bmatrix},

$R(q)$ is the current orientation of the camera frame and $J_{R}$ is the usual robot Jacobian. Moreover, $f_{x}$ and $f_{y}$ are the focal parameters of the camera and $d$ is the corresponding depth of the pixel $(w/2, h/2)$.

Given that we are going to scale things anyways using a gain, I simplified the Jacobian $J_{I}$ to be:

J_{I} =
\begin{bmatrix}
        1 & 0 & 0 & 0 & 1 & 0\\
        0 & 1 & 0 & -1 & 0 & 0
    \end{bmatrix},

as what really matters are the directions in the twist space the angular velocities get mapped to.

The matrix $M$ is required because the visual servoing task assumes that the camera velocities are expressed in the camera frame, while the Jacobian $J_{R}$, at least the one from iDynTree given the current configuration of @andrearosasco's software, provides the linear and angular velocity in the robot frame.

As regards the error driving term I am using

$$ e = k_p (p_{d} - (w/2, h/2)), $$

where the $u$ and $v$ coordinates of $p_{d}$ are obtained from the 3D desired gazing point $(x_d, y_d, z_d)$ as

$$ p_{d, u} = c_{x} + \frac{x_d}{z_d} f_{x}, $$

$$ p_{d, v} = c_{y} + \frac{y_d}{z_d} f_{y}. $$

Here, $c_x$, $c_y$, $f_x$ and $f_y$ are the camera principal point and focal lengths in pixels.

Moreover, it should be noted that the desired gazing point $(x_d, y_d, z_d)$ is assumed to be expressed in the camera frame and with respect to the camera origin. Instead, so far we agreed with @andrearosasco that such a desired point was expressed in the robot frame and with respect to the robot origin, call it $(x_{d, robot}, y_{d, robot}, z_{d, robot})$, as usual in iCub 2.x.

For that, given the current camera origin $c(q)$, the desired gazing point in camera frame can be obtained as

\begin{bmatrix}
x_{d}\\
y_{d}\\
z_{d}
\end{bmatrix}
= R(q)^{T} \left(
\begin{bmatrix}
x_{d, robot}\\
y_{d, robot}\\
z_{d, robot}
\end{bmatrix}
-c(q)\right).
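As a sketch, the change of frame and the projection could be combined as follows (illustrative names; `fx`, `fy`, `cx`, `cy` are the intrinsics mentioned above):

```cpp
#include <Eigen/Dense>

// Express the desired gaze point in the camera frame and project it onto the image plane.
// pRobot : desired point in the robot frame, R : camera orientation R(q), c : camera origin c(q)
Eigen::Vector2d projectDesiredPoint(const Eigen::Vector3d& pRobot,
                                    const Eigen::Matrix3d& R,
                                    const Eigen::Vector3d& c,
                                    double fx, double fy, double cx, double cy)
{
    const Eigen::Vector3d pCam = R.transpose() * (pRobot - c);   // point in camera frame
    return Eigen::Vector2d(cx + fx * pCam(0) / pCam(2),          // p_{d,u}
                           cy + fy * pCam(1) / pCam(2));         // p_{d,v}
}
```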

Using the visual servoing task, it seems that now the additional task for the neck roll

$$ \dot{q}^{0}_{roll} = k_{roll} (0 - q_{roll}) $$

is working as expected :tada:

https://user-images.githubusercontent.com/6014499/230705622-72bf0bfb-2cb6-4963-b1c0-74cc9836e32c.mp4

xEnVrE commented 1 year ago

During #69, @Woolfrey made us notice that the matrix

J_{I} =
\begin{bmatrix}
        \frac{f_{x}}{d} & 0 & 0 & 0 & f_{x} & 0\\
        0 & \frac{f_{y}}{d} & 0 & -f_{y} & 0 & 0
    \end{bmatrix},

whose rank cannot be more than 2 given its shape, has two columns that are zero vectors.

Actually, I have made some simplifications, i.e. that $c_x \approx \frac{w}{2}$ and $c_y \approx \frac{h}{2}$, and this is causing the first zero column.

The complete matrix without the approximation is instead:

J_{I} =
\begin{bmatrix}
        \frac{f_{x}}{d} & 0 & \frac{-(w/2 - c_{x})}{d} & 0 & f_{x} & 0\\
        0 & \frac{f_{y}}{d} & \frac{-(h/2 - c_{y})}{d} & -f_{y} & 0 & 0
    \end{bmatrix},

which tells us that the role of the z velocity of the camera origin, in camera frame, becomes less and less important as $c_x \to \frac{w}{2}$ and $c_y \to \frac{h}{2}$.

On the other hand, the last column cannot be different from zero in any case. Indeed, any angular velocity along the z axis of the camera would not change the instantaneous velocity of the projection, on the image plane, of a ray coming out of the camera at distance $d$.

I hope this better explains the reasons for the two zero columns.

Woolfrey commented 1 year ago

whose rank cannot be more than 2 given its shape, has two columns that are zero vectors.

You are correct. $\mathrm{rank}(\mathbf{J}_I) = 2$, so it only needs a minimum of two independent column vectors.

My instinct was that there was an issue when seeing the zero columns, but logic wins out.

andrearosasco commented 1 year ago

I've implemented visual servoing and fixed the issue with the relative motion of the torso. When the torso moves too much on one side it becomes hard for the head to properly track the point, but apart from that it looks like it's working

https://user-images.githubusercontent.com/47559809/231514296-881ab120-0b73-4cab-85cd-e9c4df52068a.mp4

xEnVrE commented 1 year ago

Great! Given that the torso joint velocities might be available on the robot, we could also add a feed-forward term to the 2D driving term, taking into account the apparent motion that the torso motion creates on the image plane.

andrearosasco commented 1 year ago

As you've seen, I have written a 3D visualizer displaying the desired and actual line-of-sight vectors as 3D points.

https://user-images.githubusercontent.com/47559809/232521763-9123010d-f7bf-4fc5-97ce-da7cffc91c3b.mp4

xEnVrE commented 1 year ago

Today @andrearosasco @steb6 @SimoneMic and @xEnVrE tested the circular path tracking on the real ergoCub:

https://user-images.githubusercontent.com/6014499/234950498-16c26b6f-9ab6-4b88-ac18-a1d7dd3a3d88.mp4

https://user-images.githubusercontent.com/6014499/234950521-5f881cce-1962-481c-b9a8-830438295694.mp4

As can be seen, the yaw seems fine while the pitch and roll are not.

[image]

We opened an issue for that: https://github.com/robotology/icub-tech-support/issues/1546

xEnVrE commented 1 year ago

New test made by @andrearosasco today:

https://user-images.githubusercontent.com/6014499/235186619-e23cd8c8-cec3-40e0-8a9a-ace98db2945e.mp4

Not that bad after all, just a bit wobbly due to imperfect tracking on the joints side:

[image]

Notes:

xEnVrE commented 1 year ago

This can be closed as the gaze controller performance is sufficient for the demo.