rotations.axis_angle_from_matrix() inaccuracy at angles close to zero or pi

mojodo commented 4 years ago

The y=arccos(x) function results in numerical imprecision when y should be close to 0 or pi, i.e. when x is close to 1 or x is close to -1.

To circumvent this you could for example write the following:

# determine angle
angle = np.arccos((np.trace(R) - 1.0) / 2.0)

if angle < 0.001: # get numerical precision at low angular values
   angle = np.arcsin(np.linalg.norm(R-np.eye(3))/np.sqrt(2))
   if angle == 0.0:
        return np.array([1.0, 0.0, 0.0, 0.0])

a = np.empty(4)
D = R-R.transpose()

if angle > np.pi-0.001:# get numerical precision at high angular values
    angle = np.pi - np.arcsin(np.linalg.norm(D)/np.sqrt(8))

# determin rotation axis
r = np.array([D[2, 1], D[0, 2], D[1, 0]])
# The norm of r is 2.0 * np.sin(angle)
a[:3] = r / (2.0 * np.sin(angle))
a[3] = angle
return a

AlexanderFabisch commented 4 years ago

Hi @mojodo ,

thanks for your feedback.

The arccos(x) function results in numerical imprecision when x is close to 0 or pi.

I am confused. arccos is only defined for -1 <= x <= 1.

Do you have an example of which operation fails and how?

mojodo commented 4 years ago

Thanks for asking back. You can try out with any rotation Matrix whoes rotation angle is close to 0 or close to pi. Another issue in your code is the special case that you consider when angle is close to pi. In this case you compute the components of the rotation axis by some positive square roots. By this you implicitly assume that the signs of the three components are all either positive or negative. However in general their signs may be different. That is why I suggested the method above.

To ensure that a function works properly in all situations it is good advice, to test them with the most critical cases. for this function this would be, rotation by angle=0 and 0+ epsilon, rotation by angle pi and pi-epsilon, rotation arround axis with differently signed components. This can be done by a set of say 1000 realizations of random rotation matrices spanning the entire space of possible rotations.

To be true, I did not make a rigorous test for my proposed function above.

AlexanderFabisch commented 4 years ago

Thanks for asking back. You can try out with any rotation Matrix whoes rotation angle is close to 0 or close to pi.

Please help me here and give me a concrete code snippet that fails (minimal, reproducible example). No matter how I set the rotation angles, I cannot create any problem.

import numpy as np
import pytransform3d.rotations as pr
R = pr.matrix_from_euler_xyz([0.000001232, np.pi - np.finfo("float").eps, 0.0])
pr.axis_angle_from_matrix(R)

Another issue in your code is the special case that you consider when angle is close to pi. In this case you compute the components of the rotation axis by some positive square roots. By this you implicitly assume that the signs of the three components are all either positive or negative. However in general their signs may be different. That is why I suggested the method above.

You are probably referring to this:

https://github.com/rock-learning/pytransform3d/blob/ba3b7c39cc2c3232dca1d72738a96ff81ee45e83/pytransform3d/rotations.py#L594

The functions are applied to all diagonal elements of R individually. The diagonal elements of R are certainly within [-1, 1]. If I add 1 and multiply by 0.5, then the result is a range of [0, 1] and I can compute the square root.

To ensure that a function works properly in all situations it is good advice, to test them with the most critical cases. for this function this would be, rotation by angle=0 and 0+ epsilon, rotation by angle pi and pi-epsilon, rotation arround axis with differently signed components. This can be done by a set of say 1000 realizations of random rotation matrices spanning the entire space of possible rotations.

There is test code here:

https://github.com/rock-learning/pytransform3d/blob/ba3b7c39cc2c3232dca1d72738a96ff81ee45e83/pytransform3d/test/test_rotations.py#L268-L302

Do you have any concrete suggestions? The problem is that if I cannot reproduce the problem, then I cannot make sure that the code that I add makes sense and I cannot automatically test it during continuous integration.

mojodo commented 4 years ago

Here is a rotation matrix specified via axis-angle that results in a failure:

a=np.ones((4,))
a[0] = -1
a[:3]= a[:3]/np.linalg.norm(a[:3])
a[3] = np.pi-(5e-8)
R=pr.matrix_from_axis_angle(a)
a_hat = pr.axis_angle_from_matrix(R)

a_hat yields array([0.57735027, 0.57735027, 0.57735027, 3.1415926 ]) while 'a' is [ -0.57735027, 0.57735027, 0.57735027, 3.1415926 ]

I have overseen some errors in my suggested code above as well. For the moment, determining the signs of the axis signs by the skew symmetric matrix of R seems to solve the issue:

  def axis_angle_from_matrix(R):
    angle = np.arccos((np.trace(R) - 1.0) / 2.0)

    if angle == 0.0:
        return np.array([1.0, 0.0, 0.0, 0.0])

    a = np.empty(4)
    r = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    if abs(angle - np.pi) < 1e-4:
        a[:3] = np.sqrt(0.5 * (np.diag(R) + 1.0))
        # The numerical values of a[:3] are quite accurate, however the
        # signs of the a[:3] components are always positive and thus in general wrong
        # The signs can be read out from the skew symmetric part of R, even tough the
        # numerical accuracy of the skew symmetric part of R is poor:
        a[:3]=a[:3]*np.sign(r)
    else:
        # The norm of r is 2.0 * np.sin(angle)
        a[:3] = r / (2.0 * np.sin(angle))
    a[3] = angle
    return a

By playing with the critical closeness of angle to pi, I found that 1e-4 gives two orders of magnitude better accuracy than 1e-7.

AlexanderFabisch commented 4 years ago

Here is a rotation matrix specified via axis-angle that results in a failure:
a=np.ones((4,))
a[0] = -1
a[:3]= a[:3]/np.linalg.norm(a[:3])
a[3] = np.pi-(5e-8)
R=pr.matrix_from_axis_angle(a)
a_hat = pr.axis_angle_from_matrix(R)
a_hat yields array([0.57735027, 0.57735027, 0.57735027, 3.1415926 ]) while 'a' is [ -0.57735027, 0.57735027, 0.57735027, 3.1415926 ]

Perfect.

I implemented your suggested fix and made pull request #44 . Would you like to review it?

By playing with the critical closeness of angle to pi, I found that 1e-4 gives two orders of magnitude better accuracy than 1e-7.

Can you explain that more detailed? How did you test this?

mojodo commented 4 years ago

Current version in your pull request cares for the most pressing problem. Still I believe, the accuracy can be further tuned.

I tested with rotations measured by a motion capture system. Inaccuracies only arise when when angle is close to pi. I minimised those by variation of your constant eps to 1e-4.

A suggestion for your test is to add an additional random Test with angles close and very close to pi.

AlexanderFabisch commented 4 years ago

OK, I needed an a bit more rigorous and reproducible approach to determine this threshold.

What I came up with is this (in a jupyter notebook):

%pylab inline
from pytransform3d.rotations import *

def axis_angle_from_matrix(R, eps=1e-7):
    """Compute axis-angle from rotation matrix.

    This operation is called logarithmic map.

    Parameters
    ----------
    R : array-like, shape (3, 3)
        Rotation matrix

    Returns
    -------
    a : array-like, shape (4,)
        Axis of rotation and rotation angle: (x, y, z, angle). The angle is
        constrained to [0, pi].
    """
    R = check_matrix(R)
    angle = np.arccos((np.trace(R) - 1.0) / 2.0)

    if angle == 0.0:
        return np.array([1.0, 0.0, 0.0, 0.0])

    a = np.empty(4)

    axis_unnormalized = np.array(
        [R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])

    if abs(angle - np.pi) < eps:
        # standard formula becomes numerically unstable as derivative of
        # arccos approaches infinity
        a[:3] = np.sqrt(0.5 * (np.diag(R) + 1.0)) * np.sign(axis_unnormalized)
    else:
        # The norm of axis_unnormalized is 2.0 * np.sin(angle)
        a[:3] = axis_unnormalized / (2.0 * np.sin(angle))

    a[3] = angle
    return a

plt.figure(figsize=(20, 10))
diffs_to_pi = np.logspace(-7, -1, 101)
eps_candidates = np.logspace(-7, -3, 9)
for plot_idx, eps in enumerate(eps_candidates):
    ax = subplot(3, 3, plot_idx + 1)
    axis_dists = []
    for diff_to_pi in diffs_to_pi:
        a = np.array([1., 1., 1., np.pi - diff_to_pi])
        a[:3] = a[:3] / np.linalg.norm(a[:3])

        R = matrix_from_axis_angle(a)
        a2 = axis_angle_from_matrix(R, eps)

        axis_dist = np.linalg.norm(a[:3] - a2[:3])
        axis_dists.append(axis_dist)
    plot(diffs_to_pi, axis_dists, label="eps = %g" % eps)

    if plot_idx > 5:
        ax.set_xlabel("$x = \\pi - $ angle")
    if plot_idx % 3 == 0:
        ax.set_ylabel("error = $||a[:3]$ - $\\ln(\\exp$ (a))[:3]$||_2$")
    ax.set_xscale("log")
    ax.set_yscale("log")
    ax.set_ylim((10e-1, 10e-16))
    ax.legend()
plt.tight_layout()
plt.savefig("eps.png")

The result:

eps

So, indeed, eps=1e-4 seems to be just about perfect. I will include it and I will also see what happens with values close to 0.

AlexanderFabisch commented 4 years ago

For angles close to 0 I had to use another trick. I computed the exact norm of the unnormalized rotation axis to normalize it. Seems to work. Do you have a better suggestion?

mojodo commented 4 years ago

Nice Plots and thank you for confirming the choice of eps. Your normalization step to improve accuracy for angles close to pi appears to me a very good Idea. Thank you for sharing it.

By the way, the same could be helpful when the angle is very close to pi. As your np.sqrt(0.5 * (np.diag(R) + 1.0)) trick, which I find very elegant, may result in a non unit length vector due to numerical inaccuracies

AlexanderFabisch commented 4 years ago

By the way, the same could be helpful when the angle is very close to pi. As your np.sqrt(0.5 * (np.diag(R) + 1.0)) trick, which I find very elegant, may result in a non unit length vector due to numerical inaccuracies

Yes, it is used for normalization in both cases. Now I'm only a bit concerned about the accuracy of the angle which is always computed with arccos. Is there a better way?

I also cross-checked with other sources: [1], [2], [3]. They seem to confirm that our method is indeed correct, although [1] does not consider the sign ambiguity that you discovered (which is confirmed by [2]) and [3] has a very complicated way of handling singularities with many special cases.

mojodo commented 4 years ago

To adress the problem with the arccos at small angles I have one non tested Idea: The way one extracts the angle from the matrix could perhaps be improved for angles very close to zero or very close to pi. In these cases, the off diagonal elements are almost completely determined by the skew symmetric components of R which are proportional to sin(angle). Hence I suspect that for very small angles -- what very small is, needs to be identified in a similar test like you did in jupyter notebook -- the following should perform excellently:

r = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
angle = np.linalg.norm(r)/2

For angles close to pi, my suspection is, that the rotation matrix itself yields a bad representation of the rotation. This means that accurate information aboutbthe rotation angle gets already lost when an axis angle representation is converted into a matrix. When you think of the following representation of the rotation matrix R(u,phi)= Icos(phi) + uu^T (1-cos(phi) + u_xsin(phi), where I ist the unity matrix and u_x is the skew symmetric matrix, and you substitute phi = pi-eps cos(phi) = -cos(eps) = 1-eps^2/2 sin(phi) = sin(eps)=eps you get R(u,phi)= -I(1-eps^2/2) + uu^T(2-eps^2/2) + u_xeps From this representation we find out that the information about the angle eps is definitely contained in the skew symmetric part of R. However the skew symmetric part is represented by the off diagonal elements of R and these are dominated by the symmetric part 2uu^T. Hence the feard cancellation has already been taken place, while the matrix R has been composed. The consequence seems to be, that in general, matrices are bad representations for rotations.

mojodo commented 4 years ago

I took the time to write this down in clean form, as it seems to be an important finding that floating point matrices are bad representats of orientation when the rotation angle is in the vicinity of pi. So the numerical needle eye seems discovered and not be due to numerical issues of the arccos function itself. The following expression allows for a good insight into some important numerical properties of the rotation matrix: grafik

mojodo commented 4 years ago

Futhermore:

grafik grafik

AlexanderFabisch commented 4 years ago

Thanks a lot. Looks interesting. I don't have time to take a closer look at the moment though.

mojodo commented 4 years ago

That is all right. Take your time and let me know when you are ready to continue our inspiring exchange.

AlexanderFabisch commented 4 years ago

Let me summarize what I understood:

For small angles we can use angle = np.linalg.norm(r)/2 (<- arcsin is missing here).
We don't have a better way to reconstruct the angle if it is close to pi because the inaccuracy comes from the rotation matrix representation.
For angles close to pi we can directly read the signs from the rotation matrix (because +I won't change off-diagonal components?) without indirection through the skew symmetric matrix. - Question: Is this a better way? Should I change the implementation?

Now I compared the two ways (via arcsin and arccos) of computing the angle - with strange results.

For angles close to pi the arcsin implementation works unreasonably well:

r = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
angle = np.arcsin(np.linalg.norm(r) / 2.0)

As demonstrated by the following plot and mean, median, and max error of 2.536e-18, 0.0, and 2.220e-16 respectively.

pi_angles

For small angles I noticed that the following leads to an error of about pi:

r = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
angle = np.arcsin(np.linalg.norm(r) / 2.0)

So, I changed it to:

r = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
angle = np.pi - np.arcsin(np.linalg.norm(r) / 2.0)

This is the result:

small_angles

With a mean, median, and max error of 1.319e-17, 0.0, and 4.441e-16 for pi - arcsin(...)!

Now, for only for some axes (e.g. [sqrt(1/3), sqrt(1/3), sqrt(1/3)]) in combination with an angle close to zero I also have to apply this "subtract-from-pi" trick...

Let me formally express my genuine lack of understanding: WTF?!

At the moment I think the solution from the pull request looks fine so far and I will merge it. The additional correction of the angle is a bit half-baked. I also wouldn't know how to determine which solution to take since we don't know which one is better because we don't know the real angle. It is a good idea though.

AlexanderFabisch commented 4 years ago

Here is my test code.

mojodo commented 4 years ago

Hallo, verry interesting result. I will take a look at it some days later. I am sorry, that at the moment i am much in involved in projects and cannot furter investate for now.

AlexanderFabisch commented 4 years ago

Yes, no problem.

AlexanderFabisch commented 3 years ago

I found a similar threshold in this paper: https://arxiv.org/pdf/1606.05285.pdf

Different conversions though. I also tested whether there is a numerical problem in pytransform3d with these conversions. It was not the case. I couldn't get a better result with their small angle approximations.

I would close this issue. If there is something else to discuss, we can move this issue to a discussion.

dfki-ric / pytransform3d

rotations.axis_angle_from_matrix() inaccuracy at angles close to zero or pi #43