Closed RussTedrake closed 8 years ago
Thanks for the advice! I will keep it in mind. In this case the test program intentionally injects random inputs into the system one thousand times in the hopes of catching differences between the two models.
I've narrowed down the problem to two areas:
body
link was not attached to the chassis_floor
link the same way in both models. The result was different inertia matrices between the two models (I should have seen that in the visualizations of the RigidBodyTrees above). Fixed in https://github.com/liangfok/drake/commit/5ee2ae5a91b52af8461791403386ade11739d623.loop joints
Random comment unrelated to your troubles, Liang: often the best way to deal with loops in an internal-coordinate multibody system is to split within a link rather than at the joint. This has the substantial advantage that all joints are then treated uniformly and angular coordinates can wrap past 2π without introducing extra state variables or hacks. It requires sawing a link into two half-mass bodies and gluing them back together with a weld (fixed) constraint (6 equations). This is a topic for another conversation but I thought I'd plant a seed now.
@sherm - very nice. That would indeed be cleaner. The current implementation is limited in that we cannot easily add an actuator directly to the “loop joint”, nor friction, etc. And we’re already adding 6 constraints for a revolute joint anyhow (should be 5, but was cleaner to implement as 6). Would you open a new issue on that? It would be a definite improvement!
Loop joint discussion moved to #1867.
I'm still digging into why the addition of loop joints is causing the URDF vs. SDF models of the Prius to be different. Here's what I've found so far:
The RigidBodyTree
visualizations between the two models are extremely close:
SDF:
URDF:
The difference is arising from the call to RigidbodySystem::dynamics(). Within this method, the differences do not arise until the call to prog.solve(). This method is defined in LeastSquares::solve(). Within this method, there is an Aeq
matrix and a beq
vector. The least squares solution is computed by the following code:
// least-squares solution
prog.setDecisionVariableValues(
Aeq.jacobiSvd(Eigen::ComputeThinU | Eigen::ComputeThinV).solve(beq));)
Using print statements, I determined that Aeq
are the same but beq
is different between the two models. The table below contains an example difference between the beq
vector:
Index | URDF | SDF |
---|---|---|
0 | 30.19623666357765401357938 | 30.19628053363987518764588 |
1 | -14.02833576585239505618574 | -14.02839659691058571411304 |
2 | 1.110223024625156540423632e-14 | 8.326672684688674053177238e-15 |
3 | 30.19623666357765401357938 | 30.19628053363987518764588 |
4 | -14.02833576585239505618574 | -14.02839659691058571411304 |
5 | 1.665334536937734810635448e-14 | 1.38777878078144567552954e-14 |
6 | 2.441663517175129527458921 | 2.441432350830609721015207 |
7 | 76.73089804700674676496419 | 76.73260586144398587293836 |
8 | 5.462316429263069039734546 | 5.46275105161032570322277 |
9 | -870.3200313010531772306422 | -870.3200313010531772306422 |
10 | 84.20119268094822473358363 | 84.20119268094822473358363 |
11 | -793.7666223647498782156617 | -793.7666223647498782156617 |
12 | 64216.42927706049522385001 | 64216.4299376551789464429 |
13 | -65183.48699485358520178124 | -0 |
14 | -0.009111866824832448585880229 | -0.009111866824846659440595431 |
15 | -77390.85823135306418407708 | -77390.85845732540474273264 |
16 | 0.5209504568581166950025363 | 0.5209504568581166950025363 |
17 | -2.131628207280300557613373e-14 | -0 |
18 | 1.68753899743023794144392e-14 | -2.664535259100375697016716e-15 |
In the above example, the problem appears to be with index 13 of the URDF's beq
vector. Currently looking into why.
Liang, a thought: this looks like a difference in rank determination by the SVD. The sdf version appears to have decided on a lower rank for what I presume is the constraint compliance matrix (is that right?). From the pictures it looks like the car model contains a planar linkage, which if modeled in 3d should have redundant out-of-plane constraints. I notice that there are some subtle differences in some of the other numbers, for example row 0 differs in the sixth decimal place. To me that suggests possible float vs. double treatment differences. The float version (urdf?) could have errors of 1e-6 or so which could prevent the SVD from dropping rank when it should.
It may help to also numerically check the individual properties (inertia, transforms, etc) to catch differences before getting to solve().
@sherm1, to determine the SVD rank, I can count the number of non-zero singular values right? If so, I believe the ranks of both models are the same. Here are example SVD singular values obtained using Aeq.jacobiSvd(Eigen::ComputeThinU | Eigen::ComputeThinV).singularValues()
from one execution of the rigid body model equivalency test that failed.
Index | URDF | SDF |
---|---|---|
0 | 243.6518650177665392675408 | 243.6530916319920834212098 |
1 | 210.9897112899952844600193 | 210.9906242300023393454467 |
2 | 120.4271656785780066911684 | 120.4271691118260037001164 |
3 | 120.1792493242776487250012 | 120.1792320269771039420448 |
4 | 119.5581933039784559014151 | 119.5581558735103726576199 |
5 | 34.2997489187578494806985 | 34.3000281990250641683815 |
6 | 3.052668011736161535196743 | 3.052718464389475361286941 |
7 | 2.106752996491387364130787 | 2.106747390436160660698306 |
8 | 1.425061883264294726458843 | 1.425096408193961439181408 |
9 | 0.7637255782890202127077828 | 0.7637495038471329156237744 |
10 | 0.4999120000000041863508216 | 0.4999120000000024099939822 |
11 | 0.4999120000000031316389482 | 0.4999119999999998564810255 |
12 | 0.4971264118704067369591826 | 0.4971264230915271031818747 |
13 | 0.4875517614159221024827673 | 0.4875518639757789873812044 |
14 | 0.04451646622718740536894799 | 0.04451400417051754043740175 |
15 | 3.719182395161172259100363e-21 | 2.319157834412946368918493e-21 |
16 | 1.420089482289595853461653e-23 | 1.263129089448864027256328e-24 |
17 | 4.728979668985963391983973e-32 | 6.281537206627940716786068e-31 |
18 | 5.007775934575039637495072e-33 | 1.884705025915192462402635e-32 |
I believe the slight differences between the URDF and SDF models are due to the need to convert from absolute to relative coordinate frames when parsing the SDF. Both the SDF and URDF parsers use the double
data type.
@RussTedrake, I will numerically check for differences probably by modifying RigidBodyTree to override the ==
operator.
Thanks for the advice!
to determine the SVD rank, I can count the number of non-zero singular values right? If so, I believe the ranks of both models are the same.
Yes, those are both convincingly rank 15 matrices!
I would not be surprised if the numerical differences are from the ascii representation of the doubles in the xml files, especially if the sdf was generated from a tool that loaded the urdf into a float/double then wrote the sdf with some potentially different accuracy.
I overloaded operator==
and operator!=
in RigidBodyTree
, RigidBody
, and RigidBodyFrame
and added a numerical equivalency test to compareRigidBodySystems.cpp. Here is the commit: https://github.com/liangfok/drake/commit/ba9027b6c8bfe66667e1055852a2a2481a7a45c3. The numerical equivalency test currently checks everything (inertia, COM, link names, joint names, joint limits, contact points, loop frames, actuators, gravity vector, B matrix, etc.) except for the collision model.
The URDF and SDF models of the Prius pass the current numerical equivalency test. Maybe this will change after comparing the collision models.
Are the dynamics the same if you put the initial conditions with the car up in the air? (e.g. set z=10)? In that case, the collision geometries should not effect the dynamics.
I think I initialized both models 10m up in the air by modifying x[2]
, which is index 2 of the initial state vector (see: https://github.com/liangfok/drake/commit/f867f2d0fa5f2ad0824b39e00628e1608392891c). I decided to use index 2 based on the output of a method I added called RigidBodySystem::getStateVectorSemantics()
, which I hope prints the correct order of elements in the state vector (see: https://github.com/liangfok/drake/commit/caab2ae6fe5cbc5f4afe55202e25a64c5b706bc2). Here is the output for both models:
Index | Joint name | Joint axis-of-motion name |
---|---|---|
0 | base | base_x |
1 | base | base_y |
2 | base | base_z |
3 | base | base_qw |
4 | base | base_qx |
5 | base | base_qy |
6 | base | base_qz |
7 | steering | steering |
8 | left_pin | left_pin |
9 | left_wheel_joint | left_wheel_joint |
10 | axle_tie_rod_arm | axle_tie_rod_arm |
11 | right_wheel_joint | right_wheel_joint |
12 | rear_left_wheel_joint | rear_left_wheel_joint |
13 | rear_right_wheel_joint | rear_right_wheel_joint |
14 | base | base_wx |
15 | base | base_wy |
16 | base | base_wz |
17 | base | base_vx |
18 | base | base_vy |
19 | base | base_vz |
20 | steering | steeringdot |
21 | left_pin | left_pindot |
22 | left_wheel_joint | left_wheel_jointdot |
23 | axle_tie_rod_arm | axle_tie_rod_armdot |
24 | right_wheel_joint | right_wheel_jointdot |
25 | rear_left_wheel_joint | rear_left_wheel_jointdot |
26 | rear_right_wheel_joint | rear_right_wheel_jointdot |
Unfortunately, the dynamics of the vehicles do not match. To see a larger difference, I increased the equivalency threshold of valuecheckMatrix()
from 1e-8 to 1 (see: https://github.com/liangfok/drake/commit/bbc764e839b707fb234b12d3a5f76e34802eff61). Here is an example of a mismatch:
Initial condition (x
):
[-0.11178, -1.03367, 10, 0.196684,-0.371481,0.0390621, -0.85062, -225.333, 166.725, -104.719, 197.192, -212.579, -67.8423, -17.4616, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Input (u
):
[0.994796, 0.832517, -0.767727]
Resulting xdot
vectors and their differences (this is the return value of RigidBodySystem::dynamics()
):
Index | SDF | URDF | Delta |
---|---|---|---|
0 | 0 | 0 | 0 |
1 | 0 | 0 | 0 |
2 | 0 | 0 | 0 |
3 | 0 | 0 | 0 |
4 | 0 | 0 | 0 |
5 | 0 | 0 | 0 |
6 | 0 | 0 | 0 |
7 | 0 | 0 | 0 |
8 | 0 | 0 | 0 |
9 | 0 | 0 | 0 |
10 | 0 | 0 | 0 |
11 | 0 | 0 | 0 |
12 | 0 | 0 | 0 |
13 | 0 | 0 | 0 |
14 | 154.407 | 154.414 | -0.00644637 |
15 | 35.7116 | 35.7128 | -0.00120793 |
16 | 783.281 | 783.295 | -0.0140741 |
17 | -271.505 | -271.498 | -0.00746939 |
18 | 1391.15 | 1391.18 | -0.0295634 |
19 | -14.7617 | -14.7617 | 6.13917e-05 |
20 | 22892.6 | 22894.3 | -1.73259 |
21 | -32930.2 | -32932.5 | 2.3233 |
22 | 95.563 | 95.5671 | -0.00410449 |
23 | -31217.4 | -31218.4 | 1.06835 |
24 | 127.887 | 127.892 | -0.00519207 |
25 | -35.7116 | -35.7128 | 0.00120793 |
26 | -35.7116 | -35.7128 | 0.00120793 |
I'm noticing some very large values in the xdot
vectors especially given the relatively small values in u
. Could we be hitting a singularity somewhere?
I also find strange the fact that all joint positions are zero in xdot
. Is this expected?
Quote from @sherm1:
From the pictures it looks like the car model contains a planar linkage, which if modeled in 3d should have redundant out-of-plane constraints.
I'm not sure if there are any planar linkages. Below is a zoomed-in view of the front steering links in both the URDF and SDF models. Notice the tie rod and axle are above the left and right tie rod arms. Is this where you were thinking there may be some planar linkages with redundant out-of-plane constraints?
Is this where you were thinking there may be some planar linkages with redundant out-of-plane constraints?
Yes -- the shift in z direction would still leave this as a planar mechanism. So if the joints were all pin (revolute) joints there would be redundant constraints. That is, you could remove one of the pins and the mechanism would still move only in the x-y plane, so no additional z constraint was needed. Nothing wrong with that provided that the linear algebra being employed is prepared for it.
@sherm1, thanks for the explanation. Now I understand what is meant by "planar mechanism"!
I modified the RigidBodyTree
numerical equivalency checker to also compare the collision models (see: https://github.com/liangfok/drake/commit/ff3b20cda4bf9515f2baab0222c392e4faca5fb7). It indicated that the two collision models were the same even though the resulting dynamics were different. This is consistent with the earlier test where the models were elevated 10m in the air to prevent collision geometries from impacting vehicle dynamics, and still observing differences in the dynamics. Thus, the mystery of why the URDF and SDF Prius models exhibit different dynamics continues.
To attain a greater level of repeatability during this debugging process, I hard-coded a known failure case (see: https://github.com/liangfok/drake/commit/6223ff833d634b7f920bec999f24257e181ac751) and will use it to further investigate why seemingly identical rigid body tree models result in different dynamics.
I finally got the Prius URDF and SDF Models to match. Ultimately, the solution was to manually adjust the two models so that the error between them was less than std::numeric_limits<double>::epsilon()
. There's an argument that the tolerance value used in compareRigidBodySystems.cpp should be relative rather than absolute (see https://github.com/RobotLocomotion/drake/issues/1887#issuecomment-198805278) -- I will leave that to a future issue. Below are some details of the fix.
operator==
overload that resulted in joints not being comparedstd::numeric_limits<double>::epsilon()
std::numeric_limits<double>::epsilon()
TODO:
Great news!
I don't understand exactly what you mean by adjusting the pose manually? Could you explain at which step/file you mean?
SDF contains <pose>
tags that specify a link's position and orientation in the model frame and a joint's position and orientation in its child link's frame. I tuned them so the transforms in Drake's internal models match. Here is the specific commit: https://github.com/liangfok/drake/commit/c66a1db5b790ca961d7a9f56028ea849d6f494bb
Hand-tuned, or computed more accurately?
I printed the transforms with 25-digit precision and then used a calculator to compute the desired
I originally converted the SDF to URDF using an automated tool (http://wiki.ros.org/pysdf) but still had to make many changes to both the resulting URDF and original SDF to make them perfectly match.
We may want to develop our own URDF/SDF converter to fully automate the process in the future.
Calculator? :)
Great work. Thanks for pushing it through to a satisfying conclusion. Looking forward to the PR.
No problem! One problem with all the manual-tuning-using-a-calculator is the vehicle model is no longer symmetric across the sagittal plane. For example, below is the pose of the front left and right wheels:
Link | x | y | z | r | p | y |
---|---|---|---|---|---|---|
Right wheel | 1.40948 | -0.80198 | 0.3164819999999999855511135 | 1.57079632679 | 0 | 0 |
Left wheel | 1.40948 | 0.801993 | 0.3164819999999999855511135 | 1.57079632679 | 0 | 0 |
Delta | 0 | 1.3e-5 | 0 | 0 | 0 | 0 |
Since the models match and simulation works, I created a new issue tracking the work of modifying the models to be symmetric: https://github.com/RobotLocomotion/drake/issues/1896.
In #1711. i'd like for the dynamics to match exactly, so that we can use this as another unit test. But right now, in addition to not matching the prius.sdf model actually goes unstable when used for simulation. So there is something fundamentally wrong with either the sdf or the parser. This one is urgent. :)