symforce-org / symforce

Fast symbolic computation, code generation, and nonlinear optimization for robotics
https://symforce.org
Apache License 2.0
1.41k stars 145 forks source link

Generated operations are in a different order on different platforms #343

Open gareth-cross opened 1 year ago

gareth-cross commented 1 year ago

Describe the bug Operations are placed in a different order on different platforms. This can result in unexpected numerical issues in some cases.

Testing details:

To Reproduce

  1. Run the following script to generate a test function on both OSX and Ubuntu:
    
    from pathlib import Path

import symforce symforce.set_symbolic_api('symengine') symforce.set_epsilon_to_number(1.0e-16)

from symforce import geo from symforce import symbolic as sm from symforce.codegen import Codegen, CppConfig from symforce import typing as T

def test_function(time: T.Scalar) -> geo.V4: rate = sm.pi / 6 angle = rate * time x_axis = geo.V3([-sm.cos(angle), -sm.sin(angle), 0.0]) z_axis = geo.V3([0, 0, 1]) y_axis = z_axis.cross(x_axis)

world_R_body = geo.Matrix33.column_stack(x_axis, y_axis, z_axis)
world_R_body_quat = geo.Rot3.from_rotation_matrix(R=world_R_body)
quat_elements = geo.V4(world_R_body_quat.to_storage())

world_R_body_subbed = world_R_body.subs(time, 3.01).evalf()
quat_elements_subbed = quat_elements.subs(time, 3.01).evalf()
print(world_R_body_subbed)
print(quat_elements_subbed)
return quat_elements

if name == 'main': cg = Codegen.function(func=test_function, config=CppConfig()) path = Path(file).parent.absolute() cg.generate_function(output_dir=path / 'output')

2. Observe the output of the python script (the values of `world_R_body_subbed` and `quat_elements_subbed`)

[0.00523596383141919, 0.999986292247427, 0.0] [-0.999986292247427, 0.00523596383141919, 0.0] [0.0, 0.0, 1.0]

[0.0] [0.0] [-0.705253158861618] [0.708955557080773]

2. Observe that the order of operations is different on the two platforms. This is the exact diff:
```diff
35,42c35,43
<   const Scalar _tmp4 = -_tmp1;
<   const Scalar _tmp5 = -std::max<Scalar>(1, std::max<Scalar>(_tmp4, _tmp3 + 1));
<   const Scalar _tmp6 = 1 - std::max<Scalar>(0, -(((_tmp4 + _tmp5) > 0) - ((_tmp4 + _tmp5) < 0)));
<   const Scalar _tmp7 = std::min<Scalar>(1, _tmp6);
<   const Scalar _tmp8 = std::min<Scalar>(_tmp6, 1 - std::max<Scalar>(0, _tmp7));
<   const Scalar _tmp9 = _tmp5 + 1;
<   const Scalar _tmp10 = std::min<Scalar>(1 - std::max<Scalar>(0, std::max<Scalar>(_tmp7, _tmp8)),
<                                          1 - std::max<Scalar>(0, -(((_tmp9) > 0) - ((_tmp9) < 0))));
---
>   const Scalar _tmp4 = _tmp3 + 1;
>   const Scalar _tmp5 = -_tmp1;
>   const Scalar _tmp6 = -std::max<Scalar>(1, std::max<Scalar>(_tmp4, _tmp5));
>   const Scalar _tmp7 = 1 - std::max<Scalar>(0, -(((_tmp5 + _tmp6) > 0) - ((_tmp5 + _tmp6) < 0)));
>   const Scalar _tmp8 = std::min<Scalar>(1, _tmp7);
>   const Scalar _tmp9 = std::min<Scalar>(_tmp7, 1 - std::max<Scalar>(0, _tmp8));
>   const Scalar _tmp10 =
>       std::min<Scalar>(1 - std::max<Scalar>(0, std::max<Scalar>(_tmp8, _tmp9)),
>                        1 - std::max<Scalar>(0, -(((_tmp6 + 1) > 0) - ((_tmp6 + 1) < 0))));
46,47c47,48
<       1 - std::max<Scalar>(0, std::max<Scalar>(_tmp10, std::max<Scalar>(_tmp7, _tmp8))),
<       1 - std::max<Scalar>(0, -(((_tmp3 + _tmp9) > 0) - ((_tmp3 + _tmp9) < 0))));
---
>       1 - std::max<Scalar>(0, std::max<Scalar>(_tmp10, std::max<Scalar>(_tmp8, _tmp9))),
>       1 - std::max<Scalar>(0, -(((_tmp4 + _tmp6) > 0) - ((_tmp4 + _tmp6) < 0))));
53,54c54,55
<   _res(0, 0) = (Scalar(1) / Scalar(2)) * _tmp7 * (1 - _tmp7);
<   _res(1, 0) = (Scalar(1) / Scalar(2)) * _tmp8 * (1 - _tmp8);
---
>   _res(0, 0) = (Scalar(1) / Scalar(2)) * _tmp8 * (1 - _tmp8);
>   _res(1, 0) = (Scalar(1) / Scalar(2)) * _tmp9 * (1 - _tmp9);
  1. Run the C++ test code and evaluate at time=3.01:
    
    #include <fmt/format.h>

include "output/cpp/symforce/sym/test_function.h"

int main() { const double time = 3.01; auto v = sym::TestFunction(time); fmt::print("time = {:>6}, v = {:>6}, {:>6}, {:>6}, {:>6}\n", time, v[0], v[1], v[2], v[3]); return 0; }

5. Observe that the numerical output is non-trivially different:

Mac version (resultant quaternion has zero norm): time = 3.01, v = 0, 0, 0, 0 Linux version (resultant quaternion matches python code): time = 3.01, v = 0, 0, -0.7052531588616179, 0.7089555570807733



NOTE: In this particular case, there may additionally be a bug in `Rot3.from_rotation_matrix` - the epsilon value is unused. This example just highlights how unpredictable behavior emerges from this issue.

**Expected behavior**
Ideally order of operations would be identical on all operating systems.

**Environment**
 - OS and version: Ubuntu 22.04, OSX Ventura
 - Python version: 3.8
 - SymForce Version 0.8
aaron-skydio commented 1 year ago

This is a weird one - we had similar issues in the past when SymEngine was built with different implementations of std::unordered_map. I'm a little surprised we haven't seen this, we should definitely fix

NOTE: In this particular case, there may additionally be a bug in Rot3.from_rotation_matrix - the epsilon value is unused. This example just highlights how unpredictable behavior emerges from this issue.

Yep, good callout - this is not a bug but is definitely surprising - our current implementation of from_rotation_matrix doesn't require an epsilon, but we still have the argument mostly to not break the API. In general we aim for our generated code to be identical everywhere, and treat it as a bug if it's not, regardless of whether evaluated numerical results are meaningfully different