Describe the bug Operations are placed in a different order on different platforms. This can result in unexpected numerical issues in some cases.

Testing details:

Linux: Ubuntu 22.04, Python 3.8.16, x86
Mac: OSX Ventura 13.3.1, Python 3.8.16, Apple M2/ARM
Symforce v0.8.0, installed from pip

To Reproduce

Run the following script to generate a test function on both OSX and Ubuntu:
```
from pathlib import Path
```

import symforce symforce.set_symbolic_api('symengine') symforce.set_epsilon_to_number(1.0e-16)

from symforce import geo from symforce import symbolic as sm from symforce.codegen import Codegen, CppConfig from symforce import typing as T

def test_function(time: T.Scalar) -> geo.V4: rate = sm.pi / 6 angle = rate * time x_axis = geo.V3([-sm.cos(angle), -sm.sin(angle), 0.0]) z_axis = geo.V3([0, 0, 1]) y_axis = z_axis.cross(x_axis)

world_R_body = geo.Matrix33.column_stack(x_axis, y_axis, z_axis)
world_R_body_quat = geo.Rot3.from_rotation_matrix(R=world_R_body)
quat_elements = geo.V4(world_R_body_quat.to_storage())

world_R_body_subbed = world_R_body.subs(time, 3.01).evalf()
quat_elements_subbed = quat_elements.subs(time, 3.01).evalf()
print(world_R_body_subbed)
print(quat_elements_subbed)
return quat_elements

if name == 'main': cg = Codegen.function(func=test_function, config=CppConfig()) path = Path(file).parent.absolute() cg.generate_function(output_dir=path / 'output')

2. Observe the output of the python script (the values of `world_R_body_subbed` and `quat_elements_subbed`)

[0.00523596383141919, 0.999986292247427, 0.0] [-0.999986292247427, 0.00523596383141919, 0.0] [0.0, 0.0, 1.0]

[0.0] [0.0] [-0.705253158861618] [0.708955557080773]

2. Observe that the order of operations is different on the two platforms. This is the exact diff:
```diff
35,42c35,43
<   const Scalar _tmp4 = -_tmp1;
<   const Scalar _tmp5 = -std::max<Scalar>(1, std::max<Scalar>(_tmp4, _tmp3 + 1));
<   const Scalar _tmp6 = 1 - std::max<Scalar>(0, -(((_tmp4 + _tmp5) > 0) - ((_tmp4 + _tmp5) < 0)));
<   const Scalar _tmp7 = std::min<Scalar>(1, _tmp6);
<   const Scalar _tmp8 = std::min<Scalar>(_tmp6, 1 - std::max<Scalar>(0, _tmp7));
<   const Scalar _tmp9 = _tmp5 + 1;
<   const Scalar _tmp10 = std::min<Scalar>(1 - std::max<Scalar>(0, std::max<Scalar>(_tmp7, _tmp8)),
<                                          1 - std::max<Scalar>(0, -(((_tmp9) > 0) - ((_tmp9) < 0))));
---
>   const Scalar _tmp4 = _tmp3 + 1;
>   const Scalar _tmp5 = -_tmp1;
>   const Scalar _tmp6 = -std::max<Scalar>(1, std::max<Scalar>(_tmp4, _tmp5));
>   const Scalar _tmp7 = 1 - std::max<Scalar>(0, -(((_tmp5 + _tmp6) > 0) - ((_tmp5 + _tmp6) < 0)));
>   const Scalar _tmp8 = std::min<Scalar>(1, _tmp7);
>   const Scalar _tmp9 = std::min<Scalar>(_tmp7, 1 - std::max<Scalar>(0, _tmp8));
>   const Scalar _tmp10 =
>       std::min<Scalar>(1 - std::max<Scalar>(0, std::max<Scalar>(_tmp8, _tmp9)),
>                        1 - std::max<Scalar>(0, -(((_tmp6 + 1) > 0) - ((_tmp6 + 1) < 0))));
46,47c47,48
<       1 - std::max<Scalar>(0, std::max<Scalar>(_tmp10, std::max<Scalar>(_tmp7, _tmp8))),
<       1 - std::max<Scalar>(0, -(((_tmp3 + _tmp9) > 0) - ((_tmp3 + _tmp9) < 0))));
---
>       1 - std::max<Scalar>(0, std::max<Scalar>(_tmp10, std::max<Scalar>(_tmp8, _tmp9))),
>       1 - std::max<Scalar>(0, -(((_tmp4 + _tmp6) > 0) - ((_tmp4 + _tmp6) < 0))));
53,54c54,55
<   _res(0, 0) = (Scalar(1) / Scalar(2)) * _tmp7 * (1 - _tmp7);
<   _res(1, 0) = (Scalar(1) / Scalar(2)) * _tmp8 * (1 - _tmp8);
---
>   _res(0, 0) = (Scalar(1) / Scalar(2)) * _tmp8 * (1 - _tmp8);
>   _res(1, 0) = (Scalar(1) / Scalar(2)) * _tmp9 * (1 - _tmp9);

Run the C++ test code and evaluate at time=3.01:
```
#include <fmt/format.h>
```

include "output/cpp/symforce/sym/test_function.h"

int main() { const double time = 3.01; auto v = sym::TestFunction(time); fmt::print("time = {:>6}, v = {:>6}, {:>6}, {:>6}, {:>6}\n", time, v[0], v[1], v[2], v[3]); return 0; }

5. Observe that the numerical output is non-trivially different:

Mac version (resultant quaternion has zero norm): time = 3.01, v = 0, 0, 0, 0 Linux version (resultant quaternion matches python code): time = 3.01, v = 0, 0, -0.7052531588616179, 0.7089555570807733



NOTE: In this particular case, there may additionally be a bug in `Rot3.from_rotation_matrix` - the epsilon value is unused. This example just highlights how unpredictable behavior emerges from this issue.

**Expected behavior**
Ideally order of operations would be identical on all operating systems.

**Environment**
 - OS and version: Ubuntu 22.04, OSX Ventura
 - Python version: 3.8
 - SymForce Version 0.8

symforce-org / symforce

Generated operations are in a different order on different platforms #343

include "output/cpp/symforce/sym/test_function.h"