Round some results to nearest int instead of truncating

Ultimaker / CuraEngine

Powerful, fast and robust engine for converting 3D models into g-code instructions for 3D printers. It is part of the larger open source project Cura.

GNU Affero General Public License v3.0

1.67k stars 880 forks source link

Description

This PR fixes the rounding errors on 32-bit Intel CPUs that were originally reported against 4.4 , but are still present in 5.x.

Fixes: #1192 For the rationale and discussion, please refer to this issue.

There are probably more places in the source where rounding the results properly will make sense, but this patch will at least fix all the unit tests that currently fail on the x87 FPU. Even if it isn't merged, it may still be useful as a future reference when optimizing and refactoring the core.

Thanks for merging! :+1:

Some additional information about what impact is expected from the patch:

std::llrint(f) = round to nearest integer, processes a double or float value and outputs a long long.
On SSE2 CPUs, gcc will emit a CVTSD2SI instruction. With the former code (where the result was always truncated towards 0), a CVTTSD2SI instruction was issued instead. Both instructions have exactly the same performance characteristics, so there is no runtime impact expected on x86_64 type CPUs.
Other CPU architectures may choose a slower implementation due to differences in floating-point unit capabilities. I verified i686, ARMv7-A and aarch64 (ARMv8-A/ARM64) with gcc 13. With -ffast-math, both the aarch64 and i686 machine code directly use fast FPU rounding instructions. Without -ffast-math, a library function call is generated, but the library function does the same thing, so the runtime impact is expected to be low. On ARMv7 a complex library function is called (which may or may not have a fast path for capable CPUs), so the runtime impact can be high in this case.
A CPU and/or C/C++ runtime is in a specific rounding mode for tie-breaking cases (such as close to .5). When not set explicitly, the C99 standard defines FE_TONEAREST as the default, which normally corresponds to standard IEEE-754 round-to-nearest-even behavior. There can be differences between CPU architectures or runtimes. In practice, this would be hardly relevant, because CuraEngine normally doesn't handle extremely small or extremely large floating-point values.

Ultimaker / CuraEngine

Round some results to nearest int instead of truncating #1914

Description

Type of change

How Has This Been Tested?

Checklist: