This PR fixes the rounding errors on 32-bit Intel CPUs that were originally reported against 4.4 , but are still present in 5.x.
Fixes: #1192
For the rationale and discussion, please refer to this issue.
There are probably more places in the source where rounding the results properly will make sense, but this patch will at least fix all the unit tests that currently fail on the x87 FPU. Even if it isn't merged, it may still be useful as a future reference when optimizing and refactoring the core.
Type of change
[x] Bug fix (non-breaking change which fixes an issue)
[ ] New feature (non-breaking change which adds functionality)
How Has This Been Tested?
[x] Run unit tests on amd64
[x] Run unit tests on i686
[x] Compare resulting assembly between amd64 and i686
Test Configuration:
Operating System: Debian Linux trixie/sid
CPU: amd64 and i686 (in a 32-bit chroot)
Checklist:
[x] My code follows the style guidelines of this project as described in UltiMaker Meta
Some additional information about what impact is expected from the patch:
std::llrint(f) = round to nearest integer, processes a double or float value and outputs a long long.
On SSE2 CPUs, gcc will emit a CVTSD2SI instruction. With the former code (where the result was always truncated towards 0), a CVTTSD2SI instruction was issued instead. Both instructions have exactly the same performance characteristics, so there is no runtime impact expected on x86_64 type CPUs.
Other CPU architectures may choose a slower implementation due to differences in floating-point unit capabilities. I verified i686, ARMv7-A and aarch64 (ARMv8-A/ARM64) with gcc 13. With -ffast-math, both the aarch64 and i686 machine code directly use fast FPU rounding instructions. Without -ffast-math, a library function call is generated, but the library function does the same thing, so the runtime impact is expected to be low. On ARMv7 a complex library function is called (which may or may not have a fast path for capable CPUs), so the runtime impact can be high in this case.
A CPU and/or C/C++ runtime is in a specific rounding mode for tie-breaking cases (such as close to .5). When not set explicitly, the C99 standard defines FE_TONEAREST as the default, which normally corresponds to standard IEEE-754 round-to-nearest-even behavior. There can be differences between CPU architectures or runtimes. In practice, this would be hardly relevant, because CuraEngine normally doesn't handle extremely small or extremely large floating-point values.
Description
This PR fixes the rounding errors on 32-bit Intel CPUs that were originally reported against 4.4 , but are still present in 5.x.
Fixes: #1192 For the rationale and discussion, please refer to this issue.
There are probably more places in the source where rounding the results properly will make sense, but this patch will at least fix all the unit tests that currently fail on the x87 FPU. Even if it isn't merged, it may still be useful as a future reference when optimizing and refactoring the core.
Type of change
How Has This Been Tested?
Test Configuration:
Checklist: