Open bmwiedemann opened 1 month ago
I guess it's something deep in lcms2 doing asm optimisations based on CPU features. I think the lcms2 mailing list might be a good place to ask if this is a bug there.
On Tue, 08 Oct 2024 00:59:46 -0700 "Bernhard M. Wiedemann" @.***> wrote:
Such diffs can happen when memcpy or strcpy is used on overlapping memory regions. In that case, one of these result files would contain data corruptions. It can also happen if certain floating-point instructions are used that round differently.
Most likely it's rounding combined with compiler optimization. In GCC, `float` usage may cause that. Even in strict C standard mode there is much freedom in how `float` actually behaves. (AFAIK, the only language standard that defines it really strictly is Fortran.) Modern FPU hardware often works faster with `double`, so GCC actually emits double-precision instructions while compiling formally single-precision C code. In contrast, most vector instructions are actually single-precision due to the limited size of registers. Switching on things like AVX may actually switch between `double` and `float`.
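To make the effect concrete, here is a minimal Python sketch (not lcms2 or colord code) that emulates single-precision rounding with `struct` and compares two strategies: rounding to `float` after every addition versus accumulating in `double` and rounding once at the end. The helper names `to_f32`, `sum_f32_steps` and `sum_double_then_round` are made up for illustration, and the input values are chosen only because they sit at the edge of what single precision can represent exactly.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (IEEE double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sum_f32_steps(values):
    """Round to single precision after every addition (strict float arithmetic)."""
    acc = 0.0
    for v in values:
        acc = to_f32(acc + to_f32(v))
    return acc

def sum_double_then_round(values):
    """Accumulate in double precision and round to single only once at the end."""
    acc = 0.0
    for v in values:
        acc += to_f32(v)
    return to_f32(acc)

# 2**24 is the point above which float can no longer represent every integer.
values = [16777216.0, 1.0, 1.0]
print(sum_f32_steps(values))          # 16777216.0
print(sum_double_then_round(values))  # 16777218.0
```

The same inputs give two different single-precision results depending only on where the intermediate rounding happens, which is the kind of variation a change of instruction set could plausibly introduce.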
There is a GCC option that prevents emitting double-precision for `float` variables. Certain algorithms (e.g. Kahan–Babuška summation) don't work properly unless this option is set. Look at `-fexcess-precision`, `-fassociative-math` and others.
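As an illustration of why evaluation order and intermediate precision matter, the following Python sketch shows that floating-point addition is not associative and that plain Kahan summation (the simpler relative of Kahan–Babuška) depends on a compensation term that is algebraically zero. A compiler that is allowed to reassociate could in principle simplify that term away. The function names `naive_sum` and `kahan_sum` are illustrative, and the test data is an assumption chosen to make the effect visible.

```python
import math

def naive_sum(values):
    """Plain left-to-right summation."""
    total = 0.0
    for v in values:
        total += v
    return total

def kahan_sum(values):
    """Kahan compensated summation: carry the rounding error of each step forward."""
    total = 0.0
    comp = 0.0  # running compensation for lost low-order bits
    for v in values:
        y = v - comp
        t = total + y
        comp = (t - total) - y  # algebraically zero, numerically the rounding error
        total = t
    return total

# Addition is not associative in floating point.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False

# Many tiny terms that a naive sum drops completely next to 1.0.
values = [1.0] + [1e-16] * 100_000
exact = math.fsum(values)  # correctly rounded reference
print(abs(naive_sum(values) - exact))
print(abs(kahan_sum(values) - exact))
```

The naive sum never moves away from 1.0 because each individual term is below half a unit in the last place, while the compensated sum stays close to the reference.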
I tried to build both liblcms2 and colord with
-fexcess-precision=standard -fno-associative-math -Wuninitialized -Wmaybe-uninitialized -Wmissing-field-initializers
but there are still diffs left.
Could there be active CPU-detection in the lcms2 runtime?
I'd say that's quite likely, there's some hand-tuned transform code IIRC.
I tried to build lcms2 with -DCMS_DONT_USE_SSE2=1 -DCMS_DONT_USE_FAST_FLOOR=1
but that did not help either.
The discussion in https://github.com/mm2/Little-CMS/issues/465 suggests that it might not be in lcms2. I could not find hints in argyllcms either.
Here is a diff of iccToXml output from colord-1.4.6 variations: https://rb.zq1.de/other/colord/FOGRA27L_coated.icc.xml.diff.txt
How/where are those wtpt and bkpt values computed?
I bisected with qemu's `-cpu` param and found that all the difference comes from AVX2 (which first appeared in Intel Haswell in 2013).
With my reproducibleopensuse tools I do
cputype=Haswell cputype2=Haswell,avx2=off rbk
While working on reproducible builds for openSUSE, I found that our `colord` 1.4.6 build output varied depending on the CPU model. I collected sets from these runs: https://rb.zq1.de/other/colord-cpu-dependent-output.tar.gz
The `2` one was built on an avx-enabled CPU and the `1` one on a `kvm64` model without fancy new instructions.
Such diffs can happen when memcpy or strcpy is used on overlapping memory regions. In that case, one of these result files would contain data corruptions. It can also happen if certain floating-point instructions are used that round differently.
Reproducer
Both of these are stable, so there is no other randomness involved.
I tested that colord-1.4.7 still has the same issue.
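For anyone who wants to see which files actually differ between two result sets, a small Python sketch along these lines may help. It hashes every file in two directory trees and lists the relative paths whose contents differ or that exist on only one side. The function names `hash_tree` and `differing_files` are hypothetical helpers, not part of any existing tool, and the assumption that the unpacked tarball contains directories named `1` and `2` is a guess based on the description above.

```python
import hashlib
from pathlib import Path

def hash_tree(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def differing_files(dir_a, dir_b):
    """Return sorted relative paths that differ in content or exist on one side only."""
    a, b = hash_tree(dir_a), hash_tree(dir_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

Assuming that layout, `differing_files("1", "2")` would return the list of paths to inspect further, for example with iccToXml for the `.icc` profiles.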