hughsie / colord

Making color management just work
GNU General Public License v2.0

ICC vary from CPU-type #174

Open bmwiedemann opened 1 month ago

bmwiedemann commented 1 month ago

While working on reproducible builds for openSUSE, I found that our colord 1.4.6 build output varied depending on the CPU model.

I collected sets from these runs: https://rb.zq1.de/other/colord-cpu-dependent-output.tar.gz

The set named 2 was built on an AVX-enabled CPU and the set named 1 on a kvm64 model without fancy new instructions.
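When collecting such sets, it can help to record which instruction-set extensions each build host advertises. The following is only a sketch, assuming a Linux host where /proc/cpuinfo has a "flags" line; the helper name has_cpu_flag is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of colord or any build tooling):
# succeed if the CPU advertises the given flag.
# Usage: has_cpu_flag FLAG [CPUINFO_FILE]
has_cpu_flag() {
    flag=$1
    info=${2:-/proc/cpuinfo}   # assumption: Linux-style cpuinfo layout
    # Take the first "flags" line, put one token per line,
    # then require an exact whole-line match so "avx" does not match "avx2".
    grep -m1 '^flags' "$info" | tr ' \t' '\n\n' | grep -qxF -- "$flag"
}
```

For example, `has_cpu_flag avx2 && echo "AVX2 present"` could be logged next to each build's checksums.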

Such diffs can happen when memcpy or strcpy is used on overlapping memory regions. In that case, one of these result files would contain data corruptions. It can also happen if certain floating-point instructions are used that round differently.

Reproducer

# on old CPU:
cd /home/abuild/rpmbuild/BUILD/colord-1.4.6/x86_64-suse-linux && client/cd-create-profile --output=data/profiles/FOGRA28L_webcoated.icc data/profiles/FOGRA28L_webcoated.iccprofile.xml && md5sum data/profiles/FOGRA28L_webcoated.icc data/profiles/FOGRA28L_webcoated.iccprofile.xml
307d9fb02f1c53d31fb0edc790536b89  data/profiles/FOGRA28L_webcoated.icc
4afa623cc7b59915edec432c87b732b0  data/profiles/FOGRA28L_webcoated.iccprofile.xml

# on new CPU
cd /home/abuild/rpmbuild/BUILD/colord-1.4.6/x86_64-suse-linux && client/cd-create-profile --output=data/profiles/FOGRA28L_webcoated.icc data/profiles/FOGRA28L_webcoated.iccprofile.xml && md5sum data/profiles/FOGRA28L_webcoated.icc data/profiles/FOGRA28L_webcoated.iccprofile.xml
0a20906f8db699cc17b166729570a26c  data/profiles/FOGRA28L_webcoated.icc
4afa623cc7b59915edec432c87b732b0  data/profiles/FOGRA28L_webcoated.iccprofile.xml

Both of these are stable, so there is no other randomness involved.
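The stability check can be scripted by repeating the same command on one machine and comparing checksums between runs. This is a minimal sketch, not the tooling actually used here; the function name is_stable and its argument order are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: run a command several times and report whether
# the checksum of the file it produces ever changes.
# Usage: is_stable RUNS OUTPUT_FILE COMMAND [ARGS...]
# Returns 0 if all runs agree, 1 if the output changed, 2 if the command failed.
is_stable() {
    runs=$1
    out=$2
    shift 2
    first=""
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" || return 2                          # the build step itself failed
        sum=$(md5sum < "$out" | cut -d' ' -f1)    # keep only the hash
        if [ -z "$first" ]; then
            first=$sum                            # remember the first run
        elif [ "$sum" != "$first" ]; then
            return 1                              # output differs between runs
        fi
        i=$((i + 1))
    done
    return 0
}
```

With the reproducer above, COMMAND would be the client/cd-create-profile invocation and OUTPUT_FILE the generated data/profiles/FOGRA28L_webcoated.icc.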

I tested that colord-1.4.7 still has the same issue.

hughsie commented 1 month ago

I guess it's something deep in lcms2 doing asm optimisations based on CPU features. I think the lcms2 mailing list might be a good place to ask if this is a bug there.

agalakhov commented 1 month ago

On Tue, 08 Oct 2024 00:59:46 -0700 "Bernhard M. Wiedemann" wrote:

> Such diffs can happen when memcpy or strcpy is used on overlapping memory regions. In that case, one of these result files would contain data corruptions. It can also happen if certain floating-point instructions are used that round differently.

Most likely it's rounding combined with compiler optimization. In GCC, float usage may cause that. Even in strict C standard mode there is a lot of freedom in how float actually behaves. (AFAIK, the only language standard that defines it really strictly is Fortran.) Modern FPU hardware often works faster with double, so GCC actually emits double-precision instructions while compiling formally single-precision C code. In contrast, most vector instructions are actually single-precision due to the limited size of registers. Switching on things like AVX may actually switch between double and float.

There is a GCC option that prevents emitting double-precision for float variables. Certain algorithms (e.g. Kahan–Babuška summation) don't work properly unless this option is set. Look at -fexcess-precision, -fassociative-math and others.

bmwiedemann commented 1 month ago

I tried to build both liblcms2 and colord with -fexcess-precision=standard -fno-associative-math -Wuninitialized -Wmaybe-uninitialized -Wmissing-field-initializers but there are still diffs left. Could there be active CPU-detection in the lcms2 runtime?

hughsie commented 1 month ago

> Could there be active CPU-detection in the lcms2 runtime?

I'd say that's quite likely, there's some hand-tuned transform code IIRC.

bmwiedemann commented 1 month ago

I tried to build lcms2 with -DCMS_DONT_USE_SSE2=1 -DCMS_DONT_USE_FAST_FLOOR=1 but that did not help either.

bmwiedemann commented 5 days ago

The discussion in https://github.com/mm2/Little-CMS/issues/465 suggests that it might not be in lcms2. I could not find hints in argyllcms either.

Here is a diff of iccToXml output from colord-1.4.6 variations: https://rb.zq1.de/other/colord/FOGRA27L_coated.icc.xml.diff.txt
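To see where the two binary profiles diverge without going through iccToXml, the differing byte positions can be listed directly. This is a rough sketch built on cmp and awk; the function name differing_offsets is an assumption. The offsets it prints are 0-based, so they can be compared with the offsets stored in the profile's tag table (for instance those of wtpt and bkpt).

```shell
#!/bin/sh
# Hypothetical helper: print the 0-based hex offset of every byte that
# differs between two files, one offset per line.
# Usage: differing_offsets FILE_A FILE_B
differing_offsets() {
    # cmp -l prints "<1-based offset> <octal byte A> <octal byte B>" per
    # differing byte and exits 1 when the files differ; treat that as success.
    { cmp -l "$1" "$2" || [ $? -eq 1 ]; } | awk '{ printf "0x%x\n", $1 - 1 }'
}
```

Running it on the two FOGRA28L_webcoated.icc variants would show whether the changes are confined to a few tags or spread through the file.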

How/where are those wtpt and bkpt values computed?

bmwiedemann commented 5 days ago

I bisected with qemu's -cpu parameter and found that all the difference comes from AVX2 (which first appeared in Intel Haswell in 2013).

With my reproducibleopensuse tools I do: cputype=Haswell cputype2=Haswell,avx2=off rbk
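A bisection like this can be mechanized by generating one CPU spec per feature, each with exactly that feature switched off, in the same "model,feature=off" form as above. This is only a sketch: the function name cpu_variants is made up, and the feature list is whatever the caller passes in.

```shell
#!/bin/sh
# Hypothetical helper: for a base CPU model, print one spec per feature
# with that single feature disabled.
# Usage: cpu_variants MODEL FEATURE [FEATURE...]
cpu_variants() {
    model=$1
    shift
    for feature in "$@"; do
        printf '%s,%s=off\n' "$model" "$feature"
    done
}
```

For example, `cpu_variants Haswell avx2 fma` prints Haswell,avx2=off and Haswell,fma=off, each of which could then be used as the second CPU type in a comparison build. Only avx2 comes from this thread; fma is an illustrative feature name.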