VNNI support seems to require something later than GCC 11.1

musicinmybrain commented 6 months ago

I’m trying to update the EPEL9 package for libdeflate to 1.20. On x86_64, it fails to build with errors like:

[ 28%] Building C object CMakeFiles/libdeflate_shared.dir/lib/gzip_decompress.c.o
/usr/bin/gcc -DLIBDEFLATE_DLL -Dlibdeflate_shared_EXPORTS -I/builddir/build/BUILD/libdeflate-1.20 -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1  -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -O2 -DNDEBUG -fPIC -fvisibility=hidden -Wall -Wdeclaration-after-statement -Wimplicit-fallthrough -Wmissing-field-initializers -Wmissing-prototypes -Wpedantic -Wshadow -Wstrict-prototypes -Wundef -Wvla -std=gnu99 -MD -MT CMakeFiles/libdeflate_shared.dir/lib/gzip_compress.c.o -MF CMakeFiles/libdeflate_shared.dir/lib/gzip_compress.c.o.d -o CMakeFiles/libdeflate_shared.dir/lib/gzip_compress.c.o -c /builddir/build/BUILD/libdeflate-1.20/lib/gzip_compress.c
/usr/bin/gcc -DLIBDEFLATE_DLL -Dlibdeflate_shared_EXPORTS -I/builddir/build/BUILD/libdeflate-1.20 -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1  -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -O2 -DNDEBUG -fPIC -fvisibility=hidden -Wall -Wdeclaration-after-statement -Wimplicit-fallthrough -Wmissing-field-initializers -Wmissing-prototypes -Wpedantic -Wshadow -Wstrict-prototypes -Wundef -Wvla -std=gnu99 -MD -MT CMakeFiles/libdeflate_shared.dir/lib/gzip_decompress.c.o -MF CMakeFiles/libdeflate_shared.dir/lib/gzip_decompress.c.o.d -o CMakeFiles/libdeflate_shared.dir/lib/gzip_decompress.c.o -c /builddir/build/BUILD/libdeflate-1.20/lib/gzip_decompress.c
{standard input}: Assembler messages:
{standard input}:8752: Error: unsupported instruction `vpdpbusd'
{standard input}:8757: Error: unsupported instruction `vpdpbusd'
{standard input}:8773: Error: unsupported instruction `vpdpbusd'
{standard input}:8779: Error: unsupported instruction `vpdpbusd'
{standard input}:8809: Error: unsupported instruction `vpdpbusd'
{standard input}:8995: Error: unsupported instruction `vpdpbusd'
{standard input}:9000: Error: unsupported instruction `vpdpbusd'
{standard input}:9450: Error: unsupported instruction `vpdpbusd'
{standard input}:9457: Error: unsupported instruction `vpdpbusd'
{standard input}:9464: Error: unsupported instruction `vpdpbusd'
{standard input}:9475: Error: unsupported instruction `vpdpbusd'
{standard input}:9486: Error: unsupported instruction `vpdpbusd'
{standard input}:9498: Error: unsupported instruction `vpdpbusd'
{standard input}:9564: Error: unsupported instruction `vpdpbusd'
{standard input}:9576: Error: unsupported instruction `vpdpbusd'
{standard input}:9871: Error: unsupported instruction `vpdpbusd'
{standard input}:9879: Error: unsupported instruction `vpdpbusd'
gmake[2]: *** [CMakeFiles/libdeflate_shared.dir/build.make:149: CMakeFiles/libdeflate_shared.dir/lib/adler32.c.o] Error 1

I see that the VNNI implementation is gated to build only on GCC 11.1 or later in

https://github.com/ebiggers/libdeflate/blob/275aa5141db6eda3587214e0f1d3a134768f557d/lib/x86/adler32_impl.h#L56

but RHEL9 has GCC 11.4.1.

There seems to be a related report here (but I don’t have access to the full report either).

It’s frustratingly hard to find a real citation for when GCC added support for this instruction. Maybe 12.1 instead of 11.1?

ebiggers commented 6 months ago

What is your binutils version? The error is coming from the assembler, not the compiler.

libdeflate aims to keep its source code buildable without a configuration step, and therefore when deciding whether to use intrinsics it only checks the compiler version, not the assembler version.

For the assembler it just assumes that the toolchain has an assembler that can assemble all the instructions that the compiler can produce. That is always supposed to be the case, especially considering that support for new instructions regularly gets released in binutils before their intrinsics get released in gcc.

If there's a specific popular distro version that's being problematic by pairing an older binutils with a newer gcc, I can consider bumping up the gcc version check to exclude that. Though it's annoying having to work around broken toolchains like this.

ebiggers commented 6 months ago

It looks like AVX-VNNI support was released in binutils 2.36, which was released after the gcc support (in 11.1). That's unusual for a new CPU feature. "CentOS Stream 9" has binutils 2.35.2, which I assume is the same as RHEL9.

I guess I'll bump the gcc prerequisite for AVX-VNNI up to 12.1 as a workaround for the slow binutils release.
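As a rough illustration of the compiler-version gating discussed above, here is a minimal sketch in C. The macro names (`VERSION_AT_LEAST`, `GCC_PREREQ`, `EXAMPLE_HAVE_AVX_VNNI_INTRIN`) are hypothetical and chosen for this example; they are not claimed to match libdeflate's actual source. Only `__GNUC__`, `__GNUC_MINOR__`, and `__clang__` are real predefined compiler macros.

```c
/*
 * Hypothetical helper (not libdeflate's actual macro): true when the
 * version (major, minor) is at least (req_major, req_minor).
 */
#define VERSION_AT_LEAST(major, minor, req_major, req_minor) \
	((major) > (req_major) || \
	 ((major) == (req_major) && (minor) >= (req_minor)))

/*
 * Compare against the real GCC version macros. Clang also defines
 * __GNUC__, so it is excluded here to keep the check GCC-specific.
 */
#if defined(__GNUC__) && !defined(__clang__)
#  define GCC_PREREQ(req_major, req_minor) \
	VERSION_AT_LEAST(__GNUC__, __GNUC_MINOR__, req_major, req_minor)
#else
#  define GCC_PREREQ(req_major, req_minor) 0
#endif

/*
 * Assumed gate for this example: requiring GCC 12.1 instead of 11.1
 * excludes toolchains such as GCC 11.4 paired with binutils 2.35.2,
 * whose assembler rejects vpdpbusd. Only the compiler version is
 * checked; the assembler version is never inspected.
 */
#if GCC_PREREQ(12, 1)
#  define EXAMPLE_HAVE_AVX_VNNI_INTRIN 1
#else
#  define EXAMPLE_HAVE_AVX_VNNI_INTRIN 0
#endif

/* Runtime-visible result of the compile-time gate (0 or 1). */
static int example_have_avx_vnni_intrin(void)
{
	return EXAMPLE_HAVE_AVX_VNNI_INTRIN;
}
```

With a gate like this, GCC 11.4 satisfies an 11.1 prerequisite but not a 12.1 one, so the AVX-VNNI code path would simply not be compiled on that toolchain.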

musicinmybrain commented 6 months ago

> It looks like AVX-VNNI support was released in binutils 2.36, which was released after the gcc support (in 11.1). That's unusual for a new CPU feature. "CentOS Stream 9" has binutils 2.35.2, which I assume is the same as RHEL9.

Yes, that matches what I see. Thank you for looking into this.

> I guess I'll bump the gcc prerequisite for AVX-VNNI up to 12.1 as a workaround for the slow binutils release.

Seems logical. Thank you for following up, and in particular for the detailed comment in 857c97aa07a87bac8c3811b6d4468dfa2f025b4c.

ebiggers / libdeflate

VNNI support seems to require something later than GCC 11.1 #365