libressl / portable

LibreSSL Portable itself. This includes the build scaffold and compatibility layer that builds portable LibreSSL from the OpenBSD source code. Pull requests or patches sent to tech@openbsd.org are welcome.
https://www.libressl.org
1.36k stars 267 forks source link

3.8.3, 3.9.0 mingw-w64 x86_64 clang ASM CET builds crash on startup #1015

Closed vszakats closed 6 months ago

vszakats commented 7 months ago

Issue #907 returned with 3.8.3 and 3.9.0.

Building with CET enabled with mingw-w64 work fine with 3.8.2:

CFLAGS=-fcf-protection=full LDFLAGS=-Wl,-Xlink=-cetcompat

With 3.8.3, it's crashing the same way as 3.8.1 did.

curl.exe exhibiting the issue: https://curl.se/windows/dl-8.6.0_5/curl-8.6.0_5-win64-mingw.zip

vszakats commented 7 months ago

What's happening is that the CFLAGS option makes the compiler define __CET__, which makes LibreSSL #include cet.h, which in turn sets the macro _CET_ENDBR to endbr32, and LibreSSL uses that macro now where it used a hard-coded endbr32 previously (in 3.8.1), in this case effectively ending up with the same code that caused the crash in 3.8.1. Then crash on initialization. This is also true for LibreSSL's openssl.exe.

edit: correction: in 3.8.1 the CPU instruction causing the crash was endbr64 (not endbr32).

Changing #if defined(__CET__) to #if defined(__CET__) && !defined(__MINGW32__) fixes it.

A better fix is to exclude the __CET__ logic from *mingw64*.S ASM sources (15 files in total).

Regression from: https://github.com/libressl/openbsd/commit/c10c5b524e3121f42c9239d737dd7f975638f378

This patch over the LibreSSL release tarball 3.8.3 (should also apply to 3.9.0) fixes the problem: https://github.com/curl/curl-for-win/blob/acd332ff504bacb6be97331af70abc4d7318ef1b/libressl.patch

Hrxn commented 7 months ago

@vszakats

Is that here the reason that curl 8.6.0_5 is not working at all for me (Win x64)?

vszakats commented 7 months ago

@Hrxn: Yes, it's because of this issue.

busterb commented 7 months ago

We should add this flag option to the CI tests as well. Thanks for the report.

I'm curious about one thing though; previous releases used a hard-coded endbr64, but it sounds like you're saying that mingw-w64 resolves _CET_ENDBR to endbr32?

vszakats commented 7 months ago

@busterb: Yes, that's what's happening:

t.S:

#if defined(__CET__)
#include <cet.h>
#else
#define _CET_ENDBR
#endif

_CET_ENDBR

llvm-mingw 20240308 with LLVM 18.1.1:

$ x86_64-w64-mingw32-clang -E -fcf-protection=full t.S

# 1 "t.S"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 404 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "t.S" 2

# 1 "/path/to/llvm-mingw/lib/clang/18/include/cet.h" 1 3
# 44 "/path/to/llvm-mingw/lib/clang/18/include/cet.h" 3
 .pushsection ".note.gnu.property", "a"
 .p2align 2
 .long 1f - 0f
 .long 4f - 1f

 .long 5
0:                                                                               
 .asciz "GNU"                                                                    
1:                                                                               
 .p2align 2                                                                      

 .long 0xc0000002                                                                
 .long 3f - 2f                                                                                    
2:                                                                                                

 .long 3                                                                                          
3:                                                                                                
 .p2align 2                                                                                       
4:                                                                                                
 .popsection                                                                                        
# 3 "t.S" 2

endbr32

gcc:

$ x86_64-w64-mingw32-gcc -E -fcf-protection=full t.S

# 0 "t.S"
# 0 "<built-in>"
# 0 "<command-line>"
# 1 "t.S"

# 1 "/usr/local/Cellar/mingw-w64/11.0.1/toolchain-x86_64/lib/gcc/x86_64-w64-mingw32/13.2.0/include/cet.h" 1 3 4
# 3 "t.S" 2

endbr64

llvm cet.h: https://github.com/llvm/llvm-project/blob/llvmorg-18.1.1/clang/lib/Headers/cet.h#L15-L35

gcc cet.h: https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/i386/cet.h;hb=releases/gcc-13.2.0#l33

botovq commented 7 months ago

Smells like that LP64 vs LLP64 again. LLVM conditions on #ifdef __LP64__ and I would expect it should be doing #if defined(__LP64__) || defined(__LLP64__) instead. Not sure if there is an easy way of fixing this mess up in our compat cet.h.

vszakats commented 7 months ago

Oops, I missed that in 3.8.1 we had endbr64, but endbr32 in 3.8.3. Yet, same crash with both.

Made a test and #if defined(__LP64__) || defined(__LLP64__) doesn't fix llvm to emit endbr64 because the compiler defines neither of these symbols for mingw-w64 x64 targets.

Made another test with _CET_ENDBR set to endbr64 by patching cet.h. This would normally fail (see 3.8.1), but it worked due to crypto/CMakeLists.txt having the line add_definitions(-Dendbr64=).

This was a little unexpected, but extending this hack could also zap the mis-emitted endbr32:

--- a/crypto/CMakeLists.txt
+++ b/crypto/CMakeLists.txt
@@ -194,6 +194,7 @@
        cpuid-mingw64-x86_64.S
    )
    add_definitions(-Dendbr64=)
+   add_definitions(-Dendbr32=)  # zap opcode emitted by llvm cet.h
    add_definitions(-DAES_ASM)
    add_definitions(-DBSAES_ASM)
    add_definitions(-DVPAES_ASM)

Will not win a beauty contest.

botovq commented 7 months ago

Did your patching of cet.h also set __PROPERTY_ALIGN to 3?

vszakats commented 7 months ago

Just in an isolated test, while trying to figure out why endbr64 is disappearing from the generated code. Would that make a difference considering -Dendbr64=?

vszakats commented 7 months ago

As a blind shot I patched cet.h to emit endbr64 and __PROPERTY_ALIGN=3, and dropped -Dendbr64= from the build script. The resulting binary (curl.exe) also crashed right upon start with "unhandled page fault" (under WINE), but as a first time in my tests, this was followed by a stack dump.

So, at least this combination does not make ASM code work.

What's curious though is that all C generated code use this opcode no problem:

0000000000000000 <OPENSSL_init_crypto>:
   0:   f3 0f 1e fa             endbr64
   4:   56                      push   %rsi
   5:   57                      push   %rdi
[...]

Then noticed that endbr64 is also pushed into the .ctors section.

Deleting it, magically fixed the startup crash and makes curl.exe run with no issues, with endbr64 in prologues:

--- a/crypto/CMakeLists.txt
+++ b/crypto/CMakeLists.txt
@@ -193,7 +193,6 @@
        whrlpool/wp-mingw64-x86_64.S
        cpuid-mingw64-x86_64.S
    )
-   add_definitions(-Dendbr64=)
    add_definitions(-DAES_ASM)
    add_definitions(-DBSAES_ASM)
    add_definitions(-DVPAES_ASM)
--- a/crypto/cpuid-mingw64-x86_64.S
+++ b/crypto/cpuid-mingw64-x86_64.S
@@ -8,8 +8,6 @@

 .section   .ctors
-_CET_ENDBR
-   .p2align    3
    .quad   OPENSSL_cpuid_setup

(works both with and without .p2align 3 there)

Then I repeated the build with a factory cet.h and curl.exe works just as good (with endbr32 and property align as it is.)

This suggests that the opcode present in the constructor stub might be the root cause. It'd also be interesting to verify this in MASM code (at .CRT$XCU), and reading up about reasons.

edit: IIRC .ctors is supposed to be a list of pointers to constructor functions, not actual CPU code, same for MSVC I think. That explains the crash, happening when the startup code tries to call into the opcode as an address.

edit: test build and binaries with the above patch in place: https://github.com/curl/curl-for-win/actions/runs/8358541620/job/22880062249

edit: For the apparently unrelated CET issues endbr32 and __PROPERTY_ALIGN on Windows, the local cet.h wrapper would help indeed.

vszakats commented 7 months ago

Opened PR to fix ENDBR slipping into init sections: https://github.com/libressl/openbsd/pull/149

With that fixed, this patch fixes the llvm cet.h endbr32 mis-use:

--- a/crypto/CMakeLists.txt
+++ b/crypto/CMakeLists.txt
@@ -193,7 +193,7 @@
        whrlpool/wp-mingw64-x86_64.S
        cpuid-mingw64-x86_64.S
    )
-   add_definitions(-Dendbr64=)
+   add_definitions(-Dendbr32=endbr64)
    add_definitions(-DAES_ASM)
    add_definitions(-DBSAES_ASM)
    add_definitions(-DVPAES_ASM)

Opened #1032 for that.

With both patches, CET will be enabled in ASM functions with MinGW.

busterb commented 7 months ago

I want to make sure we are testing for this case in the future. In my environment, configuring like so LDFLAGS="-Wl,-Xlink=-cetcompat" CFLAGS="-fcf-protection=full" ./configure --host=x86_64-w64-mingw32 yields this linker error:

configure:3851: x86_64-w64-mingw32-gcc -fcf-protection=full  -Wl,-Xlink=-cetcompat conftest.c  >&5
/usr/bin/x86_64-w64-mingw32-ld: Error: unable to disambiguate: -Xlink=-cetcompat (did you mean --Xlink=-cetcompat ?)
$ x86_64-w64-mingw32-gcc -v
Using built-in specs.
COLLECT_GCC=x86_64-w64-mingw32-gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-w64-mingw32/13-win32/lto-wrapper
Target: x86_64-w64-mingw32
Configured with: ../../src/configure --build=x86_64-linux-gnu --prefix=/usr --includedir='/usr/include' --mandir='/usr/share/man' --infodir='/usr/share/info' --sysconfdir=/etc --localstatedir=/var --disable-option-checking --disable-silent-rules --libdir='/usr/lib/x86_64-linux-gnu' --libexecdir='/usr/lib/x86_64-linux-gnu' --disable-maintainer-mode --disable-dependency-tracking --prefix=/usr --enable-shared --enable-static --disable-multilib --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --libdir=/usr/lib --enable-libstdcxx-time=yes --with-tune=generic --with-headers --enable-version-specific-runtime-libs --enable-fully-dynamic-string --enable-libgomp --enable-languages=c,c++,fortran,objc,obj-c++,ada --enable-lto --enable-threads=win32 --program-suffix=-win32 --program-prefix=x86_64-w64-mingw32- --target=x86_64-w64-mingw32 --with-as=/usr/bin/x86_64-w64-mingw32-as --with-ld=/usr/bin/x86_64-w64-mingw32-ld --enable-libatomic --enable-libstdcxx-filesystem-ts=yes --enable-dependency-tracking SED=/bin/sed
Thread model: win32
Supported LTO compression algorithms: zlib zstd
gcc version 13-win32 (GCC)

$ x86_64-w64-mingw32-ld -v
GNU ld (GNU Binutils) 2.42

Any more details on the build environment you can provide here?

vszakats commented 7 months ago

Here's the live build command that triggers it, for llvm/clang: https://github.com/curl/curl-for-win/actions/runs/8448014961/job/23139331181#step:3:4367

Full command:

cmake -B _x64-win-ucrt-bld -DCMAKE_SYSTEM_NAME=Windows -Wno-dev 
-DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_MESSAGE=NEVER 
-DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_SYSROOT=/usr/x86_64-w64-mingw32 
-DCMAKE_C_COMPILER=clang-17 -DCMAKE_C_COMPILER_TARGET=x86_64-w64-mingw32 
-DCMAKE_SYSTEM_PROCESSOR=x86_64-w64-mingw32 -DBUILD_SHARED_LIBS=OFF -DLIBRESSL_APPS=OFF 
-DLIBRESSL_TESTS=OFF
'-DCMAKE_C_FLAGS= -Wno-unused-command-line-argument  -fcf-protection=full 
-fno-unwind-tables -fno-asynchronous-unwind-tables -ffunction-sections -fdata-sections  -D_WIN32_WINNT=0x0600 
-D_UCRT -ffile-prefix-map=/home/runner/work/curl-for-win/curl-for-win/libressl= -Wa,--noexecstack  -DNDEBUG 
-DS2N_BN_HIDE_SYMBOLS  -Wl,-lucrt -Wl,-Xlink=-cetcompat -Wl,--gc-sections -Wl,--icf=all 
-L/usr/lib/gcc/x86_64-w64-mingw32/12-win32 -fuse-ld=lld-17 -Wl,-s -static-libgcc'

Without some things likely irrelevant for this:

cmake -B _bld -DCMAKE_SYSTEM_NAME=Windows -DCMAKE_BUILD_TYPE=Release 
-DCMAKE_INSTALL_PREFIX=/usr 
-DCMAKE_SYSROOT=/usr/x86_64-w64-mingw32
-DCMAKE_C_COMPILER=clang-17 
-DCMAKE_C_COMPILER_TARGET=x86_64-w64-mingw32
-DCMAKE_SYSTEM_PROCESSOR=x86_64-w64-mingw32 
'-DCMAKE_C_FLAGS= -fcf-protection=full -Wl,-Xlink=-cetcompat -Wl,--icf=all 
-L/usr/lib/gcc/x86_64-w64-mingw32/12-win32 -fuse-ld=lld-17'

for gcc: https://github.com/curl/curl-for-win/actions/runs/8448014961/job/23139330995#step:3:4227

Full command:

cmake -B _x64-win-ucrt-bld -DCMAKE_SYSTEM_NAME=Windows -Wno-dev -DCMAKE_BUILD_TYPE=Release 
-DCMAKE_INSTALL_MESSAGE=NEVER -DCMAKE_INSTALL_PREFIX=/usr 
-DCMAKE_C_COMPILER=x86_64-w64-mingw32-gcc -DCMAKE_C_COMPILER_TARGET=x86_64-w64-mingw32 
-DCMAKE_SYSTEM_PROCESSOR=x86_64-w64-mingw32 -DBUILD_SHARED_LIBS=OFF -DLIBRESSL_APPS=OFF 
-DLIBRESSL_TESTS=OFF
'-DCMAKE_C_FLAGS= -m64  -fcf-protection=full -fno-unwind-tables 
-fno-asynchronous-unwind-tables -ffunction-sections -fdata-sections -fno-ident  -D_WIN32_WINNT=0x0600 
-D_UCRT -ffile-prefix-map=/home/runner/work/curl-for-win/curl-for-win/libressl= -Wno-attributes  -DNDEBUG 
-DS2N_BN_HIDE_SYMBOLS -m64  -Wl,-lucrt -specs=/home/runner/work/curl-for-win/curl-for-win/gcc-specs-ucrt 
-Wl,--gc-sections -static-libgcc'

Reduced:

cmake -B _bld -DCMAKE_SYSTEM_NAME=Windows -DCMAKE_BUILD_TYPE=Release 
-DCMAKE_INSTALL_PREFIX=/usr 
-DCMAKE_C_COMPILER=x86_64-w64-mingw32-gcc
-DCMAKE_C_COMPILER_TARGET=x86_64-w64-mingw32 
-DCMAKE_SYSTEM_PROCESSOR=x86_64-w64-mingw32
'-DCMAKE_C_FLAGS= -m64 -fcf-protection=full'

-Wl,-Xlink=-cetcompat is for llvm/clang only.