Closed vszakats closed 6 months ago
What's happening is that the CFLAGS
option makes the compiler define __CET__
, which makes LibreSSL #include cet.h
, which in turn sets the macro _CET_ENDBR
to endbr32
, and LibreSSL uses that macro now where it used a hard-coded endbr32
previously (in 3.8.1), in this case effectively ending up with the same code that caused the crash in 3.8.1. Then crash on initialization. This is also true for LibreSSL's openssl.exe
.
edit: correction: in 3.8.1 the CPU instruction causing the crash was endbr64
(not endbr32
).
Changing #if defined(__CET__)
to #if defined(__CET__) && !defined(__MINGW32__)
fixes it.
A better fix is to exclude the __CET__
logic from *mingw64*.S
ASM sources (15 files in total).
Regression from: https://github.com/libressl/openbsd/commit/c10c5b524e3121f42c9239d737dd7f975638f378
This patch over the LibreSSL release tarball 3.8.3 (should also apply to 3.9.0) fixes the problem: https://github.com/curl/curl-for-win/blob/acd332ff504bacb6be97331af70abc4d7318ef1b/libressl.patch
@vszakats
Is that here the reason that curl 8.6.0_5
is not working at all for me (Win x64)?
@Hrxn: Yes, it's because of this issue.
We should add this flag option to the CI tests as well. Thanks for the report.
I'm curious about one thing though; previous releases used a hard-coded endbr64, but it sounds like you're saying that mingw-w64 resolves _CET_ENDBR to endbr32?
@busterb: Yes, that's what's happening:
t.S:
#if defined(__CET__)
#include <cet.h>
#else
#define _CET_ENDBR
#endif
_CET_ENDBR
llvm-mingw 20240308 with LLVM 18.1.1:
$ x86_64-w64-mingw32-clang -E -fcf-protection=full t.S
# 1 "t.S"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 404 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "t.S" 2
# 1 "/path/to/llvm-mingw/lib/clang/18/include/cet.h" 1 3
# 44 "/path/to/llvm-mingw/lib/clang/18/include/cet.h" 3
.pushsection ".note.gnu.property", "a"
.p2align 2
.long 1f - 0f
.long 4f - 1f
.long 5
0:
.asciz "GNU"
1:
.p2align 2
.long 0xc0000002
.long 3f - 2f
2:
.long 3
3:
.p2align 2
4:
.popsection
# 3 "t.S" 2
endbr32
gcc:
$ x86_64-w64-mingw32-gcc -E -fcf-protection=full t.S
# 0 "t.S"
# 0 "<built-in>"
# 0 "<command-line>"
# 1 "t.S"
# 1 "/usr/local/Cellar/mingw-w64/11.0.1/toolchain-x86_64/lib/gcc/x86_64-w64-mingw32/13.2.0/include/cet.h" 1 3 4
# 3 "t.S" 2
endbr64
llvm cet.h
:
https://github.com/llvm/llvm-project/blob/llvmorg-18.1.1/clang/lib/Headers/cet.h#L15-L35
gcc cet.h
:
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/i386/cet.h;hb=releases/gcc-13.2.0#l33
Smells like that LP64 vs LLP64 again. LLVM conditions on #ifdef __LP64__
and I would expect it should be doing #if defined(__LP64__) || defined(__LLP64__)
instead. Not sure if there is an easy way of fixing this mess up in our compat cet.h
.
Oops, I missed that in 3.8.1 we had endbr64
, but endbr32
in 3.8.3. Yet, same crash with both.
Made a test and #if defined(__LP64__) || defined(__LLP64__)
doesn't fix llvm to emit endbr64
because the compiler defines neither of these symbols for mingw-w64 x64 targets.
Made another test with _CET_ENDBR
set to endbr64
by patching cet.h
. This would normally fail (see 3.8.1), but it worked due to crypto/CMakeLists.txt
having the line add_definitions(-Dendbr64=)
.
This was a little unexpected, but extending this hack could also zap the mis-emitted endbr32
:
--- a/crypto/CMakeLists.txt
+++ b/crypto/CMakeLists.txt
@@ -194,6 +194,7 @@
cpuid-mingw64-x86_64.S
)
add_definitions(-Dendbr64=)
+ add_definitions(-Dendbr32=) # zap opcode emitted by llvm cet.h
add_definitions(-DAES_ASM)
add_definitions(-DBSAES_ASM)
add_definitions(-DVPAES_ASM)
Will not win a beauty contest.
Did your patching of cet.h
also set __PROPERTY_ALIGN
to 3
?
Just in an isolated test, while trying to figure out why endbr64
is disappearing from the generated code.
Would that make a difference considering -Dendbr64=
?
As a blind shot I patched cet.h
to emit endbr64
and __PROPERTY_ALIGN=3
, and dropped -Dendbr64=
from the build script. The resulting binary (curl.exe
) also crashed right upon start with "unhandled page fault" (under WINE), but as a first time in my tests, this was followed by a stack dump.
So, at least this combination does not make ASM code work.
What's curious though is that all C generated code use this opcode no problem:
0000000000000000 <OPENSSL_init_crypto>:
0: f3 0f 1e fa endbr64
4: 56 push %rsi
5: 57 push %rdi
[...]
Then noticed that endbr64
is also pushed into the .ctors
section.
Deleting it, magically fixed the startup crash and makes curl.exe
run with no issues, with endbr64
in prologues:
--- a/crypto/CMakeLists.txt
+++ b/crypto/CMakeLists.txt
@@ -193,7 +193,6 @@
whrlpool/wp-mingw64-x86_64.S
cpuid-mingw64-x86_64.S
)
- add_definitions(-Dendbr64=)
add_definitions(-DAES_ASM)
add_definitions(-DBSAES_ASM)
add_definitions(-DVPAES_ASM)
--- a/crypto/cpuid-mingw64-x86_64.S
+++ b/crypto/cpuid-mingw64-x86_64.S
@@ -8,8 +8,6 @@
.section .ctors
-_CET_ENDBR
- .p2align 3
.quad OPENSSL_cpuid_setup
(works both with and without .p2align 3
there)
Then I repeated the build with a factory cet.h
and curl.exe
works just as good (with endbr32
and property align as it is.)
This suggests that the opcode present in the constructor stub might be the root cause.
It'd also be interesting to verify this in MASM code (at .CRT$XCU
), and reading up about reasons.
edit: IIRC .ctors
is supposed to be a list of pointers to constructor functions, not actual CPU code, same for MSVC I think. That explains the crash, happening when the startup code tries to call into the opcode as an address.
edit: test build and binaries with the above patch in place: https://github.com/curl/curl-for-win/actions/runs/8358541620/job/22880062249
edit: For the apparently unrelated CET issues endbr32
and __PROPERTY_ALIGN
on Windows, the local cet.h
wrapper would help indeed.
Opened PR to fix ENDBR slipping into init sections: https://github.com/libressl/openbsd/pull/149
With that fixed, this patch fixes the llvm cet.h
endbr32
mis-use:
--- a/crypto/CMakeLists.txt
+++ b/crypto/CMakeLists.txt
@@ -193,7 +193,7 @@
whrlpool/wp-mingw64-x86_64.S
cpuid-mingw64-x86_64.S
)
- add_definitions(-Dendbr64=)
+ add_definitions(-Dendbr32=endbr64)
add_definitions(-DAES_ASM)
add_definitions(-DBSAES_ASM)
add_definitions(-DVPAES_ASM)
Opened #1032 for that.
With both patches, CET will be enabled in ASM functions with MinGW.
I want to make sure we are testing for this case in the future. In my environment, configuring like so LDFLAGS="-Wl,-Xlink=-cetcompat" CFLAGS="-fcf-protection=full" ./configure --host=x86_64-w64-mingw32
yields this linker error:
configure:3851: x86_64-w64-mingw32-gcc -fcf-protection=full -Wl,-Xlink=-cetcompat conftest.c >&5
/usr/bin/x86_64-w64-mingw32-ld: Error: unable to disambiguate: -Xlink=-cetcompat (did you mean --Xlink=-cetcompat ?)
$ x86_64-w64-mingw32-gcc -v
Using built-in specs.
COLLECT_GCC=x86_64-w64-mingw32-gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-w64-mingw32/13-win32/lto-wrapper
Target: x86_64-w64-mingw32
Configured with: ../../src/configure --build=x86_64-linux-gnu --prefix=/usr --includedir='/usr/include' --mandir='/usr/share/man' --infodir='/usr/share/info' --sysconfdir=/etc --localstatedir=/var --disable-option-checking --disable-silent-rules --libdir='/usr/lib/x86_64-linux-gnu' --libexecdir='/usr/lib/x86_64-linux-gnu' --disable-maintainer-mode --disable-dependency-tracking --prefix=/usr --enable-shared --enable-static --disable-multilib --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --libdir=/usr/lib --enable-libstdcxx-time=yes --with-tune=generic --with-headers --enable-version-specific-runtime-libs --enable-fully-dynamic-string --enable-libgomp --enable-languages=c,c++,fortran,objc,obj-c++,ada --enable-lto --enable-threads=win32 --program-suffix=-win32 --program-prefix=x86_64-w64-mingw32- --target=x86_64-w64-mingw32 --with-as=/usr/bin/x86_64-w64-mingw32-as --with-ld=/usr/bin/x86_64-w64-mingw32-ld --enable-libatomic --enable-libstdcxx-filesystem-ts=yes --enable-dependency-tracking SED=/bin/sed
Thread model: win32
Supported LTO compression algorithms: zlib zstd
gcc version 13-win32 (GCC)
$ x86_64-w64-mingw32-ld -v
GNU ld (GNU Binutils) 2.42
Any more details on the build environment you can provide here?
Here's the live build command that triggers it, for llvm/clang: https://github.com/curl/curl-for-win/actions/runs/8448014961/job/23139331181#step:3:4367
Full command:
cmake -B _x64-win-ucrt-bld -DCMAKE_SYSTEM_NAME=Windows -Wno-dev
-DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_MESSAGE=NEVER
-DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_SYSROOT=/usr/x86_64-w64-mingw32
-DCMAKE_C_COMPILER=clang-17 -DCMAKE_C_COMPILER_TARGET=x86_64-w64-mingw32
-DCMAKE_SYSTEM_PROCESSOR=x86_64-w64-mingw32 -DBUILD_SHARED_LIBS=OFF -DLIBRESSL_APPS=OFF
-DLIBRESSL_TESTS=OFF
'-DCMAKE_C_FLAGS= -Wno-unused-command-line-argument -fcf-protection=full
-fno-unwind-tables -fno-asynchronous-unwind-tables -ffunction-sections -fdata-sections -D_WIN32_WINNT=0x0600
-D_UCRT -ffile-prefix-map=/home/runner/work/curl-for-win/curl-for-win/libressl= -Wa,--noexecstack -DNDEBUG
-DS2N_BN_HIDE_SYMBOLS -Wl,-lucrt -Wl,-Xlink=-cetcompat -Wl,--gc-sections -Wl,--icf=all
-L/usr/lib/gcc/x86_64-w64-mingw32/12-win32 -fuse-ld=lld-17 -Wl,-s -static-libgcc'
Without some things likely irrelevant for this:
cmake -B _bld -DCMAKE_SYSTEM_NAME=Windows -DCMAKE_BUILD_TYPE=Release
-DCMAKE_INSTALL_PREFIX=/usr
-DCMAKE_SYSROOT=/usr/x86_64-w64-mingw32
-DCMAKE_C_COMPILER=clang-17
-DCMAKE_C_COMPILER_TARGET=x86_64-w64-mingw32
-DCMAKE_SYSTEM_PROCESSOR=x86_64-w64-mingw32
'-DCMAKE_C_FLAGS= -fcf-protection=full -Wl,-Xlink=-cetcompat -Wl,--icf=all
-L/usr/lib/gcc/x86_64-w64-mingw32/12-win32 -fuse-ld=lld-17'
for gcc: https://github.com/curl/curl-for-win/actions/runs/8448014961/job/23139330995#step:3:4227
Full command:
cmake -B _x64-win-ucrt-bld -DCMAKE_SYSTEM_NAME=Windows -Wno-dev -DCMAKE_BUILD_TYPE=Release
-DCMAKE_INSTALL_MESSAGE=NEVER -DCMAKE_INSTALL_PREFIX=/usr
-DCMAKE_C_COMPILER=x86_64-w64-mingw32-gcc -DCMAKE_C_COMPILER_TARGET=x86_64-w64-mingw32
-DCMAKE_SYSTEM_PROCESSOR=x86_64-w64-mingw32 -DBUILD_SHARED_LIBS=OFF -DLIBRESSL_APPS=OFF
-DLIBRESSL_TESTS=OFF
'-DCMAKE_C_FLAGS= -m64 -fcf-protection=full -fno-unwind-tables
-fno-asynchronous-unwind-tables -ffunction-sections -fdata-sections -fno-ident -D_WIN32_WINNT=0x0600
-D_UCRT -ffile-prefix-map=/home/runner/work/curl-for-win/curl-for-win/libressl= -Wno-attributes -DNDEBUG
-DS2N_BN_HIDE_SYMBOLS -m64 -Wl,-lucrt -specs=/home/runner/work/curl-for-win/curl-for-win/gcc-specs-ucrt
-Wl,--gc-sections -static-libgcc'
Reduced:
cmake -B _bld -DCMAKE_SYSTEM_NAME=Windows -DCMAKE_BUILD_TYPE=Release
-DCMAKE_INSTALL_PREFIX=/usr
-DCMAKE_C_COMPILER=x86_64-w64-mingw32-gcc
-DCMAKE_C_COMPILER_TARGET=x86_64-w64-mingw32
-DCMAKE_SYSTEM_PROCESSOR=x86_64-w64-mingw32
'-DCMAKE_C_FLAGS= -m64 -fcf-protection=full'
-Wl,-Xlink=-cetcompat
is for llvm/clang only.
Issue #907 returned with 3.8.3 and 3.9.0.
Building with CET enabled with mingw-w64 work fine with 3.8.2:
CFLAGS=-fcf-protection=full
LDFLAGS=-Wl,-Xlink=-cetcompat
With 3.8.3, it's crashing the same way as 3.8.1 did.
curl.exe exhibiting the issue: https://curl.se/windows/dl-8.6.0_5/curl-8.6.0_5-win64-mingw.zip