wolfSSL / wolfssl

The wolfSSL library is a small, fast, portable implementation of TLS/SSL for embedded devices to the cloud. wolfSSL supports up to TLS 1.3 and DTLS 1.3!
https://www.wolfssl.com
GNU General Public License v2.0

[Bug]: wolfssl android build with LTO and clang crashed #6119

Closed calvin2021y closed 3 weeks ago

calvin2021y commented 1 year ago

Contact Details

No response

Version

v5.5.4-stable

Description

A build without LTO works fine; with LTO, the app crashes at startup.

Here is the crash log.

aarch64, android 10, android-ndk-r25c

Reproduction steps


configure --enable-shared=no --enable-harden --enable-filesystem=no --enable-pwdbased=no --enable-ip-alt-name --enable-sni --enable-alpn --enable-truncatedhmac --enable-earlydata --enable-tlsv10=no --enable-oldtls=yes --enable-tlsv12=yes --enable-tls13 --enable-rsa --enable-psk-one-id --enable-session-ticket --enable-savesession --enable-sessioncerts --enable-rng --enable-aescbc=yes --enable-aescfb=no --enable-aesccm=no --enable-aesctr=no --enable-aesctr=no --enable-maxfragment=yes --enable-blake2=no --enable-blake2s=no --enable-hkdf=no --enable-sys-ca-certs=no --enable-examples=no --enable-crypttests=no --enable-singlethreaded=no --enable-asynccrypt=no --enable-asyncthreads=no --enable-sha384 --enable-asm=yes --enable-sp=small,asm --enable-armasm --enable-bigcache --enable-curl --enable-curve25519=yes --enable-ed25519=yes --enable-crl=no --enable-ocsp --enable-ocspstapling --enable-ocspstapling2 --enable-hrrcookie=no --host=aarch64-linux-android

Relevant log output

Build fingerprint: 'samsung/beyond2qltezh/beyond2q:11/RP1A.200720.012/G9750ZHU5FUE2:user/release-keys'
#00 0x000000000018c74c /data/app/~~Vq9RoGneLeivKLzeL50H9A==/com.test-LLToSYwHbUmtJr12PaFEww==/lib/arm64/libtest.so
sp_256_mont_mul_4
wolfcrypt/src/sp_arm64.c:22418:5
#01 0x000000000018d4f4 /data/app/~~Vq9RoGneLeivKLzeL50H9A==/com.test-LLToSYwHbUmtJr12PaFEww==/lib/arm64/libtest.so
sp_256_proj_point_dbl_n_store_4
wolfcrypt/src/sp_arm64.c:24178:9
sp_256_ecc_mulmod_win_add_sub_4
wolfcrypt/src/sp_arm64.c:24460:9
sp_256_ecc_mulmod_4
wolfcrypt/src/sp_arm64.c:24994:12
#02 0x000000000018cfc0 /data/app/~~Vq9RoGneLeivKLzeL50H9A==/com.test-LLToSYwHbUmtJr12PaFEww==/lib/arm64/libtest.so
sp_ecc_secret_gen_256
wolfcrypt/src/sp_arm64.c:39975:19
#03 0x000000000016b470 /data/app/~~Vq9RoGneLeivKLzeL50H9A==/com.test-LLToSYwHbUmtJr12PaFEww==/lib/arm64/libtest.so
wc_ecc_shared_secret_gen_sync
wolfcrypt/src/ecc.c:4417:15
wc_ecc_shared_secret_ex
wolfcrypt/src/ecc.c:4750:23
#04 0x00000000001e9ef0 /data/app/~~Vq9RoGneLeivKLzeL50H9A==/com.test-LLToSYwHbUmtJr12PaFEww==/lib/arm64/libtest.so
wc_ecc_shared_secret
wolfcrypt/src/ecc.c:4358:10
EccSharedSecret
src/internal.c:5061:19
TLSX_KeyShare_ProcessEcc
src/tls.c:7847:15
TLSX_KeyShare_Process
src/tls.c:8085:15
#05 0x00000000001eff84 /data/app/~~Vq9RoGneLeivKLzeL50H9A==/com.test-LLToSYwHbUmtJr12PaFEww==/lib/arm64/libtest.so
TLSX_KeyShare_Parse
src/tls.c:8310:15
TLSX_Parse
src/tls.c:12421:23
#06 0x00000000001f3fac /data/app/~~Vq9RoGneLeivKLzeL50H9A==/com.test-LLToSYwHbUmtJr12PaFEww==/lib/arm64/libtest.so
DoTls13ServerHello
src/tls13.c:4552:15
#07 0x00000000001f6834 /data/app/~~Vq9RoGneLeivKLzeL50H9A==/com.test-LLToSYwHbUmtJr12PaFEww==/lib/arm64/libtest.so
DoTls13HandShakeMsgType
src/tls13.c:10414:15
#08 0x00000000001d1c48 /data/app/~~Vq9RoGneLeivKLzeL50H9A==/com.test-LLToSYwHbUmtJr12PaFEww==/lib/arm64/libtest.so
DoTls13HandShakeMsg
src/tls13.c:10718:15
ProcessReplyEx
src/internal.c:19714:31
#09 0x000000000015d560 /data/app/~~Vq9RoGneLeivKLzeL50H9A==/com.test-LLToSYwHbUmtJr12PaFEww==/lib/arm64/libtest.so
ProcessReply
src/internal.c:18991:12
wolfSSL_connect
src/ssl.c:12692:36

calvin2021y commented 1 year ago

Changing --enable-sp=small,asm to --enable-sp=small fixes the problem.

embhorn commented 1 year ago

Hi @calvin2021y

Thanks for sharing this report. LTO was probably causing some issues with the assembler optimizations. If the performance is suitable for you without the SP assembly, then I would suggest proceeding with that configuration.

Thanks, @embhorn - wolfSSL Support

calvin2021y commented 1 year ago

Hi @embhorn

Thanks for the explanation. I found one more problem with wolfSSL on Windows x86 with LTO.

--enable-shared=no --enable-harden --enable-filesystem=no --enable-pwdbased=no --enable-ip-alt-name --enable-sni --enable-alpn --enable-truncatedhmac --enable-earlydata --enable-tlsv10=no --enable-oldtls=yes --enable-tlsv12=yes --enable-tls13 --enable-rsa --enable-psk-one-id --enable-session-ticket --enable-savesession --enable-sessioncerts --enable-rng --enable-aescbc=yes --enable-aescfb=no --enable-aesccm=no --enable-aesctr=no --enable-aesctr=no --enable-maxfragment=yes --enable-blake2=no --enable-blake2s=no --enable-hkdf=no --enable-sys-ca-certs=no --enable-examples=no --enable-crypttests=no --enable-singlethreaded=no --enable-asynccrypt=no --enable-asyncthreads=no --enable-sha384 --enable-asm=no --enable-sp=small --enable-bigcache --enable-curl --enable-curve25519=yes --enable-ed25519=yes --enable-crl=no --enable-ocsp --enable-ocspstapling --enable-ocspstapling2 --enable-hrrcookie=no --host=i686-w64-mingw32

In this case I used --enable-asm=no --enable-sp=small. I also tried --enable-asm=yes --enable-sp=small and got the same results.

(lldb) bt
* thread #1, stop reason = Exception 0xc0000005 encountered at address 0x7568fb: Access violation reading location 0xffffffff
  * frame #0: 0x007568fb test.exe`SetKeysSide [inlined] wc_AesInit(aes=0x305db178, heap=<unavailable>, devId=<unavailable>) at aes.c:10692:17
    frame #1: 0x007568e5 test.exe`SetKeysSide at keys.c:2484:17
    frame #2: 0x00755ff1 test.exe`SetKeysSide(ssl=0x305c0d78, side=<unavailable>) at keys.c:2983:15
    frame #3: 0x006fbb9e test.exe`ProcessReplyEx(ssl=<unavailable>, allowSocketErr=0) at internal.c:19872:32
    frame #4: 0x00683021 test.exe`wolfSSL_connect [inlined] ProcessReply(ssl=0x305c0d78) at internal.c:18991:12
    frame #5: 0x0068301a test.exe`wolfSSL_connect(ssl=0x305c0d78) at ssl.c:12890:36
    frame #6: 0x00726c4c test.exe`wolfssl_connect_common [inlined] wolfssl_connect_step2(cf=<unavailable>, data=<unavailable>) at wolfssl.c:711:9
    frame #7: 0x00726afb test.exe`wolfssl_connect_common(cf=<unavailable>, data=<unavailable>, nonblocking=<unavailable>, done=0x2e0fefab) at wolfssl.c:1196:14
    frame #8: 0x00692aa3 test.exe`ssl_cf_connect [inlined] wolfssl_connect_nonblocking(cf=<unavailable>, data=<unavailable>, done=0x2e0fefab) at wolfssl.c:1228:10
    frame #9: 0x00692a97 test.exe`ssl_cf_connect [inlined] ssl_connect_nonblocking(cf=<unavailable>, data=<unavailable>, done=0x2e0fefab) at vtls.c:348:10
    frame #10: 0x00692a89 test.exe`ssl_cf_connect(cf=<unavailable>, data=<unavailable>, blocking=<unavailable>, done=<unavailable>) at vtls.c:1534:14
    frame #11: 0x006a1aad test.exe`cf_setup_connect [inlined] Curl_conn_cf_connect(cf=<unavailable>, data=<unavailable>, blocking=<unavailable>, done=<unavailable>) at cfilters.c:307:12
    frame #12: 0x006a1a9c test.exe`cf_setup_connect(cf=<unavailable>, data=<unavailable>, blocking=<unavailable>, done=<unavailable>) at connect.c:1164:14
    frame #13: 0x007339e8 test.exe`cf_hc_connect [inlined] Curl_conn_cf_connect(cf=<unavailable>, data=<unavailable>, blocking=false, done=<unavailable>) at cfilters.c:307:12
    frame #14: 0x007339da test.exe`cf_hc_connect [inlined] cf_hc_baller_connect(b=<unavailable>, cf=0x30565158, data=<unavailable>, done=<unavailable>) at cf-http.c:135:15
    frame #15: 0x007339d4 test.exe`cf_hc_connect(cf=0x30565158, data=0x30756c70, blocking=<unavailable>, done=0x2e0fefab) at cf-http.c:288:16
    frame #16: 0x006a0ae2 test.exe`Curl_conn_connect(data=0x30756c70, sockindex=0, blocking=false, done=0x2e0fefab) at cfilters.c:370:14
    frame #17: 0x00677bf7 test.exe`multi_runsingle(multi=<unavailable>, nowp=<unavailable>, data=0x30756c70) at multi.c:2094:16
    frame #18: 0x0067dbf0 test.exe`curl_multi_socket_action [inlined] multi_socket(multi=<unavailable>, checkall=false, s=<unavailable>, ev_bitmask=<unavailable>, running_handles=<unavailable>) at multi.c:3219:16
    frame #19: 0x0067da70 test.exe`curl_multi_socket_action(multi=<unavailable>, s=<unavailable>, ev_bitmask=1, running_handles=0x2e0ff284) at multi.c:3340:12
    frame #20: 0x005d28cf test.exe`start1 + 63
    frame #21: 0x0066cb55 test.exe`uv__process_reqs at poll.c:190:7
    frame #22: 0x0066cabb test.exe`uv__process_reqs [inlined] uv__process_poll_req(loop=<unavailable>, handle=0x305ab820, req=<unavailable>) at poll.c:531:5
    frame #23: 0x0066cabb test.exe`uv__process_reqs(loop=<unavailable>) at req-inl.h:197:9
    frame #24: 0x0066b4fe test.exe`uv_run(loop=<unavailable>, mode=UV_RUN_DEFAULT) at core.c:623:7
    frame #25: 0x005d54e9 test.exe`start2 + 1641
    frame #26: 0x005d4936 test.exe`_main + 438
    frame #27: 0x005b1393 test.exe`__tmainCRTStartup at crtexe.c:329:15
    frame #28: 0x769700f9 kernel32.dll`BaseThreadInitThunk + 25
    frame #29: 0x77457bbe ntdll.dll`RtlGetAppContainerNamedObjectPath + 286
    frame #30: 0x77457b8e ntdll.dll`RtlGetAppContainerNamedObjectPath + 238

A build with OpenSSL + asm + LTO for Windows x86 works without problems.

x64 works with either OpenSSL or wolfSSL.

calvin2021y commented 1 year ago

There is one more problem on Android x86, using the same build options as for Android aarch64.

curl reports:

ALPN: offers h2,http/1.1
SSL_connect failed with error -155: ASN sig error, confirm failure
Closing connection 11

Adding --enable-smallstack to the build gives the same error.

Android x64, aarch64, and arm32 have no problem.

embhorn commented 1 year ago

Hi @SparkiDev

Could you please review this issue?

SparkiDev commented 1 year ago

Hi @calvin2021y

Can you please confirm that the access violation in wc_AesInit() is at the line aes.c:10692:

    aes->aadLen = 0;

If it is, then this is very strange as the variable aes is being checked for NULL and is being accessed before this! It does say the code is inlined so there may be something funny that the compiler is doing. You may want to try lowering the optimization level if possible.

Looking back at the ARM64 crash, again it is an odd place to have an issue. The only likely reason for an issue is alignment. Could you change the lines in sp_arm64.c: 21722-21732 to:

/* Point structure to use. */
typedef struct sp_point_256 {
    /* X ordinate of point. */
    ALIGN16 sp_digit x[2 * 4];
    /* Y ordinate of point. */
    ALIGN16 sp_digit y[2 * 4];
    /* Z ordinate of point. */
    ALIGN16 sp_digit z[2 * 4];
    /* Indicates point is at infinity. */
    int infinity;
} sp_point_256;

This should not help, but it might strongly indicate to the compiler that the fields need to be aligned. If you could confirm exactly which instruction is causing the crash, it would be much appreciated.

Sean
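
The alignment suggestion above can be illustrated in isolation. The following is a minimal sketch, not wolfSSL code: `DEMO_ALIGN16`, `demo_digit`, and the `demo_point_*` types are hypothetical stand-ins (the real `ALIGN16` macro and `sp_digit` type are defined by wolfSSL, and the 64-bit digit width is an assumption for AArch64). It uses C11 `alignas` to show what a per-field 16-byte alignment request does to the layout.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for wolfSSL's ALIGN16 macro, spelled with C11 alignas. */
#define DEMO_ALIGN16 alignas(16)

/* Assumption: digits are 64-bit words, as on AArch64. */
typedef uint64_t demo_digit;

/* Layout without an explicit request: fields get the natural alignment of demo_digit. */
typedef struct demo_point_plain {
    demo_digit x[2 * 4];
    demo_digit y[2 * 4];
    demo_digit z[2 * 4];
    int infinity;
} demo_point_plain;

/* Layout with an explicit 16-byte alignment request on each coordinate array. */
typedef struct demo_point_aligned {
    DEMO_ALIGN16 demo_digit x[2 * 4];
    DEMO_ALIGN16 demo_digit y[2 * 4];
    DEMO_ALIGN16 demo_digit z[2 * 4];
    int infinity;
} demo_point_aligned;

/* Compile-time check: the request raises the alignment of the whole struct to 16. */
static_assert(alignof(demo_point_aligned) == 16, "aligned point must be 16-byte aligned");

/* Returns 1 when every coordinate array of p starts on a 16-byte boundary. */
int demo_point_is_16_aligned(const demo_point_aligned *p)
{
    return ((uintptr_t)p->x % 16 == 0) &&
           ((uintptr_t)p->y % 16 == 0) &&
           ((uintptr_t)p->z % 16 == 0);
}

/* Alignment the compiler requires for each struct variant. */
size_t demo_aligned_struct_alignment(void) { return alignof(demo_point_aligned); }
size_t demo_plain_struct_alignment(void) { return alignof(demo_point_plain); }
```

In this sketch each array is 64 bytes long, so the field offsets are multiples of 16 in both variants. The request mainly raises the alignment of the struct as a whole, which is consistent with the remark that the change should not strictly be needed.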

calvin2021y commented 1 year ago

Thanks for the explanation. For Windows x86, it is the line: aes->aadLen = 0;

Now I will test with this patch:

diff --git a/wolfcrypt/src/aes.c b/wolfcrypt/src/aes.c
index 13e8fbbf6..59a9a8606 100644
--- a/wolfcrypt/src/aes.c
+++ b/wolfcrypt/src/aes.c
@@ -10689,6 +10689,7 @@ int wc_AesInit(Aes* aes, void* heap, int devId)
 #ifdef HAVE_AESGCM
 #ifdef OPENSSL_EXTRA
     XMEMSET(aes->aadH, 0, sizeof(aes->aadH));
+    fprintf( stderr, "aes=%p before aadLen\n", aes);
     aes->aadLen = 0;
 #endif
 #endif
diff --git a/wolfcrypt/src/sp_arm64.c b/wolfcrypt/src/sp_arm64.c
index 9db756ac9..b422c6a6a 100644
--- a/wolfcrypt/src/sp_arm64.c
+++ b/wolfcrypt/src/sp_arm64.c
@@ -21722,11 +21722,11 @@ int sp_DhExp_4096(const mp_int* base, const byte* exp, word32 expLen,
 /* Point structure to use. */
 typedef struct sp_point_256 {
     /* X ordinate of point. */
-    sp_digit x[2 * 4];
+    ALIGN16 sp_digit x[2 * 4];
     /* Y ordinate of point. */
-    sp_digit y[2 * 4];
+    ALIGN16 sp_digit y[2 * 4];
     /* Z ordinate of point. */
-    sp_digit z[2 * 4];
+    ALIGN16 sp_digit z[2 * 4];
     /* Indicates point is at infinity. */
     int infinity;
 } sp_point_256;

calvin2021y commented 1 year ago

With this patch, on aarch64, with --enable-sp=small,asm

Build fingerprint: 'samsung/beyond2qltezh/beyond2q:11/RP1A.200720.012/G9750ZHU5FUE2:user/release-keys'
#00 0x000000000018c04c /data/app/~~KTk0gKq8ur1nMdN21wGDCQ==/com.test-BAnij8kkuaLOwIea5uCL8Q==/lib/arm64/libtest.so
sp_256_mont_sqr_4
wolfcrypt/src/sp_arm64.c:22603:5
#01 0x000000000018ce5c /data/app/~~KTk0gKq8ur1nMdN21wGDCQ==/com.test-BAnij8kkuaLOwIea5uCL8Q==/lib/arm64/libtest.so
sp_256_proj_point_dbl_n_store_4
wolfcrypt/src/sp_arm64.c:24173:9
sp_256_ecc_mulmod_win_add_sub_4
wolfcrypt/src/sp_arm64.c:24460:9
sp_256_ecc_mulmod_4
wolfcrypt/src/sp_arm64.c:24994:12
#02 0x000000000018caec /data/app/~~KTk0gKq8ur1nMdN21wGDCQ==/com.test-BAnij8kkuaLOwIea5uCL8Q==/lib/arm64/libtest.so
sp_ecc_secret_gen_256
wolfcrypt/src/sp_arm64.c:39975:19
#03 0x000000000016b0c8 /data/app/~~KTk0gKq8ur1nMdN21wGDCQ==/com.test-BAnij8kkuaLOwIea5uCL8Q==/lib/arm64/libtest.so
wc_ecc_shared_secret_gen_sync
wolfcrypt/src/ecc.c:4417:15
wc_ecc_shared_secret_ex
wolfcrypt/src/ecc.c:4750:23
#04 0x00000000001e91fc /data/app/~~KTk0gKq8ur1nMdN21wGDCQ==/com.test-BAnij8kkuaLOwIea5uCL8Q==/lib/arm64/libtest.so
wc_ecc_shared_secret
wolfcrypt/src/ecc.c:4358:10
EccSharedSecret
src/internal.c:5061:19
TLSX_KeyShare_ProcessEcc
src/tls.c:7847:15
TLSX_KeyShare_Process
src/tls.c:8085:15

calvin2021y commented 1 year ago

For Windows x86 with LTO, the reported error location is not correct. I tested with this patch:

diff --git a/wolfcrypt/src/aes.c b/wolfcrypt/src/aes.c
index 13e8fbbf6..a5f1b6af6 100644
--- a/wolfcrypt/src/aes.c
+++ b/wolfcrypt/src/aes.c
@@ -10689,7 +10689,9 @@ int wc_AesInit(Aes* aes, void* heap, int devId)
 #ifdef HAVE_AESGCM
 #ifdef OPENSSL_EXTRA
     XMEMSET(aes->aadH, 0, sizeof(aes->aadH));
+    fprintf( stderr, "aes=%p, %d before aadLen\n", aes, aes->aadLen);
     aes->aadLen = 0;
+       fprintf( stderr, "aes=%p, %d after aadLen\n", aes, aes->aadLen);
 #endif
 #endif

It prints both the "before" and the "after" line. I used -O1 with the LTO build. I cannot use -O0 because it triggers a lot of link errors.

calvin2021y commented 1 year ago

The app is built with curl, and all cases work fine with OpenSSL but not with wolfSSL.

SparkiDev commented 1 year ago

Hi @calvin2021y

Please make sure that you are linking against a wolfSSL library built with the same options that are used when building the application and/or library that uses wolfSSL.

I have a new patch. Remove the previous changes and apply this one. Let me know what is printed just before the crash.

Thanks, Sean

issue_6119_patch_1.txt

calvin2021y commented 1 year ago

Thanks for the update. I cannot apply this patch to the stable release:

error: patch failed: wolfcrypt/src/sp_arm64.c:22597
error: wolfcrypt/src/sp_arm64.c: patch does not apply

Then I tried applying it to the master branch. There is no longer any crash for aarch64 + LTO + asm:

02-28 10:18:19.450   780   780 W test : a = 0x7fc74737f8
02-28 10:18:19.450   780   780 W test : b = 0x7fc74737b8
02-28 10:18:19.450   780   780 W test : r = 0x7fc74737b8
02-28 10:18:19.450   780   780 W test : a[0] =
02-28 10:18:19.450   780   780 W test : 1e68ae2dcd835f0c
02-28 10:18:19.450   780   780 W test : a[1] = 5c05abe04a7919cf
02-28 10:18:19.450   780   780 W test : a[2] = 491e2364cf0dfd76
02-28 10:18:19.450   780   780 W test : a[3] = acd15f93b1b19042
02-28 10:18:19.450   780   780 W test : b[0] = b81570316036225a
02-28 10:18:19.450   780   780 W test : b[1] = 9e9bad851efcf383
02-28 10:18:19.450   780   780 W test : b[2] = bb3921f032d658c2
02-28 10:18:19.450   780   780 W test : b[3] = bcc78d21e0763dcc
02-28 10:18:19.450   780   780 W test : r[0] = c50f6b0e275a092b
02-28 10:18:19.450   780   780 W test : r[1] = f09fe328dd16ea6a
02-28 10:18:19.450   780   780 W test : r[2] = df516e31bb6733a7
02-28 10:18:19.450   780   780 W test : r[3] = 90bd3686a7c535f7
02-28 10:18:19.450   780   780 W test : r[4] = 0000000000000000
02-28 10:18:19.450   780   780 W test : r[5] = 0000000000000000
02-28 10:18:19.450   780   780 W test : r[6] = 0000000000000000
02-28 10:18:19.450   780   780 W test : r[7] = 0000000000000000
02-28 10:18:19.450   780   780 W test : a = 0x7fc7475300
02-28 10:18:19.450   780   780 W test : b = 0x7fc74737f8
02-28 10:18:19.450   780   780 W test : r = 0x7fc7475520
02-28 10:18:19.450   780   780 W test : a[0] = 8b0fd96ac1e9149d
02-28 10:18:19.451   780   780 W test : a[1] = 0ce58e5b082fe052
02-28 10:18:19.451   780   780 W test : a[2] = 01d12adcf9c79d40
02-28 10:18:19.451   780   780 W test : a[3] = 795ec522739e8b09
02-28 10:18:19.451   780   780 W test : b[0] = 1e68ae2dcd835f0c
02-28 10:18:19.451   780   780 W test : b[1] = 5c05abe04a7919cf
02-28 10:18:19.451   780   780 W test : b[2] = 491e2364cf0dfd76
02-28 10:18:19.451   780   780 W test : b[3] = acd15f93b1b19042
02-28 10:18:19.451   780   780 W test : r[0] = 464517f09efda264
02-28 10:18:19.451   780   780 W test : r[1] = 21384c51e696b2f3
02-28 10:18:19.451   780   780 W test : r[2] = b1bdbdcdff344fc9
02-28 10:18:19.451   780   780 W test : r[3] = ac889c0d1f7715a2
02-28 10:18:19.451   780   780 W test : r[4] = 0000000000000000
02-28 10:18:19.451   780   780 W test : r[5] = 0000000000000000
02-28 10:18:19.451   780   780 W test : r[6] = 0000000000000000
02-28 10:18:19.451   780   780 W test : r[7] = 0000000000000000
02-28 10:18:19.451   780   780 W test : a = 0x7fc7475340
02-28 10:18:19.451   780   780 W test : b = 0x7fc74737b8
02-28 10:18:19.451   780   780 W test : r = 0x7fc7475560
02-28 10:18:19.451   780   780 W test : a[0] = baa88d11afa6af5f
02-28 10:18:19.451   780   780 W test : a[1] = 832a675a8cefc88f
02-28 10:18:19.451   780   780 W test : a[2] = 118c2d6d9968402d
02-28 10:18:19.451   780   780 W test : a[3] = f115a17c85e7af74
02-28 10:18:19.451   780   780 W test : b[0] = c50f6b0e275a092b
02-28 10:18:19.451   780   780 W test : b[1] = f09fe328dd16ea6a
02-28 10:18:19.451   780   780 W test : b[2] = df516e31bb6733a7
02-28 10:18:19.451   780   780 W test : b[3] = 90bd3686a7c535f7
02-28 10:18:19.451   780   780 W test : r[0] = 07ce4559a035a60e
02-28 10:18:19.451   780   780 W test : r[1] = 2d04b6a699c79b94
02-28 10:18:19.451   780   780 W test : r[2] = 9870ea1663376732
02-28 10:18:19.451   780   780 W test : r[3] = 962537a75401679a
02-28 10:18:19.451   780   780 W test : r[4] = 0000000000000000
02-28 10:18:19.451   780   780 W test : r[5] = 0000000000000000
02-28 10:18:19.451   780   780 W test : r[6] = 0000000000000000
02-28 10:18:19.451   780   780 W test : r[7] = 0000000000000000
02-28 10:18:19.451   780   780 W test : dec = 0xb400007262667c58
02-28 10:18:19.451   780   780 W test : dec->aes = 0xb4000071f2662d00
02-28 10:18:19.451   780   780 W test : dec->aes->aadLen = 0
02-28 10:18:19.451   780   780 W test : dec = 0xb400007262667c58
02-28 10:18:19.451   780   780 W test : dec->aes = 0xb4000071f2662d00
02-28 10:18:19.451   780   780 W test : dec->aes->aadLen = 0

I am sure I linked against the newly built static library, using an absolute path, with --enable-sp=small,asm enabled.

SparkiDev commented 1 year ago

Hi @calvin2021y,

There have been fixes to the sp_arm64.c file since 5.5.4. Please use the latest version of sp_arm64.c if possible.

Could you apply the patch to the keys.c on Windows x86?

Thanks, Sean

calvin2021y commented 1 year ago

With your patch, on master, Windows x86, clang + LTO:

dec = 3016906c
dec->aes = 30167868
dec->aes->aadLen = 0

* thread #1, stop reason = Exception 0xc0000005 encountered at address 0x753977: Access violation reading location 0xffffffff
    frame #0: 0x00753977 test.exe`SetKeysSide [inlined] wc_AesInit(aes=0x30167868, heap=<unavailable>, devId=<unavailable>) at aes.c:10692:17

  * frame #0: 0x00753977 test.exe`SetKeysSide [inlined] wc_AesInit(aes=0x30167868, heap=<unavailable>, devId=<unavailable>) at aes.c:10692:17
    frame #1: 0x00753961 test.exe`SetKeysSide at keys.c:2487:17
    frame #2: 0x00753011 test.exe`SetKeysSide(ssl=0x30168fa0, side=<unavailable>) at keys.c:2986:15
    frame #3: 0x0070b6cb test.exe`DoTls13HandShakeMsgType(ssl=0x30168fa0, input=<unavailable>, inOutIdx=<unavailable>, type=<unavailable>, size=<unavailable>, totalSz=<unavailable>) at tls13.c:0
    frame #4: 0x006f6006 test.exe`ProcessReplyEx [inlined] DoTls13HandShakeMsg(ssl=<unavailable>, input=<unavailable>, inOutIdx=<unavailable>, totalSz=<unavailable>) at tls13.c:0
    frame #5: 0x006f5fed test.exe`ProcessReplyEx(ssl=<unavailable>, allowSocketErr=0) at internal.c:19828:31
    frame #6: 0x005e756d test.exe`wolfSSL_connect [inlined] ProcessReply(ssl=0x30168fa0) at internal.c:19095:12
    frame #7: 0x005e7566 test.exe`wolfSSL_connect(ssl=0x30168fa0) at ssl.c:13334:36

SparkiDev commented 1 year ago

Hi @calvin2021y,

Can you please tell me what the line at aes.c:10692:17 looks like for you? Looking at v5.5.4-stable, the line is:

    aes->aadLen = 0;

Also, keys.c:2487 doesn't line up with anything in v5.5.4-stable.

Sean

calvin2021y commented 1 year ago

The test was on the master branch. I am sorry, I don't know the commit ID of that build. When I tried again today, there was an error:

 error: Error including wolfssl/openssl/*.h. Possible circular dependency introduced or missing include.

I will try to reproduce this again.

calvin2021y commented 1 year ago

I tried commit 98c1b152a on the master branch for Windows x86 + LTO.

Nothing is printed.

calvin2021y commented 1 year ago

With this patch:

diff --git a/wolfcrypt/src/aes.c b/wolfcrypt/src/aes.c
index 63928bafe..e46f66f5e 100644
--- a/wolfcrypt/src/aes.c
+++ b/wolfcrypt/src/aes.c
@@ -10688,8 +10688,11 @@ int wc_AesInit(Aes* aes, void* heap, int devId)

 #ifdef HAVE_AESGCM
 #ifdef OPENSSL_EXTRA
+    fprintf( stderr, "aes=%p\n", aes);
     XMEMSET(aes->aadH, 0, sizeof(aes->aadH));
+       fprintf( stderr, "aes=%p, %d before aadLen\n", aes, aes->aadLen);
     aes->aadLen = 0;
+       fprintf( stderr, "aes=%p, %d after aadLen\n", aes, aes->aadLen);
 #endif

It only prints aes=3047b408

SparkiDev commented 1 year ago

Hi @calvin2021y

It appears that the memset of aes->aadH is the issue.

Can you print out the address of aes->aadH before the XMEMSET? The Aes structure has aadH declared:

#ifdef OPENSSL_EXTRA
    word32 aadH[4];
    word32 aadLen;
#endif

The only way I can think of that the field aadH could not be accessible is if one file was compiled up with OPENSSL_EXTRA defined and another without.

Thanks, Sean
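
To illustrate the mismatch described above, here is a minimal sketch of how a conditionally compiled field changes a structure's size. The two `demo_aes_*` types are hypothetical, much-simplified stand-ins and do not reflect the real `Aes` layout. They model the same structure as seen by one translation unit built with the optional fields and by another built without them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified views of one structure; NOT the real Aes layout. */

/* View of a translation unit compiled WITH the optional fields. */
typedef struct demo_aes_with_extra {
    uint32_t key[60];
    uint32_t rounds;
    uint32_t aadH[4];   /* only present when the optional feature is enabled */
    uint32_t aadLen;    /* only present when the optional feature is enabled */
} demo_aes_with_extra;

/* View of a translation unit compiled WITHOUT the optional fields. */
typedef struct demo_aes_without_extra {
    uint32_t key[60];
    uint32_t rounds;
} demo_aes_without_extra;

/* The two views must disagree on size for the mismatch to matter. */
static_assert(sizeof(demo_aes_with_extra) > sizeof(demo_aes_without_extra),
              "optional fields must enlarge the structure");

/* Bytes that code using the larger view touches beyond an object
 * that was allocated using the smaller view. */
size_t demo_layout_overrun_bytes(void)
{
    return sizeof(demo_aes_with_extra) - sizeof(demo_aes_without_extra);
}

/* Returns 1 if aadH, as located by the larger view, starts at or past
 * the end of an object allocated with the smaller view. */
int demo_aadH_is_out_of_bounds(void)
{
    return offsetof(demo_aes_with_extra, aadH) >= sizeof(demo_aes_without_extra);
}
```

In this sketch, clearing `aadH` and `aadLen` through the larger view would write 20 bytes past an object allocated with the smaller view. That is the kind of out-of-bounds access a mixed build could produce, and it may or may not crash depending on what lies behind the object.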

calvin2021y commented 1 year ago

This is the output:

aes=30553980, 16, 30553ac0
aes=30553d08, 16, 30553e48

It was produced with this patch:

diff --git a/wolfcrypt/src/aes.c b/wolfcrypt/src/aes.c
index 63928bafe..341d6e765 100644
--- a/wolfcrypt/src/aes.c
+++ b/wolfcrypt/src/aes.c
@@ -10688,6 +10688,8 @@ int wc_AesInit(Aes* aes, void* heap, int devId)

 #ifdef HAVE_AESGCM
 #ifdef OPENSSL_EXTRA
+    fprintf(stderr, "aes=%p, %zu, %p\n", aes, sizeof(aes->aadH), aes->aadH);
+       fflush(stderr);
     XMEMSET(aes->aadH, 0, sizeof(aes->aadH));
     aes->aadLen = 0;
 #endif

I guess that if the problem came from OPENSSL_EXTRA, the build without LTO should have the same problem? (It does not crash, but it gives an ASN sig error for https://1.1.1.1.)

SparkiDev commented 1 year ago

Hi @calvin2021y

Please put the print out into keys.c, instead of aes.c, before the call that crashed and let me know what happens.

Thanks, Sean

calvin2021y commented 1 year ago

aes123456=304b8a98, 304b8bd8

diff --git a/src/keys.c b/src/keys.c
index f9da104e0..6f041c7b0 100644
--- a/src/keys.c
+++ b/src/keys.c
@@ -2475,6 +2475,7 @@ static int SetKeys(Ciphers* enc, Ciphers* dec, Keys* keys, CipherSpecs* specs,
         }

         if (enc) {
+                       fprintf(stderr, "aes123456=%p, %p\n", enc->aes, enc->aes->aadH);
             if (wc_AesInit(enc->aes, heap, devId) != 0) {
                 WOLFSSL_MSG("AesInit failed in SetKeys");
                 return ASYNC_INIT_E;

SparkiDev commented 1 year ago

Did the code crash this time? If not, then it appears to be a compiler bug. You may have to reduce the optimization level until the code works.

Sean

calvin2021y commented 1 year ago

It crashed as before.
I used -O1 as the minimum, since -O0 gives link errors.

calvin2021y commented 1 year ago

If this is a compiler bug, it could be related to https://github.com/mstorsjo/llvm-mingw/issues/326

SparkiDev commented 1 year ago

Hi @calvin2021y,

The compiler appears to be unstable from what I've seen here.

What are the link errors when -O0?

Sean

calvin2021y commented 1 year ago

The link errors with -O0 are related to some interface symbols from my code that are not generated. (I don't think that is related to this problem.)

It could be a clang bug. I use this clang to compile a lot of projects, and few of them have problems (sqlite, openssl, curl, and libuv, all projects with a big user base, work fine).

Please let me know if there are any more tests I can do to rule out a wolfSSL bug or to confirm a clang compiler bug.

SparkiDev commented 1 year ago

Hi @calvin2021y

Please try -fno-inline with clang and let us know if that makes a difference. I believe that will stop inlining of functions like the wc_AesInit(). I've asked other developers in wolfSSL about this and they think this is likely to be a compiler bug.

Sean
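
A narrower variant of the -fno-inline experiment is to mark only the suspect function as non-inlinable. The sketch below assumes a GCC- or Clang-compatible compiler, both of which accept `__attribute__((noinline))`. `DEMO_NOINLINE`, `demo_ctx`, and the `demo_ctx_*` functions are hypothetical names for illustration, not part of wolfSSL.

```c
#include <assert.h>
#include <stddef.h>

/* Per-function inlining barrier; expands to nothing on other compilers. */
#if defined(__GNUC__) || defined(__clang__)
#define DEMO_NOINLINE __attribute__((noinline))
#else
#define DEMO_NOINLINE
#endif

/* Hypothetical context with the same shape as the fields under suspicion. */
typedef struct demo_ctx {
    unsigned int h[4];
    unsigned int len;
} demo_ctx;

/* Initializer kept out of line, so a crash inside it shows up under
 * its own frame instead of being folded into the caller. */
DEMO_NOINLINE int demo_ctx_init(demo_ctx *ctx)
{
    size_t i;

    if (ctx == NULL)
        return -1;
    for (i = 0; i < sizeof(ctx->h) / sizeof(ctx->h[0]); i++)
        ctx->h[i] = 0;
    ctx->len = 0;
    return 0;
}

/* Returns 1 when every field of ctx is zero. ctx must not be NULL. */
int demo_ctx_is_clear(const demo_ctx *ctx)
{
    assert(ctx != NULL);
    return ctx->h[0] == 0 && ctx->h[1] == 0 && ctx->h[2] == 0 &&
           ctx->h[3] == 0 && ctx->len == 0;
}
```

Unlike the global flag, the attribute leaves the rest of the program's optimization unchanged, which can help tell whether inlining of one particular function is involved.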

calvin2021y commented 1 year ago

After a fresh rebuild with -fno-inline for both wolfSSL and the app, I get the same error.

SparkiDev commented 1 year ago

Hi @calvin2021y

I'm really frustrated by this one. Can I ask you to try one more thing?

Your previous patch was:

diff --git a/wolfcrypt/src/aes.c b/wolfcrypt/src/aes.c
index 63928bafe..341d6e765 100644
--- a/wolfcrypt/src/aes.c
+++ b/wolfcrypt/src/aes.c
@@ -10688,6 +10688,8 @@ int wc_AesInit(Aes* aes, void* heap, int devId)

 #ifdef HAVE_AESGCM
 #ifdef OPENSSL_EXTRA
+    fprintf(stderr, "aes=%p, %zu, %p\n", aes, sizeof(aes->aadH), aes->aadH);
+       fflush(stderr);
     XMEMSET(aes->aadH, 0, sizeof(aes->aadH));
     aes->aadLen = 0;
 #endif

Can you add the fprintf and flush calls after the XMEMSET too? I would like to see if the XMEMSET is changing anything.

Thanks, Sean
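
The request above amounts to bracketing the suspect call with flushed prints, so that the last line seen before the crash shows whether the call itself returned. A minimal standalone sketch of that technique follows. `demo_traced_zero` is a hypothetical helper, and plain `memset` stands in for `XMEMSET` (assuming the default mapping of that macro).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: zero a buffer and log to stderr immediately before
 * and after the memset. Each message is flushed so that it is not lost
 * if the process dies inside the call. Returns 0 on success, -1 if buf
 * is NULL. label must not be NULL. */
int demo_traced_zero(void *buf, size_t len, const char *label)
{
    assert(label != NULL);
    if (buf == NULL)
        return -1;

    fprintf(stderr, "%s: before memset, buf=%p, len=%zu\n", label, buf, len);
    fflush(stderr);

    memset(buf, 0, len);

    fprintf(stderr, "%s: after memset, buf=%p\n", label, buf);
    fflush(stderr);
    return 0;
}
```

If only the "before" line appears, the fault is inside the bracketed call or in code the compiler moved into that region. If both lines appear, the fault lies later.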

calvin2021y commented 1 year ago
diff --git a/src/keys.c b/src/keys.c
index f9da104e0..6f041c7b0 100644
--- a/src/keys.c
+++ b/src/keys.c
@@ -2475,6 +2475,7 @@ static int SetKeys(Ciphers* enc, Ciphers* dec, Keys* keys, CipherSpecs* specs,
         }

         if (enc) {
+                       fprintf(stderr, "aes123456=%p, %p\n", enc->aes, enc->aes->aadH);
             if (wc_AesInit(enc->aes, heap, devId) != 0) {
                 WOLFSSL_MSG("AesInit failed in SetKeys");
                 return ASYNC_INIT_E;

diff --git a/wolfcrypt/src/aes.c b/wolfcrypt/src/aes.c
index 63928bafe..4313e8e3a 100644
--- a/wolfcrypt/src/aes.c
+++ b/wolfcrypt/src/aes.c
@@ -10688,7 +10688,11 @@ int wc_AesInit(Aes* aes, void* heap, int devId)

 #ifdef HAVE_AESGCM
 #ifdef OPENSSL_EXTRA
+    fprintf(stderr, "AES1=%p, %zu, %p\n", aes, sizeof(aes->aadH), aes->aadH);
+    fflush(stderr);
     XMEMSET(aes->aadH, 0, sizeof(aes->aadH));
+       fprintf(stderr, "AES2=%p\n", aes);
+       fflush(stderr);
     aes->aadLen = 0;
 #endif
 #endif

It seems to crash inside XMEMSET:

aes123456=3030bfe8, 3030c128
AES1=3030bfe8, 16, 3030c128

* thread #1, stop reason = Exception 0xc0000005 encountered at address 0x6ef89b: Access violation reading location 0xffffffff
    frame #0: 0x006ef89b test.exe`wc_AesInit(aes=0x3030bfe8, heap=<unavailable>, devId=<unavailable>) at aes.c:10693:5
Process 3468 launched: 'C:\Users\admin\test.exe' (i386)
(lldb) bt
* thread #1, stop reason = Exception 0xc0000005 encountered at address 0x6ef89b: Access violation reading location 0xffffffff
  * frame #0: 0x006ef89b test.exe`wc_AesInit(aes=0x3030bfe8, heap=<unavailable>, devId=<unavailable>) at aes.c:10693:5
    frame #1: 0x00761297 test.exe`SetKeys(enc=0x302ed2b4, dec=0x302ed2cc, keys=<unavailable>, specs=<unavailable>, side=<unavailable>, heap=<unavailable>, devId=<unavailable>, rng=<unavailable>, tls13=<unavailable>) at keys.c:2479:17
    frame #2: 0x00760cc0 test.exe`SetKeysSide(ssl=0x302ed200, side=<unavailable>) at keys.c:2984:15
    frame #3: 0x00728819 test.exe`DoTls13HandShakeMsgType(ssl=<unavailable>, input=<unavailable>, inOutIdx=<unavailable>, type=<unavailable>, size=<unavailable>, totalSz=<unavailable>) at tls13.c:0
    frame #4: 0x00729ced test.exe`DoTls13HandShakeMsg(ssl=0x302ed200, input=<unavailable>, inOutIdx=<unavailable>, totalSz=<unavailable>) at tls13.c:0
    frame #5: 0x007171d6 test.exe`ProcessReplyEx(ssl=<unavailable>, allowSocketErr=0) at internal.c:19828:31
    frame #6: 0x00729f3f test.exe`wolfSSL_connect_TLSv13(ssl=0x302ed200) at tls13.c:11525:35
    frame #7: 0x0074d869 test.exe`wolfssl_connect_common [inlined] wolfssl_connect_step2(cf=<unavailable>, data=<unavailable>) at wolfssl.c:711:9
    frame #8: 0x0074d761 test.exe`wolfssl_connect_common(cf=<unavailable>, data=<unavailable>, nonblocking=<unavailable>, done=0x2dd5d80b) at wolfssl.c:1196:14
    frame #9: 0x006c7983 test.exe`ssl_cf_connect [inlined] wolfssl_connect_nonblocking(cf=<unavailable>, data=<unavailable>, done=0x2dd5d80b) at wolfssl.c:1228:10
    frame #10: 0x006c7977 test.exe`ssl_cf_connect [inlined] ssl_connect_nonblocking(cf=<unavailable>, data=<unavailable>, done=0x2dd5d80b) at vtls.c:348:10
    frame #11: 0x006c7969 test.exe`ssl_cf_connect(cf=<unavailable>, data=<unavailable>, blocking=<unavailable>, done=<unavailable>) at vtls.c:1534:14
    frame #12: 0x006c399d test.exe`cf_setup_connect [inlined] Curl_conn_cf_connect(cf=<unavailable>, data=<unavailable>, blocking=<unavailable>, done=<unavailable>) at cfilters.c:307:12
    frame #13: 0x006c398c test.exe`cf_setup_connect(cf=<unavailable>, data=<unavailable>, blocking=<unavailable>, done=<unavailable>) at connect.c:1164:14
    frame #14: 0x007479b8 test.exe`cf_hc_connect [inlined] Curl_conn_cf_connect(cf=<unavailable>, data=<unavailable>, blocking=false, done=<unavailable>) at cfilters.c:307:12
    frame #15: 0x007479aa test.exe`cf_hc_connect [inlined] cf_hc_baller_connect(b=<unavailable>, cf=0x302c1b60, data=<unavailable>, done=<unavailable>) at cf-http.c:135:15
    frame #16: 0x007479a4 test.exe`cf_hc_connect(cf=0x302c1b60, data=0x30283c18, blocking=<unavailable>, done=0x2dd5d80b) at cf-http.c:288:16
    frame #17: 0x006d1012 test.exe`Curl_conn_connect(data=0x30283c18, sockindex=0, blocking=false, done=0x2dd5d80b) at cfilters.c:370:14
    frame #18: 0x00692b47 test.exe`multi_runsingle(multi=<unavailable>, nowp=<unavailable>, data=0x30283c18) at multi.c:2094:16

calvin2021y commented 1 year ago

Hi @SparkiDev

I have to use OpenSSL to avoid this problem on Win32, which is very complex for me because I use wolfSSL on all other platforms.

Today I did more testing without LTO and found this problem. Can you take a look at this:

* thread #1, stop reason = Exception 0xc0000005 encountered at address 0xdb4aac: Access violation reading location 0xffffffff                                                                                   
    frame #0: 0x00db4aac test.exe`HashSkeData(ssl=0x428280c8, hashType=<unavailable>, data=<unavailable>, sz=<unavailable>, sigAlgo=<unavailable>) at internal.c:26227:50                                       
Process 11124 launched: 'C:\Users\admin\test.exe' (i386)                                                                                                                                                        
(lldb) bt                                                                                                                                                                                                       
* thread #1, stop reason = Exception 0xc0000005 encountered at address 0xdb4aac: Access violation reading location 0xffffffff                                                                                   
  * frame #0: 0x00db4aac test.exe`HashSkeData(ssl=0x428280c8, hashType=<unavailable>, data=<unavailable>, sz=<unavailable>, sigAlgo=<unavailable>) at internal.c:26227:50                                       
    frame #1: 0x00db849e test.exe`DoServerKeyExchange(ssl=<unavailable>, input=<unavailable>, inOutIdx=0x428281c8, size=397) at internal.c:28263:27                                                             
    frame #2: 0x00db759b test.exe`DoHandShakeMsgType(ssl=0x428280c8, input="\f", inOutIdx=<unavailable>, type=<unavailable>, size=<unavailable>, totalSz=<unavailable>) at internal.c:15759:15                  
    frame #3: 0x00da9c0e test.exe`ProcessReplyEx at internal.c:15976:15                                                                                                                                         
    frame #4: 0x00da9bfb test.exe`ProcessReplyEx(ssl=<unavailable>, allowSocketErr=<unavailable>) at internal.c:19876:31
    frame #5: 0x00da836b test.exe`ProcessReply(ssl=<unavailable>) at internal.c:19135:12
    frame #6: 0x00c3c288 test.exe`wolfSSL_connect(ssl=<unavailable>) at ssl.c:13395:36                                                                                                                          
    frame #7: 0x00e0541c test.exe`wolfssl_connect_common [inlined] wolfssl_connect_step2(cf=0x426c87e8, data=0x427ee110) at wolfssl.c:720:9                                                                     
    frame #8: 0x00e052e6 test.exe`wolfssl_connect_common(cf=0x426c87e8, data=<unavailable>, nonblocking=<unavailable>, done=<unavailable>) at wolfssl.c:1225:14                                                 
    frame #9: 0x00e04343 test.exe`wolfssl_connect_nonblocking(cf=0x426c87e8, data=0x427ee110, done=0x2eb9e430) at wolfssl.c:1257:10                                                                             
    frame #10: 0x00cd0d4a test.exe`ssl_cf_connect [inlined] ssl_connect_nonblocking(cf=<unavailable>, data=0x427ee110, done=<unavailable>) at vtls.c:375:10                                                     
    frame #11: 0x00cd0cfc test.exe`ssl_cf_connect(cf=0x426c87e8, data=0x427ee110, blocking=<unavailable>, done=0x2eb9e430) at vtls.c:1547:14                                                                    
    frame #12: 0x00d1ad0e test.exe`cf_setup_connect(cf=<unavailable>, data=<unavailable>, blocking=<unavailable>, done=<unavailable>) at connect.c:1196:14                                                      
    frame #13: 0x00e4252e test.exe`cf_hc_connect [inlined] cf_hc_baller_connect(b=<unavailable>, cf=<unavailable>, data=<unavailable>, done=<unavailable>) at cf-https-connect.c:135:15                         
    frame #14: 0x00e424d0 test.exe`cf_hc_connect(cf=0x42668020, data=0x427ee110, blocking=false, done=0x2eb9e430) at cf-https-connect.c:290:16                                                                  
    frame #15: 0x00d11ab7 test.exe`Curl_conn_connect(data=0x427ee110, sockindex=0, blocking=<unavailable>, done=0x2eb9e430) at cfilters.c:351:14                                                                
    frame #16: 0x00c0c99e test.exe`multi_runsingle(multi=<unavailable>, nowp=<unavailable>, data=<unavailable>) at multi.c:2109:16                                                                              
    frame #17: 0x00c1650f test.exe`multi_socket(multi=<unavailable>, checkall=<unavailable>, s=<unavailable>, ev_bitmask=<unavailable>, running_handles=<unavailable>) at multi.c:3270:16                       
    frame #18: 0x00c16de5 test.exe`curl_multi_socket_action(multi=0x42707d88, s=868, ev_bitmask=1, running_handles=0x2eb9e870) at multi.c:3392:12 
SparkiDev commented 1 year ago

Hi @calvin2021y,

Can you confirm which pointer is 0xffffffff?

ssl->buffers.sig.buffer and ssl->buffers.sig.digest should be allocated in the function HashSkeData(). Can you also please check the memory allocation results to see whether either of these ends up as 0xffffffff?
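
One way to run that check, as a minimal sketch and not something taken from the wolfSSL sources: the helper below (classify_ptr is a hypothetical name) flags a pointer that is NULL or all-ones. It assumes the 0xffffffff from the crash log is the all-ones pointer value of a 32-bit target.

```c
#include <stddef.h>
#include <stdint.h>

/* Classify a pointer value (hypothetical helper, not a wolfSSL API).
 * Returns 0 if it looks usable, 1 if it is NULL,
 * 2 if it is an all-ones sentinel such as 0xffffffff. */
int classify_ptr(const void* p)
{
    uintptr_t v = (uintptr_t)p;

    if (p == NULL)
        return 1;
    /* On a 32-bit target 0xffffffff equals UINTPTR_MAX, i.e. (void*)-1.
     * The explicit 32-bit constant also covers a 64-bit build. */
    if (v == UINTPTR_MAX || v == (uintptr_t)0xffffffffu)
        return 2;
    return 0;
}
```

Calling it on each pointer straight after the allocation, and again just before the crash site, would show whether the allocator returned the bad value or whether the pointer was overwritten later.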

Thanks, Sean

calvin2021y commented 1 year ago

hi @SparkiDev

I am very sorry for wasting your time and providing a lot of wrong information. (I guess the llvm-mingw toolchain gave me wrong debug data, but I am not sure.)

I tried many ways to debug this problem, and in the end it turned out that I cannot use LTO with quickjs on win32. Turning off LTO for quickjs while keeping LTO for curl and wolfSSL fixed the problem.

calvin2021y commented 1 year ago

I did many more tests with different compilation combinations. Finally I got this result with quickjs LTO + wolfSSL LTO (for the i686 target):

Using the existing libraries (all LTO, without ASan) and object files (all LTO, without ASan), changing only the link step gives different results:

Linking with libclang_rt.asan_dynamic-i386.dll.a and libclang_rt.asan_dynamic_runtime_thunk-i386.a: no crash.

Linking without ASan: crash.

All files being linked are built without ASan; just adding the ASan libraries at the link stage makes the problem go away. (I want to distribute the binary without an extra ASan DLL.)

Any idea what could cause this?

Any combination with OpenSSL works.

SparkiDev commented 1 year ago

Using the address sanitiser version of the libraries should not make a difference. It does do more checking, but it should be finding problems rather than fixing problems.

Make sure the versions of the libraries are the same.

Could you please check where the address 0xffffffff is coming from? It would be helpful to know whether it is coming from malloc. Do you have a custom version of malloc that you use?

Sean

calvin2021y commented 1 year ago

Using the address sanitiser version of the libraries should not make a difference. It does do more checking, but it should be finding problems rather than fixing problems.

Make sure the versions of the libraries are the same.

Here is an update with the recent details: https://github.com/mstorsjo/llvm-mingw/issues/347

Could you please check where the address 0xffffffff is coming from? It would be helpful to know whether it is coming from malloc. Do you have a custom version of malloc that you use?

I used this patch:

diff --git a/src/keys.c b/src/keys.c
index 8f960ba0e..9a3c3c04d 100644
--- a/src/keys.c
+++ b/src/keys.c
@@ -2233,13 +2233,13 @@ static int SetPrefix(byte* sha_input, int idx)
 }
 #endif

-
+__attribute__((noinline))
 static int SetKeys(Ciphers* enc, Ciphers* dec, Keys* keys, CipherSpecs* specs,
                    int side, void* heap, int devId, WC_RNG* rng, int tls13)
 {
     (void)rng;
     (void)tls13;
-
+    printf("SetKeys %d heap=%p\n", __LINE__, heap);
 #ifdef BUILD_ARC4
     if (specs->bulk_cipher_algorithm == wolfssl_rc4) {
         word32 sz = specs->key_size;
@@ -2458,12 +2458,16 @@ static int SetKeys(Ciphers* enc, Ciphers* dec, Keys* keys, CipherSpecs* specs,
             XMEMSET(dec->aes, 0, sizeof(Aes));
         }
         if (enc) {
+           fprintf(stderr, "aes123456=%p, %p\n", enc->aes, enc->aes->aadH);
             if (wc_AesInit(enc->aes, heap, devId) != 0) {
                 WOLFSSL_MSG("AesInit failed in SetKeys");
                 return ASYNC_INIT_E;
             }
         }
         if (dec) {
+            fprintf(stderr, "dec = %p\n", dec);
+            fprintf(stderr, "dec->aes = %p\n", dec->aes);
+            fprintf(stderr, "dec->aes->aadLen = %d\n", dec->aes->aadLen);
             if (wc_AesInit(dec->aes, heap, devId) != 0) {
                 WOLFSSL_MSG("AesInit failed in SetKeys");
                 return ASYNC_INIT_E;
@@ -2544,6 +2548,7 @@ static int SetKeys(Ciphers* enc, Ciphers* dec, Keys* keys, CipherSpecs* specs,
         }

         if (enc) {
+           fprintf(stderr, "wc_AesInit=%p, heap=%p, devId=%d\n", enc->aes, heap, devId);
             if (wc_AesInit(enc->aes, heap, devId) != 0) {
                 WOLFSSL_MSG("AesInit failed in SetKeys");
                 return ASYNC_INIT_E;
diff --git a/wolfcrypt/src/aes.c b/wolfcrypt/src/aes.c
index 042331ad4..fc5df967d 100644
--- a/wolfcrypt/src/aes.c
+++ b/wolfcrypt/src/aes.c
@@ -9572,7 +9572,11 @@ int wc_AesInit(Aes* aes, void* heap, int devId)

 #ifdef HAVE_AESGCM
 #ifdef OPENSSL_EXTRA
+    fprintf(stderr, "AES1=%p, %zu, %p\n", aes, sizeof(aes->aadH), aes->aadH);
+   fflush(stderr);
     XMEMSET(aes->aadH, 0, sizeof(aes->aadH));
+    fprintf( stderr, "aes=%p before aadLen\n", aes);
+   fflush(stderr);
     aes->aadLen = 0;
 #endif
 #endif

result:

SetKeys 2242 heap=00000000
wc_AesInit=00A20E08, heap=00000000, devId=-2
AES1=00A20E08, 16, 00A20F48                 
Segmentation fault 

Process 9100 stopped
indexing DWARF for test.exe...
* thread #1, stop reason = Exception 0xc0000005 encountered at address 0xd2097d: Access violation reading location 0xffffffff
    frame #0: 0x00d2097d test.exe`wc_AesInit(aes=<unavailable>, heap=<unavailable>, devId=<unavailable>) at aes.c:9577:5
Process 9100 launched: 'C:\Users\admin\test.exe' (i386)
(lldb) bt
* thread #1, stop reason = Exception 0xc0000005 encountered at address 0xd2097d: Access violation reading location 0xffffffff
  * frame #0: 0x00d2097d test.exe`wc_AesInit(aes=<unavailable>, heap=<unavailable>, devId=<unavailable>) at aes.c:9577:5
    frame #1: 0x00daadfc test.exe`SetKeys(enc=0x30eac0b4, dec=0x00000000, keys=<unavailable>, specs=<unavailable>, side=<unavailable>, heap=<unavailable>, devId=<unavailable>, rng=<unavailable>, tls13=<unavailable>) at keys.c:2551:17
    frame #2: 0x00daa726 test.exe`SetKeysSide(ssl=<unavailable>, side=<unavailable>) at keys.c:3056:15
    frame #3: 0x00d52a75 test.exe`SendChangeCipher(ssl=<unavailable>) at internal.c:20345:20
    frame #4: 0x00cdf662 test.exe`wolfSSL_connect(ssl=<unavailable>) at ssl.c:13565:32
    frame #5: 0x00d7e224 test.exe`wolfssl_connect_common [inlined] wolfssl_connect_step2(cf=<unavailable>, data=<unavailable>) at wolfssl.c:720:9
    frame #6: 0x00d7e1ae test.exe`wolfssl_connect_common(cf=<unavailable>, data=<unavailable>, nonblocking=<unavailable>, done=<unavailable>) at wolfssl.c:1225:14
    frame #7: 0x00d7de13 test.exe`wolfssl_connect_nonblocking(cf=<unavailable>, data=<unavailable>, done=<unavailable>) at wolfssl.c:1257:10
    frame #8: 0x00cf7df8 test.exe`ssl_cf_connect [inlined] ssl_connect_nonblocking(cf=<unavailable>, data=0x30ea6db0, done=<unavailable>) at vtls.c:375:10
    frame #9: 0x00cf7dd7 test.exe`ssl_cf_connect(cf=0x2ec2a1b8, data=0x30ea6db0, blocking=<unavailable>, done=0x2e7ce5df) at vtls.c:1547:14
    frame #10: 0x00d07718 test.exe`cf_setup_connect(cf=<unavailable>, data=<unavailable>, blocking=<unavailable>, done=<unavailable>) at connect.c:1196:14
    frame #11: 0x00d8b738 test.exe`cf_hc_connect [inlined] cf_hc_baller_connect(b=<unavailable>, cf=<unavailable>, data=<unavailable>, done=<unavailable>) at cf-https-connect.c:135:15
    frame #12: 0x00d8b724 test.exe`cf_hc_connect(cf=0x2ec2e088, data=0x30ea6db0, blocking=<unavailable>, done=0x2e7ce5df) at cf-https-connect.c:290:16
    frame #13: 0x00d055c2 test.exe`Curl_conn_connect(data=0x30ea6db0, sockindex=0, blocking=<unavailable>, done=0x2e7ce5df) at cfilters.c:351:14
    frame #14: 0x00cd7d4c test.exe`multi_runsingle(multi=<unavailable>, nowp=<unavailable>, data=0x30ea6db0) at multi.c:2109:16
    frame #15: 0x00cd9981 test.exe`multi_socket(multi=<unavailable>, checkall=<unavailable>, s=788, ev_bitmask=1, running_handles=0x2e7ce6a4) at multi.c:3270:16
    frame #16: 0x00cd9b05 test.exe`curl_multi_socket_action(multi=<unavailable>, s=<unavailable>, ev_bitmask=<unavailable>, running_handles=<unavailable>) at multi.c:3392:12

I am not sure how to track down where the 0xffffffff comes from. Can you give more tips about the patch?

@SparkiDev

calvin2021y commented 1 year ago

The crash is caused by XMEMSET(aes->aadH, 0, sizeof(aes->aadH)); which runs right after fprintf(stderr, "AES1=%p, %zu, %p\n", aes, sizeof(aes->aadH), aes->aadH);

calvin2021y commented 1 year ago

Make sure the versions of the libraries are the same.

I am sure they are the same; I tried a fresh install and build to fix this problem.

calvin2021y commented 1 year ago

Does the 0xffffffff come from XMEMSET(aes->aadH, 0, sizeof(aes->aadH)); ?

calvin2021y commented 1 year ago

The crash is not always at the same location:

Process 12692 stopped
indexing DWARF for test.exe...
* thread #1, stop reason = Exception 0xc0000005 encountered at address 0x4dbabc: Access violation reading location 0xffffffff                                                                                   
    frame #0: 0x004dbabc test.exe`HashSkeData(ssl=0x305e0158, hashType=<unavailable>, data=<unavailable>, sz=<unavailable>, sigAlgo=<unavailable>) at internal.c:26227:50                                       
Process 12692 launched: 'C:\Users\admin\test.exe' (i386)                                                                                                                                                        
(lldb) bt                                                                                                                                                                                                       
* thread #1, stop reason = Exception 0xc0000005 encountered at address 0x4dbabc: Access violation reading location 0xffffffff                                                                                   
  * frame #0: 0x004dbabc test.exe`HashSkeData(ssl=0x305e0158, hashType=<unavailable>, data=<unavailable>, sz=<unavailable>, sigAlgo=<unavailable>) at internal.c:26227:50                                       
    frame #1: 0x004df534 test.exe`DoServerKeyExchange(ssl=<unavailable>, input=<unavailable>, inOutIdx=<unavailable>, size=<unavailable>) at internal.c:28263:27                                                
    frame #2: 0x004de3d6 test.exe`DoHandShakeMsgType(ssl=0x305e0158, input="\f", inOutIdx=<unavailable>, type=<unavailable>, size=<unavailable>, totalSz=<unavailable>) at internal.c:15759:15                  
    frame #3: 0x004d142e test.exe`ProcessReplyEx at internal.c:15976:15                                                                                                                                         
    frame #4: 0x004d141b test.exe`ProcessReplyEx(ssl=<unavailable>, allowSocketErr=<unavailable>) at internal.c:19876:31
    frame #5: 0x004cfb7b test.exe`ProcessReply(ssl=<unavailable>) at internal.c:19135:12
    frame #6: 0x0045f5db test.exe`wolfSSL_connect(ssl=<unavailable>) at ssl.c:13395:36                                                                                                                          
    frame #7: 0x004fe224 test.exe`wolfssl_connect_common [inlined] wolfssl_connect_step2(cf=<unavailable>, data=<unavailable>) at wolfssl.c:720:9                                                               
    frame #8: 0x004fe1ae test.exe`wolfssl_connect_common(cf=<unavailable>, data=<unavailable>, nonblocking=<unavailable>, done=<unavailable>) at wolfssl.c:1225:14                                              
    frame #9: 0x004fde13 test.exe`wolfssl_connect_nonblocking(cf=<unavailable>, data=<unavailable>, done=<unavailable>) at wolfssl.c:1257:10                                                                    
    frame #10: 0x00477df8 test.exe`ssl_cf_connect [inlined] ssl_connect_nonblocking(cf=<unavailable>, data=0x305f5ff8, done=<unavailable>) at vtls.c:375:10                                                     
    frame #11: 0x00477dd7 test.exe`ssl_cf_connect(cf=0x2e488460, data=0x305f5ff8, blocking=<unavailable>, done=0x2e15e5bf) at vtls.c:1547:14                                                                    
    frame #12: 0x00487718 test.exe`cf_setup_connect(cf=<unavailable>, data=<unavailable>, blocking=<unavailable>, done=<unavailable>) at connect.c:1196:14                                                      
    frame #13: 0x0050b738 test.exe`cf_hc_connect [inlined] cf_hc_baller_connect(b=<unavailable>, cf=<unavailable>, data=<unavailable>, done=<unavailable>) at cf-https-connect.c:135:15                         
    frame #14: 0x0050b724 test.exe`cf_hc_connect(cf=0x2e44e1a0, data=0x305f5ff8, blocking=<unavailable>, done=0x2e15e5bf) at cf-https-connect.c:290:16                                                          
    frame #15: 0x004855c2 test.exe`Curl_conn_connect(data=0x305f5ff8, sockindex=0, blocking=<unavailable>, done=0x2e15e5bf) at cfilters.c:351:14                                                                
    frame #16: 0x00457d4c test.exe`multi_runsingle(multi=<unavailable>, nowp=<unavailable>, data=0x305f5ff8) at multi.c:2109:16                                                                                 
    frame #17: 0x00459981 test.exe`multi_socket(multi=<unavailable>, checkall=<unavailable>, s=664, ev_bitmask=1, running_handles=0x2e15e684) at multi.c:3270:16                                                
    frame #18: 0x00459b05 test.exe`curl_multi_socket_action(multi=<unavailable>, s=<unavailable>, ev_bitmask=<unavailable>, running_handles=<unavailable>) at multi.c:3392:12  
calvin2021y commented 1 year ago

Could it be a thread race? (Maybe with ASan or -O0 it is too slow to trigger the race.)

I did more tests: with -O1 it does not always crash at the same location, and it does not always crash. (Without lldb it runs OK only on rare occasions.)

With lldb it crashes at the same location most of the time.

SparkiDev commented 1 year ago

Hi @calvin2021y,

It is possible that it is a threading issue. Calling memset on one piece of memory should not invalidate a variable in another place.

Please check the addresses of the two buffers. That is, does the memory allocated to aes overlap with ssl->buffers?

My best guess is that the memory allocation is not working. That is, the same area of memory is allocated to two pointers. This can happen if the memory allocation function is faulty or not thread safe. Can you confirm that normal malloc() and free() functions are being used? Maybe a fast malloc is being used or a special static memory allocator?
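
A minimal sketch of such an overlap check, under the assumption that both the start address and the size of each allocation are known at the point of logging (ranges_overlap is a hypothetical helper, not part of wolfSSL):

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if [a, a + a_len) and [b, b + b_len) share at least one byte,
 * 0 otherwise. Hypothetical helper for debugging only. */
int ranges_overlap(const void* a, size_t a_len,
                   const void* b, size_t b_len)
{
    uintptr_t a0 = (uintptr_t)a;
    uintptr_t b0 = (uintptr_t)b;

    /* An empty range cannot overlap anything. */
    if (a_len == 0 || b_len == 0)
        return 0;
    /* Compare the distance between the starts with the length of the
     * lower range; this avoids overflow in a0 + a_len. */
    if (a0 <= b0)
        return (b0 - a0) < a_len;
    return (a0 - b0) < b_len;
}
```

For the case in this thread it could be called with the aes pointer and sizeof(Aes) on one side and the suspect buffer in ssl->buffers with its allocated size on the other. A result of 1 would point at the allocator handing out the same memory twice.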

Thanks, Sean

calvin2021y commented 1 year ago

I am using the default malloc. I am not sure how to check the ssl->buffers pointer. I added this patch for aes:

diff --git a/src/keys.c b/src/keys.c
index 8f960ba0e..0dfe2749d 100644
--- a/src/keys.c
+++ b/src/keys.c
@@ -2233,13 +2233,13 @@ static int SetPrefix(byte* sha_input, int idx)
 }
 #endif

-
+__attribute__((noinline))
 static int SetKeys(Ciphers* enc, Ciphers* dec, Keys* keys, CipherSpecs* specs,
                    int side, void* heap, int devId, WC_RNG* rng, int tls13)
 {
     (void)rng;
     (void)tls13;
-
+    printf("SetKeys %d heap=%p\n", __LINE__, heap);
 #ifdef BUILD_ARC4
     if (specs->bulk_cipher_algorithm == wolfssl_rc4) {
         word32 sz = specs->key_size;
@@ -2458,12 +2458,14 @@ static int SetKeys(Ciphers* enc, Ciphers* dec, Keys* keys, CipherSpecs* specs,
             XMEMSET(dec->aes, 0, sizeof(Aes));
         }
         if (enc) {
+           fprintf(stderr, "[%d] aes=%p, %zu\n", __LINE__, enc->aes, sizeof((enc->aes)[0]));
             if (wc_AesInit(enc->aes, heap, devId) != 0) {
                 WOLFSSL_MSG("AesInit failed in SetKeys");
                 return ASYNC_INIT_E;
             }
         }
         if (dec) {
+           fprintf(stderr, "[%d] dec=%p, %zu\n", __LINE__, dec->aes, sizeof((dec->aes)[0]));
             if (wc_AesInit(dec->aes, heap, devId) != 0) {
                 WOLFSSL_MSG("AesInit failed in SetKeys");
                 return ASYNC_INIT_E;
@@ -2544,6 +2546,8 @@ static int SetKeys(Ciphers* enc, Ciphers* dec, Keys* keys, CipherSpecs* specs,
         }

         if (enc) {
+           fprintf(stderr, "wc_AesInit heap=%p, devId=%d\n", heap, devId);
+           fprintf(stderr, "[%d] aes=%p, %zu\n", __LINE__, enc->aes, sizeof((enc->aes)[0]));
             if (wc_AesInit(enc->aes, heap, devId) != 0) {
                 WOLFSSL_MSG("AesInit failed in SetKeys");
                 return ASYNC_INIT_E;
diff --git a/wolfcrypt/src/aes.c b/wolfcrypt/src/aes.c
index 042331ad4..fc5df967d 100644
--- a/wolfcrypt/src/aes.c
+++ b/wolfcrypt/src/aes.c
@@ -9572,7 +9572,11 @@ int wc_AesInit(Aes* aes, void* heap, int devId)

 #ifdef HAVE_AESGCM
 #ifdef OPENSSL_EXTRA
+    fprintf(stderr, "AES1=%p, %zu, %p\n", aes, sizeof(aes->aadH), aes->aadH);
+   fflush(stderr);
     XMEMSET(aes->aadH, 0, sizeof(aes->aadH));
+    fprintf( stderr, "aes=%p before aadLen\n", aes);
+   fflush(stderr);
     aes->aadLen = 0;
 #endif
 #endif

I also enabled debug logging. Sometimes it crashes in a place unrelated to aes:

Shrinking input buffer
received record layer msg
got HANDSHAKE
wolfSSL Entering DoHandShakeMsg
wolfSSL Entering DoHandShakeMsgType
processing server hello done
wolfSSL Leaving DoHandShakeMsgType(), return 0
wolfSSL Leaving DoHandShakeMsg(), return 0
connect state: HELLO_AGAIN
connect state: HELLO_AGAIN_REPLY
connect state: FIRST_REPLY_DONE
connect state: FIRST_REPLY_FIRST
wolfSSL Entering SendClientKeyExchange
wolfSSL Entering EccMakeKey
wolfSSL Leaving EccMakeKey, return 0
wolfSSL Entering EccSharedSecret
wolfSSL Leaving EccSharedSecret, return 0
growing output buffer
Shrinking output buffer
wolfSSL Leaving SendClientKeyExchange, return 0
sent: client key exchange
connect state: FIRST_REPLY_SECOND
connect state: FIRST_REPLY_THIRD
growing output buffer
Segmentation fault
SparkiDev commented 1 year ago

Hi @calvin2021y

You can print out the pointer just before the crash, that is, at internal.c:26227: fprintf(stderr, "ssl->buffer=%p\n", ssl->buffer);
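
If the output of %p differs between toolchains, a fixed-width hex format may be easier to compare across runs. This is only a sketch: format_ptr and log_ptr are hypothetical helpers, and the explicit flush is there on the assumption that stderr may be redirected or buffered, so the line is written before a crash.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format "label=0x<hex>" with a width that matches the pointer size.
 * Returns the value of snprintf(). Hypothetical helper. */
int format_ptr(char* out, size_t out_sz, const char* label, const void* p)
{
    return snprintf(out, out_sz, "%s=0x%0*" PRIxPTR, label,
                    (int)(sizeof(void*) * 2), (uintptr_t)p);
}

/* Write the formatted pointer to stderr and flush immediately. */
void log_ptr(const char* label, const void* p)
{
    char line[64];

    format_ptr(line, sizeof(line), label, p);
    fprintf(stderr, "%s\n", line);
    fflush(stderr);
}
```

A call such as log_ptr("sig.buffer", ptr) placed directly before the crashing line gives one line per pointer that can be compared between runs.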

Thanks, Sean

calvin2021y commented 1 year ago

Thanks for the tips:

ssl->buffer=01030316
SetKeys 2242 heap=00000000        
wc_AesInit heap=00000000, devId=-2
[2550] aes=2f1feb78, 880          
AES1=2f1feb78, 16, 2f1fecb8       
Segmentation fault 
SparkiDev commented 3 weeks ago

Hi @calvin2021y

I'm afraid I don't have any answers for this. We haven't had other customers with a similar issue. Closing ticket.

Sean