Closed chipitsine closed 1 year ago
sorry, it seem to be docker caching ((
Heh. I prefer false alarms over unreported issues :)
In your repo I saw that you enabled HAVE_SSL_KEYLOG. I don't think that's worth doing. We only have a dummy keylog API to avoid patching some OpenBSD ports. This doesn't do anything and we don't really plan on supporting this.
I've double checked, cache is not involved. It's not a false alarm.
in logs I see haproxy version HAProxy version cf950d9b69e307279d61999c05c29fb89073245d 2023/05/17 - https://haproxy.org/
it is https://github.com/chipitsine/haproxy/commit/cf950d9b69e307279d61999c05c29fb89073245d
that means, we already have chacha20_poly1305: https://github.com/chipitsine/haproxy/commit/97c344dae0f75220ffe48949977192f43b9c3d75
meanwhile, there's still
AddressSanitizer:DEADLYSIGNAL
=================================================================
==31==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000010 (pc 0x7f25ad8979e8 bp 0x7fffed029cc0 sp 0x7fffed029c80 T0)
==31==The signal is caused by a WRITE memory access.
==31==Hint: address points to the zero page.
#0 0x7f25ad8979e7 in EVP_CipherInit_ex evp/evp_enc.c:95
#1 0x7f25ad89884d in EVP_DecryptInit_ex evp/evp_enc.c:292
#2 0x556e1983053b in quic_tls_decrypt src/quic_tls.c:564
#3 0x556e1979b18c in qc_pkt_decrypt src/quic_conn.c:1625
#4 0x556e197b6c04 in qc_treat_rx_pkts src/quic_conn.c:4573
#5 0x556e197bbef2 in quic_conn_app_io_cb src/quic_conn.c:5029
#6 0x556e19e1948b in run_tasks_from_lists src/task.c:596
#7 0x556e19e1b27c in process_runnable_tasks src/task.c:876
#8 0x556e19d0eed6 in run_poll_loop src/haproxy.c:2968
#9 0x556e19d0fd70 in run_thread_poll_loop src/haproxy.c:3167
#10 0x556e19d12bcd in main src/haproxy.c:3827
#11 0x7f25ad4320b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x240b2)
#12 0x556e196e3dbd in _start (/usr/local/sbin/haproxy+0x1b0dbd)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV evp/evp_enc.c:95 in EVP_CipherInit_ex
==31==ABORTING
(I'll drop latest commit, but currently it helps me to run QUIC Interop, because it's configuration requires key logging)
server | HAProxy version cf950d9b69e307279d61999c05c29fb89073245d 2023/05/17 - https://haproxy.org/
server | Status: development branch - not safe for use in production.
server | Known bugs: https://github.com/haproxy/haproxy/issues?q=is:issue+is:open
server | Running on: Linux 6.2.15-300.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Thu May 11 17:37:39 UTC 2023 x86_64
server | Build options :
server | TARGET = linux-glibc
server | CPU = generic
server | CC = gcc
server | CFLAGS = -pg -O0 -g -Wno-deprecated-declarations -fsanitize=address -Wall -Wextra -Wundef -Wdeclaration-after-statement -Wfatal-errors -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference -fwrapv -Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wno-string-plus-int -Wno-atomic-alignment -Werror
server | OPTIONS = USE_OPENSSL=1 USE_LUA=1 USE_OBSOLETE_LINKER=1 USE_QUIC=1
server | DEBUG = -DDEBUG_DONT_SHARE_POOLS -DDEBUG_MEMORY_POOLS -DDEBUG_STRICT=2 -DDEBUG_TASK -DDEBUG_FAIL_ALLOC
server |
server | Feature list : -51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY +CRYPT_H -DEVICEATLAS +DL -ENGINE +EPOLL -EVPORTS +GETADDRINFO -KQUEUE -LIBATOMIC +LIBCRYPT +LINUX_SPLICE +LINUX_TPROXY +LUA +MATH -MEMORY_PROFILING +NETFILTER +NS +OBSOLETE_LINKER +OPENSSL -OPENSSL_WOLFSSL -OT -PCRE -PCRE2 -PCRE2_JIT -PCRE_JIT +POLL +PRCTL -PROCCTL -PROMEX -PTHREAD_EMULATION +QUIC +RT +SHM_OPEN +SLZ +SSL -STATIC_PCRE -STATIC_PCRE2 -SYSTEMD +TFO +THREAD +THREAD_DUMP +TPROXY -WURFL -ZLIB
server |
server | Default settings :
server | bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
server |
server | Built with multi-threading support (MAX_TGROUPS=16, MAX_THREADS=256, default=2).
server | Built with gcc compiler version 9.4.0 with address sanitizer
server | Encrypted password support via crypt(3): yes
server | Built without PCRE or PCRE2 support (using libc's regex instead)
server | Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
server | Built with libslz for stateless compression.
server | Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
server | Running with a replaced memory allocator (e.g. via LD_PRELOAD).
server | Built with network namespace support.
server | Built with Lua version : Lua 5.3.3
server | Built with OpenSSL version : LibreSSL 3.8.0
server | Running on OpenSSL version : LibreSSL 3.8.0
server | OpenSSL library supports TLS extensions : yes
server | OpenSSL library supports SNI : yes
server | OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Thanks. I can only state the obvious so far: for some reason haproxy calls EVP_DecryptInit_ex() with an EVP_CIPHER_CTX of NULL since rx_ctx is NULL in qc_pkt_decrypt(). It means an assumption is violated somewhere.
Since you told us this is LibreSSL specific this can be misbehavior or a bug in LibreSSL that messes up haproxy's internal state tracking, some race because we don't have some locking that haproxy expects us to have, or a bug in haproxy, or something else entirely. I can't currently tell.
We will need to have a good reproducer to dig deeper into this.
Unfortunately, after doing the reproduction steps, python run.py
stops
immediately with:
Saving logs to logs_2023-05-25T00:22:18.
haproxy client not compliant.
Not compliant, skipping
Run took 0:00:00.415228
+---------+---------+
| | haproxy |
+---------+---------+
| quic-go | |
| | |
| | |
+---------+---------+
+---------+---------+
| | haproxy |
+---------+---------+
| quic-go | |
+---------+---------+
Does that ring a bell?
quic interop is niche tool. it set up several docker containers using docker-compose. I was able to run it on Fedora.
when I try it on Ubuntu 20.04 and 22.04 (I ran it as python run.py --debug
)
Creating network "quic-interop-runner_leftnet" with driver "bridge"
Creating network "quic-interop-runner_rightnet" with driver "bridge"
Removing sim
Recreating 476b3b4333b1_sim ...
Recreating 476b3b4333b1_sim ... done
Recreating client ...
Recreating client ... done
Attaching to sim, client
client | Setting up routes...
client | Actual changes:
client | tx-checksumming: off
client | tx-checksum-ip-generic: off
client | tx-checksum-sctp: off
client | tcp-segmentation-offload: off
client | tx-tcp-segmentation: off [requested on]
client | tx-tcp-ecn-segmentation: off [requested on]
client | tx-tcp-mangleid-segmentation: off [requested on]
client | tx-tcp6-segmentation: off [requested on]
client | Endpoint's IPv4 address is 193.167.0.100
client | Endpoint's IPv6 address is fd00:cafe:cafe::100
sim | ip6tables v1.8.4 (legacy): can't initialize ip6tables table `filter': Table does not exist (do you need to insmod?)
sim | Perhaps ip6tables or your kernel needs to be upgraded.
sim exited with code 3
3
Aborting on container exit...
Not compliant, skipping
I couldn't figure out how to deal with that ((
I'll try to get it running on Ubuntu
(this looks like a candidate https://stackoverflow.com/questions/58726984/docker-errno-2-ip6tables-v1-6-1-cant-initialize-ip6tables-table-filter-ta )
@botovq ,
I confirm that modprobe ip6table_filter
did a trick on Ubuntu.
QUIC Interop perform about a dozen of tests, however only couple of them fail
logs_2023-05-26T20:10:28/haproxy_aioquic/handshakecorruption/server/log.txt:SUMMARY: AddressSanitizer: SEGV evp/evp_enc.c:95 in EVP_CipherInit_ex
logs_2023-05-26T20:10:28/haproxy_aioquic/transfercorruption/server/log.txt:SUMMARY: AddressSanitizer: SEGV evp/evp_enc.c:95 in EVP_CipherInit_ex
those are "corruption" test with high rate of renegotiations, looks interesting
class TestCaseHandshakeCorruption(TestCaseHandshakeLoss):
@staticmethod
def name():
return "handshakecorruption"
@staticmethod
def abbreviation():
return "C1"
@staticmethod
def desc():
return "Handshake completes under extreme packet corruption."
@staticmethod
def scenario() -> str:
""" Scenario for the ns3 simulator """
return "corrupt-rate --delay=15ms --bandwidth=10Mbps --queue=25 --rate_to_server=30 --rate_to_client=30"
I've run tests again, seems there's another regression.
AddressSanitizer:DEADLYSIGNAL
=================================================================
==28==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f0b906c9d78 bp 0x7f0b8d1f24e0 sp 0x7f0b8d1f1c48 T2)
==28==The signal is caused by a READ memory access.
==28==Hint: address points to the zero page.
#0 0x7f0b906c9d77 (/lib/x86_64-linux-gnu/libc.so.6+0xbbd77)
#1 0x7f0b9109b37e in __interceptor_memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:790
#2 0x7f0b90ace889 in pkey_hkdf_ctrl kdf/hkdf_evp.c:138
#3 0x7f0b90aadb90 in EVP_PKEY_CTX_ctrl evp/pmeth_lib.c:365
#4 0x55e525543b43 in quic_hkdf_expand src/quic_tls.c:129
#5 0x55e525544262 in quic_hkdf_expand_label src/quic_tls.c:224
#6 0x55e525544981 in quic_tls_sec_update src/quic_tls.c:320
#7 0x55e5254a7959 in quic_tls_key_update src/quic_conn.c:942
#8 0x55e5254b81ae in qc_provide_cdata src/quic_conn.c:2509
#9 0x55e5254bd974 in qc_handle_crypto_frm src/quic_conn.c:2976
#10 0x55e5254bfa33 in qc_parse_pkt_frms src/quic_conn.c:3203
#11 0x55e5254ca8d3 in qc_treat_rx_pkts src/quic_conn.c:4597
#12 0x55e5254d0a4b in quic_conn_io_cb src/quic_conn.c:5135
#13 0x55e525b3363e in run_tasks_from_lists src/task.c:596
#14 0x55e525b3542f in process_runnable_tasks src/task.c:876
#15 0x55e525a28e66 in run_poll_loop src/haproxy.c:2968
#16 0x55e525a29d00 in run_thread_poll_loop src/haproxy.c:3167
#17 0x7f0b90d6e608 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x8608)
#18 0x7f0b9072d162 in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x11f162)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/lib/x86_64-linux-gnu/libc.so.6+0xbbd77)
Thread T2 created by T0 here:
#0 0x7f0b9103a815 in __interceptor_pthread_create ../../../../src/libsanitizer/asan/asan_interceptors.cc:208
#1 0x55e525c73e0e in setup_extra_threads src/thread.c:252
#2 0x55e525a2cd26 in main src/haproxy.c:3832
#3 0x7f0b906320b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x240b2)
@botovq , I've compared OpenSSL source, it allows null key
https://github.com/quictls/openssl/blob/OpenSSL_1_1_1s%2Bquic/crypto/kdf/hkdf.c#L113
should LibreSSL contain "p2" validation against NULL as well ?
currently LibreSSL is using memcpy instead of OPENSSL_memdup
Thanks, good catch. We do not have OPENSSL_memdup()
. It was missed in the conversion that it was NULL
-safe (and the other cases in the switch have explicit checks...).
The below should align us with OpenSSL behavior, except that we don't leave the kctx->key_len
inconsistent with kctx->key
. I will see about landing something like this.
Index: kdf/hkdf_evp.c
===================================================================
RCS file: /cvs/src/lib/libcrypto/kdf/hkdf_evp.c,v
retrieving revision 1.19
diff -u -p -r1.19 hkdf_evp.c
--- kdf/hkdf_evp.c 26 Nov 2022 16:08:53 -0000 1.19
+++ kdf/hkdf_evp.c 25 Jun 2023 21:13:17 -0000
@@ -129,10 +129,17 @@ pkey_hkdf_ctrl(EVP_PKEY_CTX *ctx, int ty
return 1;
case EVP_PKEY_CTRL_HKDF_KEY:
- if (p1 <= 0)
+ if (p1 < 0)
return 0;
freezero(kctx->key, kctx->key_len);
+ kctx->key = NULL;
+ kctx->key_len = 0;
+
+ /* Match OpenSSL's behavior. */
+ if (p1 == 0 || p2 == NULL)
+ return 0;
+
if ((kctx->key = malloc(p1)) == NULL)
return 0;
memcpy(kctx->key, p2, p1);
Committed and pushed to github: https://github.com/libressl/openbsd/commit/ac48eb74f5d0f870c63d096e150a1c05a42b8bf4
great, I ran tests, now I see
==28==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000010 (pc 0x7fddcff310f1 bp 0x7fff3c12bd60 sp 0x7fff3c12bd20 T0)
==28==The signal is caused by a WRITE memory access.
==28==Hint: address points to the zero page.
#0 0x7fddcff310f0 in EVP_CipherInit_ex evp/evp_enc.c:95
#1 0x7fddcff31f56 in EVP_DecryptInit_ex evp/evp_enc.c:292
#2 0x55c30d333645 in quic_tls_decrypt src/quic_tls.c:564
#3 0x55c30d29cee0 in qc_pkt_decrypt src/quic_conn.c:1650
#4 0x55c30d2b87cf in qc_treat_rx_pkts src/quic_conn.c:4591
#5 0x55c30d2bdaa7 in quic_conn_app_io_cb src/quic_conn.c:5047
#6 0x55c30d92163e in run_tasks_from_lists src/task.c:596
#7 0x55c30d92342f in process_runnable_tasks src/task.c:876
#8 0x55c30d816e66 in run_poll_loop src/haproxy.c:2968
#9 0x55c30d817d00 in run_thread_poll_loop src/haproxy.c:3167
#10 0x55c30d81ad37 in main src/haproxy.c:3838
#11 0x7fddcf90f0b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x240b2)
#12 0x55c30d1e524d in _start (/usr/local/sbin/haproxy+0x1b224d)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV evp/evp_enc.c:95 in EVP_CipherInit_ex
==28==ABORTING
This looks like our old friend again. I asked @4a6f656c to take a look.
I do not make friendship with bugs, however I agree we seen it already
I re-ran test suite, only this caught. seems original issue is gone
not reproduced anymore.
LibreSSL - master branch haproxy - https://github.com/chipitsine/haproxy caught by running QUIC Interop suite
repro steps: