Open sleipper opened 10 months ago
The underlying pathology of this took some digging. All of the following need to be true:
SslMode=Required
Pooling=False
This shows up on the Windows side as a warning in the System event log (see attached):
The remote server has requested TLS client authentication, but no suitable client certificate could be found. An anonymous connection will be attempted. This TLS connection request may succeed or fail, depending on the server's policy settings.
I suspect that this very esoteric bug started when this was merged: https://github.com/sysown/proxysql/commit/28f09bfb7c809a94a9b09b25704bb9c3a3624723
which causes this code branch to execute: https://github.com/sysown/proxysql/blob/1f78c9e7f01f1b30afd50b13cdea9db7310a119c/src/proxy_tls.cpp#L423
and the Microsoft client-side TLS caching behaves very badly when something about the TLS setup has changed (perhaps it gets the tmp context instead of the global one, or something similar; it's impossible to tell without looking at the source code).
After several days of soak-testing, I have a one-line fix for this:
--- a/src/proxy_tls.cpp
+++ b/src/proxy_tls.cpp
@@ -477,7 +477,8 @@ int ProxySQL_create_or_load_TLS(bool bootstrap, std::string& msg) {
 		}
 	}
 	if (ret == 0) {
-		SSL_CTX_set_verify(GloVars.global.ssl_ctx, SSL_VERIFY_PEER|SSL_VERIFY_CLIENT_ONCE, callback_ssl_verify_peer);
+		// https://github.com/sysown/proxysql/issues/4419
+		SSL_CTX_set_verify(GloVars.global.ssl_ctx, SSL_VERIFY_NONE, callback_ssl_verify_peer);
 	}
 	X509_free(x509);
 	EVP_PKEY_free(pkey);
In other words, do not perform any peer verification at all (https://www.openssl.org/docs/man1.0.2/man3/SSL_CTX_set_verify.html).
This is currently benign because of how callback_ssl_verify_peer() is implemented: https://github.com/sysown/proxysql/blob/1f78c9e7f01f1b30afd50b13cdea9db7310a119c/src/proxy_tls.cpp#L70
int callback_ssl_verify_peer(int ok, X509_STORE_CTX* ctx) {
	// for now only return 1
	return 1;
}
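To make the effect of the mode change concrete, here is a minimal sketch of how the verify flags behave in server mode, based on the OpenSSL documentation for SSL_CTX_set_verify. The constants mirror the values of the SSL_VERIFY_* macros so the snippet compiles without OpenSSL; the `k`-prefixed names and the two helper functions are hypothetical and exist only for illustration, they are not part of ProxySQL or OpenSSL.

```cpp
// Values mirror the SSL_VERIFY_* macros from <openssl/ssl.h>.
// They are redefined here (assumption: names are illustrative only)
// so the sketch builds without linking OpenSSL.
constexpr int kVerifyNone             = 0x00;  // SSL_VERIFY_NONE
constexpr int kVerifyPeer             = 0x01;  // SSL_VERIFY_PEER
constexpr int kVerifyFailIfNoPeerCert = 0x02;  // SSL_VERIFY_FAIL_IF_NO_PEER_CERT
constexpr int kVerifyClientOnce       = 0x04;  // SSL_VERIFY_CLIENT_ONCE

// Hypothetical helper: in server mode, a client certificate request is
// sent during the handshake only when the PEER bit is set.
constexpr bool server_requests_client_cert(int mode) {
    return (mode & kVerifyPeer) != 0;
}

// Hypothetical helper: a client that sends no certificate is rejected
// only when both PEER and FAIL_IF_NO_PEER_CERT are set.
constexpr bool handshake_fails_without_client_cert(int mode) {
    return (mode & kVerifyPeer) != 0 && (mode & kVerifyFailIfNoPeerCert) != 0;
}

// Mode passed to SSL_CTX_set_verify() before and after the patch.
constexpr int kModeBeforePatch = kVerifyPeer | kVerifyClientOnce;
constexpr int kModeAfterPatch  = kVerifyNone;

// Before: the server asks for a client certificate but tolerates its absence.
static_assert(server_requests_client_cert(kModeBeforePatch), "request sent");
static_assert(!handshake_fails_without_client_cert(kModeBeforePatch), "cert optional");
// After: no client certificate request is sent at all.
static_assert(!server_requests_client_cert(kModeAfterPatch), "no request sent");
```

If this model is right, the old mode is what makes the Windows client log "The remote server has requested TLS client authentication", and with SSL_VERIFY_NONE the request is never sent, so that client-side code path is avoided entirely.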
This fix has withstood 48 hours of soak-testing under significant load (Ubuntu 20.04 and Ubuntu 22.04).
I'll raise a PR
Error "Received an unexpected EOF or 0 bytes from the transport stream" when using ProxySQL with .NET. I've managed to reproduce the bug when there's a large number of async connections. We identified this by running more integration tests in parallel, along with some load testing.
ProxySQL version 2.5.5 using your docker image: proxysql/proxysql:2.5.5.
The uploaded zip file has 3 tests, all of which run a "select * from table" query 25,000 times using the async/TPL features of .NET. It does not have to be 25,000; I've seen it occur with 1,000 or 5,000, but because the failure is intermittent, it shows up more reliably at the higher count.
The zip also contains:
Thank you! ProxySqlBug.zip