noxxi / p5-io-socket-ssl

IO::Socket::SSL Perl Module
36 stars 60 forks source link

Version 2.068, 2.069, 2.070: t/sni_verify.t seems to hang on Strawberry Perl 5.30.2 64 bit #97

Closed twata1 closed 1 year ago

twata1 commented 3 years ago

Hello,

It seems to me that t/sni_verify.t hangs as follows.

Windows 8.1, Strawberry Perl 5.30.2 (64bit)

C:\home\sunlight2\IO-Socket-SSL-2.070>perl Makefile.PL
Should I do external tests?
These test will detect if there are network problems and fail soft,
so please disable them only if you definitely don't want to have any
network traffic to external sites.  [Y/n] Y
Checking if your kit is complete...
Looks good
Generating a gmake-style Makefile
Writing Makefile for IO::Socket::SSL
Writing MYMETA.yml and MYMETA.json

C:\home\sunlight2\IO-Socket-SSL-2.070>gmake
cp lib/IO/Socket/SSL/Utils.pm blib\lib\IO\Socket\SSL\Utils.pm
cp lib/IO/Socket/SSL.pm blib\lib\IO\Socket\SSL.pm
cp lib/IO/Socket/SSL.pod blib\lib\IO\Socket\SSL.pod
cp lib/IO/Socket/SSL/Intercept.pm blib\lib\IO\Socket\SSL\Intercept.pm
cp lib/IO/Socket/SSL/PublicSuffix.pm blib\lib\IO\Socket\SSL\PublicSuffix.pm

C:\home\sunlight2\IO-Socket-SSL-2.070>gmake test
"C:\Strawberry\perl\bin\perl.exe" "-MExtUtils::Command::MM" "-MTest::Harness" "-e" "undef *Test::Harness::Switches; test_harness(0, 'blib\lib', 'blib\arch')" t/*.t t/external/*.t
t/01loadmodule.t .................. 1/3 # openssl version compiled=0x1010104f linked=0x1010104f -- OpenSSL 1.1.1d  10 Sep 2019
# Net::SSLeay version=1.90
# parent IO::Socket::IP version=0.39
t/01loadmodule.t .................. ok
t/acceptSSL-timeout.t ............. ok
t/alpn.t .......................... ok
t/auto_verify_hostname.t .......... ok
t/cert_formats.t .................. ok
t/cert_no_file.t .................. ok
t/compatibility.t ................. ok
t/connectSSL-timeout.t ............ ok
t/core.t .......................... ok
t/dhe.t ........................... ok
t/ecdhe.t ......................... ok
t/external/ocsp.t ................. # tcp connect to www.chksum.de:443 ok
# fingerprint matches
# validation with default CA w/o OCSP ok
# got stapled response as expected
# validation with default CA with OCSP defaults ok
# validation with default CA with OCSP full chain ok
t/external/ocsp.t ................. 1/3 # tcp connect to www.bild.de:443 ok
# tcp connect to revoked.grc.com:443 ok
t/external/ocsp.t ................. ok
t/external/usable_ca.t ............ # found 138 CA certs
# have root CA for www.bild.de in store
# 5 connections to www.bild.de ok
t/external/usable_ca.t ............ 1/21 # have root CA for www.yahoo.com in store
# 5 connections to www.yahoo.com ok
# have root CA for www.comdirect.de in store
# 5 connections to www.comdirect.de ok
t/external/usable_ca.t ............ 7/21 # have root CA for meine.deutsche-bank.de in store
# 5 connections to meine.deutsche-bank.de ok
t/external/usable_ca.t ............ 10/21 # have root CA for www.twitter.com in store
# 5 connections to www.twitter.com ok
# have root CA for www.facebook.com in store
# 5 connections to www.facebook.com ok
# have root CA for www.live.com in store
# 5 connections to www.live.com ok
t/external/usable_ca.t ............ ok
t/io-socket-inet6.t ............... ok
t/io-socket-ip.t .................. ok
t/memleak_bad_handshake.t ......... skipped: ps not implemented on this platform
t/mitm.t .......................... ok
t/multiple-cert-rsa-ecc.t ......... ok
t/nonblock.t ...................... ok
t/npn.t ........................... ok
t/plain_upgrade_downgrade.t ....... # -- test: newINET start_SSL stop_SSL start_SSL
# server accepted new client
# wait for initial data from client
# got 0x666f6f from client
# server: got plain data at start of connection
# server: TLS upgrade
t/plain_upgrade_downgrade.t ....... 1/15 # server: TLS downgrade
# server: TLS upgrade#2
# -- test: newSSL stop_SSL connect_SSL
# server accepted new client
# wait for initial data from client
# got 0x160301 from client
# server: TLS upgrade
# server: TLS downgrade
# server: TLS upgrade#2
# -- test: newSSL:0 connect_SSL stop_SSL connect_SSL
# server accepted new client
# wait for initial data from client
# got 0x666f6f from client
# server: got plain data at start of connection
# server: TLS upgrade
# server: TLS downgrade
# server: TLS upgrade#2
# -- test: newSSL:0 start_SSL stop_SSL connect_SSL
# server accepted new client
# wait for initial data from client
# got 0x666f6f from client
# server: got plain data at start of connection
# server: TLS upgrade
# server: TLS downgrade
# server: TLS upgrade#2
# server accepted new client
# wait for initial data from client
# got 0x656e64 from client
# client requested end of tests
t/plain_upgrade_downgrade.t ....... ok
t/protocol_version.t .............. 1/? # looks like OpenSSL was compiled without SSLv3 support
t/protocol_version.t .............. ok
t/public_suffix_lib_encode_idn.t .. ok
t/public_suffix_lib_libidn.t ...... ok
t/public_suffix_lib_uri.t ......... ok
t/public_suffix_ssl.t ............. ok
t/readline.t ...................... ok
t/session_cache.t ................. ok
t/session_ticket.t ................ # listen at 127.0.0.1:50556
# listen at 127.0.0.1:50557
# connect to 0: success reuse=0 version=TLSv1_3
t/session_ticket.t ................ 1/6 # connect to 0: success reuse=1 version=TLSv1_3
# connect to 1: success reuse=1 version=TLSv1_3
# connect to 1: success reuse=0 version=TLSv1_3
# connect to 0: success reuse=0 version=TLSv1_3
# connect to 0: success reuse=1 version=TLSv1_3
t/session_ticket.t ................ ok
t/sessions.t ...................... ok
t/set_curves.t .................... ok
t/signal-readline.t ............... skipped: signals not relevant on this platform
t/sni.t ........................... ok
t/sni_verify.t .................... 1/17
(hangs)

Thank you,

noxxi commented 3 years ago

This is an issue I get only on Windows systems and I'm not sure what causes the problem. I have theories about differences between the fork-emulation in Perl on Windows vs. real fork on other platforms, but also about antivirus (which is also specific to Windows) inferring with the tests. Can you therefore please verify that

If both of these is the case (i.e. problem reproducible and not a side effect of antivirus) please run this specific tests with perl -Mblib t/sni_verify.t to find out a) if it still hangs when called this way and b) where exactly it hangs.

twata1 commented 3 years ago

Thank you for your reply.

Today, I confirmed version 2.070 on Strawberry Perl 5.30.2 64bit under Windows 8.1.

It seems to me that perl -Mblib t/sni_verify.t does not hang.

C:\home\sunlight2\IO-Socket-SSL-2.070>perl -Mblib t/sni_verify.t
1..17
ok # Server Initialization
ok # client ssl connect server.local
ok # client verify hostname in cert server.local
ok # server accept
ok # server got SNI name server.local
ok # client ssl connect server2.local
ok # client verify hostname in cert server2.local
ok # server accept
ok # server got SNI name server2.local
ok # client ssl connect smtp.mydomain.local
ok # client verify hostname in cert smtp.mydomain.local
ok # server accept
ok # server got SNI name smtp.mydomain.local
ok # client ssl connect www13.other.local
ok # client verify hostname in cert www13.other.local
ok # server accept
ok # server got SNI name www13.other.local

C:\home\sunlight2\IO-Socket-SSL-2.070>

But prove -lv t/sni_verify.t seems to hang.

C:\home\sunlight2\IO-Socket-SSL-2.070>prove -lv t/sni_verify.t
t/sni_verify.t ..
1..17
ok # Server Initialization
(hangs)

So I just thought there was something wrong with the gmake test and prove commands.

Thank you,

bckrk commented 3 years ago

I have the similar issue with v2.071 on Strawberry Perl 5.32.1 64bit. The t/sni_verify.t test hangs and does not complete.

Is there any solution to this ?

twata1 commented 3 years ago

I have just made a pull request to resolve this issue (#105).

h-vn commented 2 years ago

This looks a bit similar to failures with older Perls with Net::SSLeay tests that use forks and then do socket calls.

https://github.com/radiator-software/p5-net-ssleay/issues/356#issuecomment-1025741806

In short: https://perldoc.perl.org/5.20.0/perldelta

On Windows, perl no longer calls CloseHandle() on a socket handle. This makes debugging easier on Windows by removing certain irrelevant bad handle exceptions. It also fixes a race condition that made socket functions randomly fail in a Perl process with multiple OS threads, and possible test failures in dist/IO/t/cachepropagate-tcp.t. [perl #120091/118059]

For issue 118059, see https://github.com/Perl/perl5/issues/12979

In Net::SSLeay case sleep(1) seems to help.

jddurand commented 2 years ago

Using Net::SSLeay 1.93_01 and openssl 3.0.5 I do not have any hang but almost systematic failure with t/sni_verify.t.

After:

    for my $host (@tests) {
        my $client = IO::Socket::SSL->new(...);
        # etc
        print "Doing \$client = undef\n";
        $client = undef;
    }

It seems the server is trying to send packets, like "new session" protocol to the client, after it has successfuly connected (I guess this can happen reading https://www.openssl.org/docs/manmaster/man3/SSL_get_session.htm).

But if the client runs out of scope too fast in t/sni_verify.t, which is anticipated explicitly with $client = undef, the server's accept will fail, c.f. this debug log where I have added # Client and # Server annotations at the very end:

1..17
DEBUG: lib/IO/Socket/SSL.pm:2969: new ctx 20027384
ok # Server Initialization
DEBUG: lib/IO/Socket/SSL.pm:973: no socket yet
DEBUG: lib/IO/Socket/SSL.pm:2969: new ctx 65480352
DEBUG: lib/IO/Socket/SSL.pm:704: socket not yet connected
DEBUG: lib/IO/Socket/SSL.pm:706: socket connected
DEBUG: lib/IO/Socket/SSL.pm:729: ssl handshake not started
DEBUG: lib/IO/Socket/SSL.pm:975: accept created normal socket IO::Socket::SSL=GLOB(0x3408d6c)
DEBUG: lib/IO/Socket/SSL.pm:771: using SNI with hostname server.local
DEBUG: lib/IO/Socket/SSL.pm:1003: starting sslifying
DEBUG: lib/IO/Socket/SSL.pm:806: request OCSP stapling
DEBUG: lib/IO/Socket/SSL.pm:836: call Net::SSLeay::connect
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.1 (OUT), TLS header, Certificate Status (22)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (OUT), TLS handshake, Client hello (1)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.1 (IN), TLS header, Certificate Status (22)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (IN), TLS handshake, Client hello (1)
DEBUG: lib/IO/Socket/SSL.pm:2953: set context from servername server.local
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (OUT), TLS header, Certificate Status (22)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (OUT), TLS handshake, Server hello (2)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (OUT), TLS header, Finished (20)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (OUT), TLS header, Supplemental data (23)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (OUT), TLS handshake, Encrypted Extensions (8)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (OUT), TLS header, Supplemental data (23)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (OUT), TLS handshake, Request CERT (13)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (OUT), TLS header, Supplemental data (23)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (OUT), TLS handshake, Certificate (11)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (OUT), TLS header, Supplemental data (23)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (OUT), TLS handshake, CERT verify (15)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (OUT), TLS header, Supplemental data (23)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (OUT), TLS handshake, Finished (20)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (IN), TLS header, Certificate Status (22)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (IN), TLS handshake, Server hello (2)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (IN), TLS header, Finished (20)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (IN), TLS header, Supplemental data (23)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (IN), TLS header, Supplemental data (23)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (IN), TLS handshake, Request CERT (13)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (IN), TLS header, Supplemental data (23)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (IN), TLS handshake, Certificate (11)
DEBUG: lib/IO/Socket/SSL.pm:2813: ok=1 [1] /CN=IO::Socket::SSL Demo CA/CN=IO::Socket::SSL Demo CA
DEBUG: lib/IO/Socket/SSL.pm:2813: ok=1 [0] /CN=IO::Socket::SSL Demo CA/CN=server.local
DEBUG: lib/IO/Socket/SSL.pm:1786: scheme=default cert=63258200
DEBUG: lib/IO/Socket/SSL.pm:1796: identity=server.local cn=server.local alt=
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (IN), TLS header, Supplemental data (23)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (IN), TLS handshake, CERT verify (15)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (IN), TLS header, Supplemental data (23)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (IN), TLS handshake, Finished (20)
DEBUG: lib/IO/Socket/SSL.pm:2860: did not get stapled OCSP response
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (OUT), TLS header, Finished (20)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (OUT), TLS header, Supplemental data (23)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (OUT), TLS handshake, Certificate (11)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (OUT), TLS header, Supplemental data (23)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (OUT), TLS handshake, CERT verify (15)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (OUT), TLS header, Supplemental data (23)
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (OUT), TLS handshake, Finished (20)
DEBUG: lib/IO/Socket/SSL.pm:839: done Net::SSLeay::connect -> 1
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (IN), TLS header, Finished (20)
DEBUG: lib/IO/Socket/SSL.pm:894: ssl handshake done
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (IN), TLS header, Supplemental data (23)
ok # client ssl connect server.local
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (IN), TLS handshake, Certificate (11)
DEBUG: lib/IO/Socket/SSL.pm:1786: scheme=http cert=63258200
DEBUG: lib/IO/Socket/SSL.pm:1796: identity=server.local cn=server.local alt=
ok # client verify hostname in cert server.local
Doing $client = undef
DEBUG: lib/IO/Socket/SSL.pm:3018: free ctx 65480352 open=65480352   # Client
DEBUG: lib/IO/Socket/SSL.pm:3022: free ctx 65480352 callback # Client
DEBUG: lib/IO/Socket/SSL.pm:3029: OK free ctx 65480352 # Client
DEBUG: lib/IO/Socket/SSL.pm:3018: free ctx 20027384 open= # Client
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (IN), TLS header, Supplemental data (23) # Server
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (IN), TLS handshake, CERT verify (15) # Server
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (IN), TLS header, Supplemental data (23) # Server
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (IN), TLS handshake, Finished (20) # Server
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (OUT), TLS header, Supplemental data (23) # Server
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (OUT), TLS handshake, Newsession Ticket (4) # Server
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.2 (OUT), TLS header, Supplemental data (23) # Server
DEBUG: lib/IO/Socket/SSL.pm:3584: * TLSv1.3 (OUT), TLS handshake, Newsession Ticket (4) # Server
DEBUG: lib/IO/Socket/SSL.pm:1051: Net::SSLeay::accept -> -1 # Server
DEBUG: lib/IO/Socket/SSL.pm:1054: local error: SSL accept attempt failed # Server
not ok # server accept - SSL accept attempt failed # Server
ok # skip accept failed
DEBUG: lib/IO/Socket/SSL.pm:3018: free ctx 20027384 open=55351576 19735448 20027384 55380128 # Server
DEBUG: lib/IO/Socket/SSL.pm:3022: free ctx 20027384 callback # Server
DEBUG: lib/IO/Socket/SSL.pm:3029: OK free ctx 20027384 # Server

Trying with the 'sleep' to prevent $client running out of scope works, this is is a way to make the client life longer enough so that server can send the other packets, but this is not very nice ;) I guess there is a nicer way to prevent this race condition (?)

Hoping my analysis is correct (I am not ssl expert).

noxxi commented 1 year ago

Hoping my analysis is correct (I am not ssl expert).

Nice analysis. You tracing feature is really useful here. Could you please try the following patch to t/sni_verify.t which tries to deal with the race by making the client wait for server application data?

diff --git a/t/sni_verify.t b/t/sni_verify.t
index 81841e2..b6923cf 100644
--- a/t/sni_verify.t
+++ b/t/sni_verify.t
@@ -75,6 +75,8 @@ if ( $pid == 0 ) {
            print "not ok # client ssl connect $host - $SSL_ERROR\n";
            print "ok # skip connect failed\n";
        }
+       # wait for server to send something to make sure finished accept
+       <$client>;
     }
     exit;
 }
@@ -86,6 +88,7 @@ for my $host (@tests) {
        my $name = $csock->get_servername;
        print "not " if ! $name or $name ne $host;
        print "ok # server got SNI name $host\n";
+       print $csock "hi\n";
     } else {
        print "not ok # server accept - $SSL_ERROR\n";
        print "ok # skip accept failed\n";
jddurand commented 1 year ago

You got it, many thanks - t/sni_verify.t always succeed with this patch ;)

jddurand commented 1 year ago

If you don't mind I suggest to do the same in other tests that are subject to scope issue (just hitted once an error in sni.t for instance that I fail to reproduce).

jddurand commented 1 year ago

Hello,

Please find a patch against version 2.076 for these three cases. IO-Socket-SSL-2.076.txt