noxxi / p5-io-socket-ssl

IO::Socket::SSL Perl Module
36 stars 59 forks source link

Test set_curves.t appears to be sensitive to timing #113

Closed h-vn closed 2 years ago

h-vn commented 2 years ago

Relates to this failure: http://www.cpantesters.org/cpan/report/6f7978da-74ed-11ec-a288-2c8921f47948

I did some further testing on macOS 10.15.7 to see if I can reproduce it. What test set_curves.t does seems to be an on/off thing: I'd expect it to always fail or always succeed:

Perl is a perlbrew 5.12.5 and this is what 01loadmodule.t prints:

% prove -lbv t/01loadmodule.t 
t/01loadmodule.t .. 
1..3
ok 1 - loaded
# openssl version compiled=0x30000010 linked=0x30000010 -- OpenSSL 3.0.1 14 Dec 2021
# Net::SSLeay version=1.92
# parent IO::Socket::IP version=0.41
ok 2 - IO::Socket::SSL::DEBUG 1
ok 3 - Net::SSLeay::trace 1
ok

To reproduce I put load on the host by starting a number of cat /dev/zero > /dev/null instances and then running just set_curves.t in a loop while grepping 'not ok' from the output. After running it a while, maybe a minute or even less, I got the following:

not ok # expect success P-384:P-521: SSL accept attempt failed
not ok # expect success P-384:P-521: SSL accept attempt failed
not ok # expect success P-384:P-521: SSL accept attempt failed
not ok # expect success P-384:P-521: SSL accept attempt failed
not ok # expect success P-384: SSL accept attempt failed
not ok # expect success P-384: SSL accept attempt failed
not ok # expect success P-384: SSL accept attempt failed
not ok # expect success P-256:P-384:P-521: SSL accept attempt failed

My guess is that the forked processes race so that one of them is not ready when the other expects it to be.

Update: happens with perlbrew Perl 5.34.0 too. Maybe not as fast, but every once and while and with the same errors.

noxxi commented 2 years ago

Thanks for the report. Based on how the test is designed I don't think there is a race with process forking: the server socket is setup and listening before any process gets forked, the server socket stays in the parent process and only the client is forked. It is checked that the fork works so the client should be running.

Could you please run the test with perl -MIO::Socket::SSL=debug4 t/set_curves.t so that it shows more debugging information on what is happening at the TLS level? Also it would be useful to have some timing about the failures, i.e. does the test fail because of some timeouts (resource shortage, scheduler problems?) or immediately because of some other errors.

karenetheridge commented 2 years ago

I can't get the test to fail again in the same environment, but:

: [ether@jaeger 1642193867.97809/IO-Socket-SSL-2.074]$; which perl
/Volumes/amaretto/Users/ether/perl5/perlbrew/perls/12.5/bin/perl
: [ether@jaeger 1642193867.97809/IO-Socket-SSL-2.074]$; perl -Mblib -MIO::Socket::SSL -wle'print $INC{"IO/Socket/SSL.pm"}'
/Volumes/amaretto/Users/ether/.cpanm/work/1642193867.97809/IO-Socket-SSL-2.074/blib/lib/IO/Socket/SSL.pm
: [ether@jaeger 1642193867.97809/IO-Socket-SSL-2.074]$; perl -Mblib -MIO::Socket::SSL=debug4 t/set_curves.t
1..6
DEBUG: .../IO/Socket/SSL.pm:2961: new ctx 140362962215424
ok # Server Initialization
DEBUG: .../IO/Socket/SSL.pm:973: no socket yet
DEBUG: .../IO/Socket/SSL.pm:2961: new ctx 140362911959552
DEBUG: .../IO/Socket/SSL.pm:704: socket not yet connected
DEBUG: .../IO/Socket/SSL.pm:706: socket connected
DEBUG: .../IO/Socket/SSL.pm:729: ssl handshake not started
DEBUG: .../IO/Socket/SSL.pm:975: accept created normal socket IO::Socket::SSL=GLOB(0x7fa8cb8bacc8)
DEBUG: .../IO/Socket/SSL.pm:1003: starting sslifying
DEBUG: .../IO/Socket/SSL.pm:774: not using SNI because hostname is unknown
DEBUG: .../IO/Socket/SSL.pm:806: request OCSP stapling
DEBUG: .../IO/Socket/SSL.pm:836: call Net::SSLeay::connect
DEBUG: .../IO/Socket/SSL.pm:2805: ok=1 [1] /CN=IO::Socket::SSL Demo CA/CN=IO::Socket::SSL Demo CA
DEBUG: .../IO/Socket/SSL.pm:2805: ok=1 [0] /CN=IO::Socket::SSL Demo CA/CN=server.local
DEBUG: .../IO/Socket/SSL.pm:2852: did not get stapled OCSP response
DEBUG: .../IO/Socket/SSL.pm:839: done Net::SSLeay::connect -> 1
DEBUG: .../IO/Socket/SSL.pm:894: ssl handshake done
DEBUG: .../IO/Socket/SSL.pm:3010: free ctx 140362911959552 open=
DEBUG: .../IO/Socket/SSL.pm:3014: free ctx 140362911959552 callback
DEBUG: .../IO/Socket/SSL.pm:3021: OK free ctx 140362911959552
DEBUG: .../IO/Socket/SSL.pm:1051: Net::SSLeay::accept -> 1
DEBUG: .../IO/Socket/SSL.pm:1099: handshake done, socket ready
DEBUG: .../IO/Socket/SSL.pm:991: accept_SSL ok
ok # expect success P-521
DEBUG: .../IO/Socket/SSL.pm:973: no socket yet
DEBUG: .../IO/Socket/SSL.pm:2961: new ctx 140362920840192
DEBUG: .../IO/Socket/SSL.pm:704: socket not yet connected
DEBUG: .../IO/Socket/SSL.pm:706: socket connected
DEBUG: .../IO/Socket/SSL.pm:729: ssl handshake not started
DEBUG: .../IO/Socket/SSL.pm:774: not using SNI because hostname is unknown
DEBUG: .../IO/Socket/SSL.pm:975: accept created normal socket IO::Socket::SSL=GLOB(0x7fa8cb8bae78)
DEBUG: .../IO/Socket/SSL.pm:806: request OCSP stapling
DEBUG: .../IO/Socket/SSL.pm:1003: starting sslifying
DEBUG: .../IO/Socket/SSL.pm:836: call Net::SSLeay::connect
DEBUG: .../IO/Socket/SSL.pm:2805: ok=1 [1] /CN=IO::Socket::SSL Demo CA/CN=IO::Socket::SSL Demo CA
DEBUG: .../IO/Socket/SSL.pm:2805: ok=1 [0] /CN=IO::Socket::SSL Demo CA/CN=server.local
DEBUG: .../IO/Socket/SSL.pm:2852: did not get stapled OCSP response
DEBUG: .../IO/Socket/SSL.pm:839: done Net::SSLeay::connect -> 1
DEBUG: .../IO/Socket/SSL.pm:894: ssl handshake done
DEBUG: .../IO/Socket/SSL.pm:3010: free ctx 140362920840192 open=
DEBUG: .../IO/Socket/SSL.pm:3014: free ctx 140362920840192 callback
DEBUG: .../IO/Socket/SSL.pm:3021: OK free ctx 140362920840192
DEBUG: .../IO/Socket/SSL.pm:1051: Net::SSLeay::accept -> 1
DEBUG: .../IO/Socket/SSL.pm:1099: handshake done, socket ready
DEBUG: .../IO/Socket/SSL.pm:991: accept_SSL ok
ok # expect success P-384
DEBUG: .../IO/Socket/SSL.pm:973: no socket yet
DEBUG: .../IO/Socket/SSL.pm:2961: new ctx 140362920840192
DEBUG: .../IO/Socket/SSL.pm:704: socket not yet connected
DEBUG: .../IO/Socket/SSL.pm:706: socket connected
DEBUG: .../IO/Socket/SSL.pm:729: ssl handshake not started
DEBUG: .../IO/Socket/SSL.pm:975: accept created normal socket IO::Socket::SSL=GLOB(0x7fa8cb8bafb0)
DEBUG: .../IO/Socket/SSL.pm:1003: starting sslifying
DEBUG: .../IO/Socket/SSL.pm:774: not using SNI because hostname is unknown
DEBUG: .../IO/Socket/SSL.pm:806: request OCSP stapling
DEBUG: .../IO/Socket/SSL.pm:836: call Net::SSLeay::connect
DEBUG: .../IO/Socket/SSL.pm:1051: Net::SSLeay::accept -> -1
DEBUG: .../IO/Socket/SSL.pm:839: done Net::SSLeay::connect -> -1
DEBUG: .../IO/Socket/SSL.pm:1054: SSL accept attempt failed

DEBUG: .../IO/Socket/SSL.pm:842: SSL connect attempt failed

DEBUG: .../IO/Socket/SSL.pm:1054: local error: SSL accept attempt failed error:0A000065:SSL routines::no suitable key share
DEBUG: .../IO/Socket/SSL.pm:842: local error: SSL connect attempt failed error:0A000410:SSL routines::sslv3 alert handshake failure
DEBUG: .../IO/Socket/SSL.pm:845: fatal SSL error: SSL connect attempt failed error:0A000410:SSL routines::sslv3 alert handshake failure
ok # expect fail P-256: SSL accept attempt failed error:0A000065:SSL routines::no suitable key share
DEBUG: .../IO/Socket/SSL.pm:973: no socket yet
DEBUG: ...evel/IO/Socket.pm:48: ignoring less severe local error 'IO::Socket::IP configuration failed', keep 'SSL connect attempt failed error:0A000410:SSL routines::sslv3 alert handshake failure'
DEBUG: .../IO/Socket/SSL.pm:3010: free ctx 140362920840192 open=
DEBUG: .../IO/Socket/SSL.pm:3014: free ctx 140362920840192 callback
DEBUG: .../IO/Socket/SSL.pm:3021: OK free ctx 140362920840192
DEBUG: .../IO/Socket/SSL.pm:2961: new ctx 140362946136576
DEBUG: .../IO/Socket/SSL.pm:704: socket not yet connected
DEBUG: .../IO/Socket/SSL.pm:706: socket connected
DEBUG: .../IO/Socket/SSL.pm:729: ssl handshake not started
DEBUG: .../IO/Socket/SSL.pm:975: accept created normal socket IO::Socket::SSL=GLOB(0x7fa8cb8bb0b8)
DEBUG: .../IO/Socket/SSL.pm:1003: starting sslifying
DEBUG: .../IO/Socket/SSL.pm:774: not using SNI because hostname is unknown
DEBUG: .../IO/Socket/SSL.pm:806: request OCSP stapling
DEBUG: .../IO/Socket/SSL.pm:836: call Net::SSLeay::connect
DEBUG: .../IO/Socket/SSL.pm:2805: ok=1 [1] /CN=IO::Socket::SSL Demo CA/CN=IO::Socket::SSL Demo CA
DEBUG: .../IO/Socket/SSL.pm:2805: ok=1 [0] /CN=IO::Socket::SSL Demo CA/CN=server.local
DEBUG: .../IO/Socket/SSL.pm:2852: did not get stapled OCSP response
DEBUG: .../IO/Socket/SSL.pm:839: done Net::SSLeay::connect -> 1
DEBUG: .../IO/Socket/SSL.pm:894: ssl handshake done
DEBUG: .../IO/Socket/SSL.pm:3010: free ctx 140362946136576 open=
DEBUG: .../IO/Socket/SSL.pm:3014: free ctx 140362946136576 callback
DEBUG: .../IO/Socket/SSL.pm:3021: OK free ctx 140362946136576
DEBUG: .../IO/Socket/SSL.pm:1051: Net::SSLeay::accept -> 1
DEBUG: .../IO/Socket/SSL.pm:1099: handshake done, socket ready
DEBUG: .../IO/Socket/SSL.pm:991: accept_SSL ok
ok # expect success P-384:P-521
DEBUG: .../IO/Socket/SSL.pm:973: no socket yet
DEBUG: .../IO/Socket/SSL.pm:2961: new ctx 140362978990080
DEBUG: .../IO/Socket/SSL.pm:704: socket not yet connected
DEBUG: .../IO/Socket/SSL.pm:706: socket connected
DEBUG: .../IO/Socket/SSL.pm:729: ssl handshake not started
DEBUG: .../IO/Socket/SSL.pm:774: not using SNI because hostname is unknown
DEBUG: .../IO/Socket/SSL.pm:975: accept created normal socket IO::Socket::SSL=GLOB(0x7fa8ca08ad68)
DEBUG: .../IO/Socket/SSL.pm:806: request OCSP stapling
DEBUG: .../IO/Socket/SSL.pm:1003: starting sslifying
DEBUG: .../IO/Socket/SSL.pm:836: call Net::SSLeay::connect
DEBUG: .../IO/Socket/SSL.pm:2805: ok=1 [1] /CN=IO::Socket::SSL Demo CA/CN=IO::Socket::SSL Demo CA
DEBUG: .../IO/Socket/SSL.pm:2805: ok=1 [0] /CN=IO::Socket::SSL Demo CA/CN=server.local
DEBUG: .../IO/Socket/SSL.pm:2852: did not get stapled OCSP response
DEBUG: .../IO/Socket/SSL.pm:839: done Net::SSLeay::connect -> 1
DEBUG: .../IO/Socket/SSL.pm:894: ssl handshake done
DEBUG: .../IO/Socket/SSL.pm:3010: free ctx 140362978990080 open=
DEBUG: .../IO/Socket/SSL.pm:3014: free ctx 140362978990080 callback
DEBUG: .../IO/Socket/SSL.pm:3021: OK free ctx 140362978990080
DEBUG: .../IO/Socket/SSL.pm:1051: Net::SSLeay::accept -> 1
DEBUG: .../IO/Socket/SSL.pm:1099: handshake done, socket ready
DEBUG: .../IO/Socket/SSL.pm:991: accept_SSL ok
ok # expect success P-256:P-384:P-521
DEBUG: .../IO/Socket/SSL.pm:3010: free ctx 140362962215424 open=
DEBUG: .../IO/Socket/SSL.pm:3021: OK free ctx 140362962215424
DEBUG: .../IO/Socket/SSL.pm:3010: free ctx 140362962215424 open=
DEBUG: .../IO/Socket/SSL.pm:3021: OK free ctx 140362962215424
h-vn commented 2 years ago

I've attached two files. Both are done on a Mac, again macOS 10.15.7 but a different machine, with perlbrew Perl 5.32.0 and IO::Socket::SSL directly from git:

% git describe --tags
2.074-2-g55a8d7d
% perl -MIO::Socket::SSL=debug4 t/01loadmodule.t                              
1..3
ok 1 - loaded
# openssl version compiled=0x30000010 linked=0x30000010 -- OpenSSL 3.0.1 14 Dec 2021
# Net::SSLeay version=1.92
# parent IO::Socket::IP version=0.39
ok 2 - IO::Socket::SSL::DEBUG 1
ok 3 - Net::SSLeay::trace 1

To create the logs I run the following a number of times manual to catch a failure: perl -MIO::Socket::SSL=debug4 t/set_curves.t 2>&1 | tee log | grep 'not ok'. I then run it once more to get a successful run. I needed to run it a number of times to trigger a failure; about 30, now when I tried it again.

This machine is a current 27 inch iMac which did not have load generation or any type of resource shortage attempted. It was running almost idle, and it's also a lot more quicker than the one I used with my original report. In other words, while additional load may expose the problem quicker, it seems not to be needed.

log-failure.txt log-success.txt

noxxi commented 2 years ago

Thanks for the feedback. Based on the failure logs it looks like the client is closing the socket too fast, i.e. when the client has finished its part of the handshake (connect_SSL) successfully but the server did not finish yet the accept_SSL. This causes accept_SSL to fail. In 8f55de7 I now wait for the client to close the socket on its end before closing the socket in the client. Hopefully this solves the problem.

h-vn commented 2 years ago

The problem seems to be solved. I did a git pull, updated the test command to this: % (while : ; do perl -MIO::Socket::SSL=debug4 t/set_curves.t 2>&1 ; done ) | grep 'not ok', and let it run for about half an hour. No failures were seen.

Looks like this case is closed.

noxxi commented 2 years ago

Thanks for your help.