nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org

sequential/test-tls-psk-client fails on IBM i #44821

Open richardlau opened 2 years ago

richardlau commented 2 years ago

With https://github.com/nodejs/node/pull/44215 we're down to one test that is failing the IBM i CI build:

https://ci.nodejs.org/job/node-test-commit-ibmi/862/nodes=ibmi73-ppc64/console

10:21:56 not ok 3714 sequential/test-tls-psk-client # TODO : Fix flaky test
10:26:56   ---
10:26:56   duration_ms: 300.226
10:26:56   severity: fail
10:26:56   exitcode: -15
10:26:56   stack: |-
10:26:56     timeout
10:26:56     Failed: Timed out
10:26:56   ...

This test is already marked flaky on IBM i https://github.com/nodejs/node/blob/d7f193434ab699297814879835846cd4440e25ee/test/sequential/sequential.status#L30-L31 but it looks like the test runner treats timed out tests as failures (severity: fail) instead of flakes (severity: flaky).

Originally posted by @richardlau in https://github.com/nodejs/node/issues/43509#issuecomment-1215056696

richardlau commented 2 years ago

Turns out that this test is failing because the spawn of the CLI openssl tool fails, although that isn't caught/reported: https://github.com/nodejs/node/blob/5118c31a3b6f1d1759197db5293f159d3958139e/test/sequential/test-tls-psk-client.js#L18-L26
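One way such a failure could be surfaced instead of silently ignored is sketched below. This is only an illustration, not the code in the linked test: `spawnServer` and its `onFailure` callback are hypothetical names, and the choice to treat any non-zero exit code as a startup failure is an assumption.

```javascript
'use strict';
const { spawn } = require('node:child_process');

// Hypothetical helper (not part of the Node.js test suite): start a server
// process and call onFailure(err) at most once if it cannot be spawned or
// exits with a non-zero code.
function spawnServer(command, args, onFailure) {
  const child = spawn(command, args, { stdio: ['ignore', 'pipe', 'pipe'] });
  let stderr = '';
  let reported = false;

  const report = (message) => {
    if (reported) return;
    reported = true;
    onFailure(new Error(message));
  };

  // Keep whatever the tool prints to stderr so it can appear in the error.
  child.stderr.setEncoding('utf8');
  child.stderr.on('data', (chunk) => { stderr += chunk; });
  // Drain stdout so that a full pipe cannot block the child.
  child.stdout.resume();

  // 'error' fires when the process could not be spawned at all.
  child.once('error', (err) => {
    report(`failed to spawn ${command}: ${err.message}`);
  });

  // 'close' fires after the process has ended and its stdio streams closed.
  // A null code means the child was killed by a signal (for example during
  // test cleanup), which this sketch does not treat as a failure.
  child.once('close', (code) => {
    if (code !== 0 && code !== null) {
      report(`${command} exited with code ${code}: ${stderr.trim()}`);
    }
  });

  return child;
}
```

With something like this in place, a test could fail immediately with the captured stderr text rather than waiting for the test runner's timeout.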

If I manually run the openssl-cli command that is being spawned I get this:

Using default temp DH parameters
00000001:error:8000002A:system library:BIO_listen:Protocol driver not attached:../deps/openssl/openssl/crypto/bio/bio_sock2.c:275:calling setsockopt()
00000001:error:10000088:BIO routines:BIO_listen:listen v6 only:../deps/openssl/openssl/crypto/bio/bio_sock2.c:277:
   0 items in the session cache
   0 client connects (SSL_connect())
   0 client renegotiates (SSL_connect())
   0 client connects that finished
   0 server accepts (SSL_accept())
   0 server renegotiates (SSL_accept())
   0 server accepts that finished
   0 session cache hits
   0 session cache misses
   0 session cache timeouts
   0 callback cache hits
   0 cache full overflows (128 allowed)

However using the system OpenSSL CLI gives:

Using default temp DH parameters
ACCEPT

Which leads to:

  1. Do we expect the built openssl-cli to work on IBM i? Possibly not going by previous comments, e.g. https://github.com/nodejs/node/pull/31967#issuecomment-591482684 which state that the IBM i port of Node.js always uses shared libs. Our CI job currently does not use shared libs which is a potential mismatch.
  2. The test should ideally detect that the openssl server hasn't started and report an error. Currently it appears never to terminate in this state: waitForPort keeps spinning, waiting for the server to come up, until the test runner times the test out and kills it.
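For the second point, a bounded wait could look roughly like the sketch below. It is not the existing waitForPort implementation (which is not shown in this thread); `waitForPortWithDeadline`, the 100 ms retry interval, and the use of 127.0.0.1 are all assumptions made for illustration.

```javascript
'use strict';
const net = require('node:net');

// Hypothetical bounded variant of a "wait for port" helper: retry connecting
// until the port accepts a connection or timeoutMs has elapsed, then call
// callback(null) on success or callback(err) once the deadline has passed.
function waitForPortWithDeadline(port, timeoutMs, callback) {
  const deadline = Date.now() + timeoutMs;

  function attempt() {
    const socket = net.connect({ port, host: '127.0.0.1' });

    socket.once('connect', () => {
      // The server is up; the probe connection itself is not needed.
      socket.destroy();
      callback(null);
    });

    socket.once('error', (err) => {
      socket.destroy();
      if (Date.now() >= deadline) {
        callback(new Error(
          `nothing listening on port ${port} after ${timeoutMs} ms: ${err.message}`));
      } else {
        // Assumed retry interval; tune for the platform under test.
        setTimeout(attempt, 100);
      }
    });
  }

  attempt();
}
```

A deadline shorter than the test runner's own timeout lets the test fail with a specific message instead of being killed.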

cc @nodejs/platform-ibmi

V-for-Vasili commented 2 years ago

I know that in the node rpm we build with --shared-openssl, so we use the system install. Looks like we use built openssl in https://ci.nodejs.org/job/node-test-commit-ibmi/913/nodes=ibmi73-ppc64/consoleText. Also, what version is the built openssl-cli? I think 1.1.1 is the latest we support but I need to double check

richardlau commented 2 years ago

> Also, what version is the built openssl-cli? I think 1.1.1 is the latest we support but I need to double check

@V-for-Vasili It would be OpenSSL 3.0.x, built from deps/openssl. FWIW I've started a CI run with CONFIG_FLAGS='--shared-openssl --shared-openssl-includes=/QOpenSys/usr/include --shared-openssl-libpath=/QOpenSys/lib --dest-cpu=ppc64': https://ci.nodejs.org/job/node-test-commit-ibmi/915/

V-for-Vasili commented 2 years ago

Yep, we don't support 3.x on IBM i, and we use 1.1.1 with the node rpm. Support for 3.x is currently a work in progress.

richardlau commented 2 years ago

> FWIW I've started a CI run with CONFIG_FLAGS='--shared-openssl --shared-openssl-includes=/QOpenSys/usr/include --shared-openssl-libpath=/QOpenSys/lib --dest-cpu=ppc64': https://ci.nodejs.org/job/node-test-commit-ibmi/915/

Looks like that failed while linking 😞 .

richardlau commented 2 years ago

Further digging on test-iinthecloud-ibmi73-ppc64_be-1 showed that the system openssl command in /QOpenSys/bin/ was 1.0.2q. Installing openssl via yum installs OpenSSL 1.1.1n into /QOpenSys/pkgs/bin/ and that has the same "Protocol driver not attached" error.

Turns out we've seen this before: https://github.com/nodejs/node/issues/42152#issuecomment-1054850803

richardlau commented 2 years ago

FWIW, after https://github.com/nodejs/node/pull/44824 the test no longer times out but fails with the "Protocol driver not attached" error from the call to the openssl CLI (either the system one when compiled with --shared-openssl, or the one built from deps/openssl).

abmusse commented 1 year ago

https://github.com/nodejs/node/issues/42152#issuecomment-1459129701