nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org

OpenSSL 1.1.1 Support TODO (pls help) #18770

Closed rvagg closed 5 years ago

rvagg commented 6 years ago

@nodejs/crypto

OpenSSL 1.1.1-pre1 was released today. The headline item is TLS 1.3 (worth noting that the spec hasn't quite been finalised yet). This is obviously only a pre-release, not final and not supposed to be entirely bug free.

The OpenSSL team have said previously that 1.1.1 would be API and ABI compatible with 1.1.0. We currently have 1.1.0 support in Node so the theory goes that it shouldn't be too difficult an upgrade path. This is nice because it's possible (but not yet known) that 1.1.1 is the next LTS of OpenSSL, with 1.1.0 going EOL soon. 1.1.0 -> 1.1.1 or just straight to 1.1.1 might have to be our Node 10 strategy (I'm outlining that case here).

So, getting as close to 1.1.1 support as possible even while it's pre-release would be very valuable for us. Maintaining 1.0.2 and 1.1.0 support in the meantime is preferable (perhaps essential thanks to distribution dependencies). There will be a time, after 1.0.2 EOL next year, that we can ditch all the cruft but for now if we can do all 3 then that's what we should do.

Our CI tests 1.0.2 (obviously) and 1.0.2 dynamically linked. It also tests dynamic linking to 1.1.0 in Node 9+ (soon 8 too I think). See https://ci.nodejs.org/job/node-test-commit-linux-containered/ for this happening.

I've also added 1.1.1-pre1 to the same containers that are used to run these other dynamic-linked tests and I can turn that on as needed. For now it's too broken to turn on full-time, so this is the call to help fix that!

Node compiles just fine with 1.1.1-pre1 thanks to @davidben's most excellent work in #16130. But it currently fails 55 tests in CI (there may be at least one async-wrap flaky in there).

We need help figuring out whether these are things that we can fix on our end or whether they are upstream problems. If OpenSSL 1.1.1 isn't properly API compatible with 1.1.0 then I'd like us to push back on them to get them to stick to that commitment.

not ok 36 parallel/test-async-wrap-GH13045
not ok 953 parallel/test-https-agent-create-connection
not ok 957 parallel/test-https-agent-session-reuse
not ok 964 parallel/test-https-client-resume
not ok 969 parallel/test-https-agent-additional-options
not ok 1130 parallel/test-http2-https-fallback
not ok 1230 parallel/test-https-drain
not ok 1234 parallel/test-https-eof-for-eom
not ok 1519 parallel/test-tls-alpn-server-client
not ok 1533 parallel/test-tls-client-getephemeralkeyinfo
not ok 1534 parallel/test-tls-client-mindhsize
not ok 1535 parallel/test-tls-client-reject
not ok 1536 parallel/test-tls-addca
not ok 1537 parallel/test-tls-alert-handling
not ok 1539 parallel/test-tls-async-cb-after-socket-end
not ok 1542 parallel/test-tls-close-notify
not ok 1549 parallel/test-tls-connect-stream-writes
not ok 1553 parallel/test-tls-client-resume
not ok 1554 parallel/test-tls-disable-renegotiation
not ok 1555 parallel/test-tls-ecdh
not ok 1556 parallel/test-tls-ecdh-auto
not ok 1558 parallel/test-tls-ecdh-multiple
not ok 1562 parallel/test-tls-env-extra-ca
not ok 1566 parallel/test-tls-client-verify
not ok 1568 parallel/test-tls-getcipher
not ok 1569 parallel/test-tls-connect-given-socket
not ok 1576 parallel/test-tls-dhe
not ok 1577 parallel/test-tls-friendly-error-message
not ok 1583 parallel/test-tls-multi-key
not ok 1584 parallel/test-tls-multi-pfx
not ok 1585 parallel/test-tls-interleave
not ok 1586 parallel/test-tls-invoke-queued
not ok 1590 parallel/test-tls-npn-server-client
not ok 1591 parallel/test-tls-ocsp-callback
not ok 1592 parallel/test-tls-js-stream
not ok 1599 parallel/test-tls-peer-certificate-encoding
not ok 1600 parallel/test-tls-peer-certificate-multi-keys
not ok 1602 parallel/test-tls-net-connect-prefer-path
not ok 1607 parallel/test-tls-securepair-server
not ok 1608 parallel/test-tls-no-rsa-key
not ok 1611 parallel/test-tls-server-verify
not ok 1612 parallel/test-tls-on-empty-socket
not ok 1614 parallel/test-tls-set-ciphers
not ok 1615 parallel/test-tls-sni-option
not ok 1616 parallel/test-tls-sni-server-client
not ok 1618 parallel/test-tls-socket-constructor-alpn-npn-options-parsing
not ok 1619 parallel/test-tls-regr-gh-5108
not ok 1625 parallel/test-tls-ticket
not ok 1626 parallel/test-tls-ticket-cluster
not ok 1648 parallel/test-tls-server-connection-server
not ok 1871 async-hooks/test-tlswrap
not ok 1878 async-hooks/test-writewrap
not ok 1945 parallel/test-tls-set-encoding
not ok 1952 parallel/test-tls-socket-default-options
not ok 2012 sequential/test-benchmark-tls
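
A list this long is easier to triage once the failures are bucketed by area. The following is a minimal sketch, assuming the TAP lines have the `not ok <number> <suite>/<test-name>` shape shown above; the `groupFailures` helper and its rule of taking the first word after `test-` as the area are illustrative choices, not part of Node's test tooling.

```javascript
// Group failing TAP lines ("not ok <n> <suite>/<test-name>") by area.
// The grouping rule (first word after "test-") is an illustrative assumption.
function groupFailures(tapText) {
  const groups = new Map();
  for (const line of tapText.split('\n')) {
    const match = /^not ok \d+ ([\w-]+)\/(\S+)/.exec(line.trim());
    if (!match) continue; // skip "ok" lines, comments, and blanks
    const [, suite, name] = match;
    // "test-tls-ecdh-auto" -> "tls"; "test-tlswrap" -> "tlswrap"
    const area = name.replace(/^test-/, '').split('-')[0];
    const key = `${suite}/${area}`;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(name);
  }
  return groups;
}

// A few lines copied from the list above.
const sample = [
  'not ok 36 parallel/test-async-wrap-GH13045',
  'not ok 953 parallel/test-https-agent-create-connection',
  'not ok 1555 parallel/test-tls-ecdh',
  'not ok 1556 parallel/test-tls-ecdh-auto',
  'not ok 1871 async-hooks/test-tlswrap',
].join('\n');

for (const [key, names] of groupFailures(sample)) {
  console.log(`${key}: ${names.length}`);
}
```

Feeding the full CI output through the same function would show at a glance how much of the breakage sits under tls and https versus unrelated flaky tests.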

Full output is captured here https://gist.github.com/rvagg/cdead09ffa269453d728dcf9bc831d3d (it comes from here but that link is not going to survive).

sam-github commented 5 years ago

Took a shot at figuring out whether updating OpenSSL on v10.x will break anything.

Ran tests on the v10.x branch: https://ci.nodejs.org/job/node-test-commit/24242

Passed everything, except a build against openssl 1.1.0, where the TLS1_3_VERSION macro is missing, which I fixed in both "upgrade" branches. Failure here for the record.

With a v10.x that uses openssl1.1.1a, I tried to test backwards compat, but finding an npm module that:

  1. is native
  2. uses openssl
  3. has seen a release in the last year; and
  4. builds and passes its tests

... was not so easy. I'm happy to take suggestions. I searched for openssl and crypto keywords on npmjs.com, and also went through all the citgm modules tagged "native".

... I found a handful of others that didn't build or pass their tests, so I didn't include them.

sam-github commented 5 years ago

@MylesBorins :point_up_2:

sam-github commented 5 years ago

I also took a shot at searching github for openssl extension:.gyp language:JavaScript pushed:>2018-01-01, it didn't really work. I found nothing I didn't already find through npm. Maybe because it doesn't exist? If anyone knows a way to find all repos with a top-level binding.gyp, and that do #include.*openssl somewhere, that'd be really useful.
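
If the candidate repositories are already cloned locally, the check described above (a top-level binding.gyp plus an OpenSSL include somewhere in the sources) can be approximated offline. This is only a sketch under that assumption: `findOpenSSLAddons`, the list of source extensions, and the one-checkout-per-subdirectory layout are all hypothetical, and it does not solve the harder problem of discovering the repositories in the first place.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// File extensions treated as C/C++ sources (an illustrative choice).
const SOURCE_EXT = new Set(['.c', '.cc', '.cpp', '.cxx', '.h', '.hpp']);
// Matches lines such as: #include <openssl/ssl.h>
const OPENSSL_INCLUDE = /^\s*#\s*include\s*[<"]openssl\//m;

// Recursively check whether any C/C++ source under `dir` includes an OpenSSL header.
function includesOpenSSL(dir) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (entry.name === 'node_modules' || entry.name === '.git') continue;
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      if (includesOpenSSL(full)) return true;
    } else if (entry.isFile() && SOURCE_EXT.has(path.extname(entry.name))) {
      if (OPENSSL_INCLUDE.test(fs.readFileSync(full, 'utf8'))) return true;
    }
  }
  return false;
}

// `root` is a hypothetical directory holding one checkout per subdirectory.
function findOpenSSLAddons(root) {
  return fs.readdirSync(root, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => path.join(root, entry.name))
    .filter((repo) => fs.existsSync(path.join(repo, 'binding.gyp')))
    .filter((repo) => includesOpenSSL(repo));
}

// Optional command-line use: node find-addons.js <directory-of-checkouts>
const root = process.argv[2];
if (root) console.log(findOpenSSLAddons(root).join('\n'));
```

Skipping `node_modules` and `.git` keeps the scan focused on the add-on's own sources rather than vendored dependencies.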

kapouer commented 5 years ago

@sam-github uws and grpc both link against openssl. For uws, you can test on node 10 using @kapouer/uws.

sam-github commented 5 years ago

@kapouer I could use some help, what I found was:

noloader commented 5 years ago

@sam-github,

... but they always fail because of a missing deps/grpc/etc/roots.pem ...

Regarding the missing PEM file, the following may help:

echo "-----BEGIN CERTIFICATE-----
MIIDdTCCAl2gAwIBAgILBAAAAAABFUtaw5QwDQYJKoZIhvcNAQEFBQAwVzELMAkG
A1UEBhMCQkUxGTAXBgNVBAoTEEdsb2JhbFNpZ24gbnYtc2ExEDAOBgNVBAsTB1Jv
b3QgQ0ExGzAZBgNVBAMTEkdsb2JhbFNpZ24gUm9vdCBDQTAeFw05ODA5MDExMjAw
MDBaFw0yODAxMjgxMjAwMDBaMFcxCzAJBgNVBAYTAkJFMRkwFwYDVQQKExBHbG9i
YWxTaWduIG52LXNhMRAwDgYDVQQLEwdSb290IENBMRswGQYDVQQDExJHbG9iYWxT
aWduIFJvb3QgQ0EwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDaDuaZ
jc6j40+Kfvvxi4Mla+pIH/EqsLmVEQS98GPR4mdmzxzdzxtIK+6NiY6arymAZavp
xy0Sy6scTHAHoT0KMM0VjU/43dSMUBUc71DuxC73/OlS8pF94G3VNTCOXkNz8kHp
1Wrjsok6Vjk4bwY8iGlbKk3Fp1S4bInMm/k8yuX9ifUSPJJ4ltbcdG6TRGHRjcdG
snUOhugZitVtbNV4FpWi6cgKOOvyJBNPc1STE4U6G7weNLWLBYy5d4ux2x8gkasJ
U26Qzns3dLlwR5EiUWMWea6xrkEmCMgZK9FGqkjWZCrXgzT/LCrBbBlDSgeF59N8
9iFo7+ryUp9/k5DPAgMBAAGjQjBAMA4GA1UdDwEB/wQEAwIBBjAPBgNVHRMBAf8E
BTADAQH/MB0GA1UdDgQWBBRge2YaRQ2XyolQL30EzTSo//z9SzANBgkqhkiG9w0B
AQUFAAOCAQEA1nPnfE920I2/7LqivjTFKDK1fPxsnCwrvQmeU79rXqoRSLblCKOz
yj1hTdNGCbM+w6DjY1Ub8rrvrTnhQ7k4o+YviiY776BQVvnGCv04zcQLcFGUl5gE
38NflNUVyRRBnMRddWQVDf9VMOyGj/8N7yy5Y0b2qvzfvGn9LhJIZJrglfCm7ymP
AbEVtQwdpf5pLGkkeB6zpxxxYu7KyJesF12KwvhHhm4qxFYxldBniYUr+WymXUad
DKqC5JlR3XC321Y9YeRq4VzW9v493kHMB65jUr9TU/Qr6cf9tveCX4XSQRjbgbME
HMUfpIBvFSDJ3gyICh3WZlXi/EjJKSZp4A==
-----END CERTIFICATE-----" > globalsign-root-r1.pem

wget -q --ca-certificate=globalsign-root-r1.pem https://curl.haxx.se/ca/cacert.pem -O cacert.pem

mkdir -p deps/grpc/etc/
cp cacert.pem deps/grpc/etc/roots.pem

I can only say "may" because I don't know what is expected in roots.pem.

sam-github commented 5 years ago

Helps a bit, still doesn't build. I ran all the gulp targets, twice (in case there was an order dependence). Always fails with:

> grpc@1.16.1 build /home/sam/s/grpc-node/packages/grpc-native-core
> node-pre-gyp build

make: Entering directory '/home/sam/s/grpc-node/packages/grpc-native-core/build'
make: *** No rule to make target 'Release/obj.target/grpc/deps/grpc/src/core/lib/surface/init.o', needed by 'Release/obj.target/libgrpc.a'.  Stop.

If you can give me directions to test from a clean git clone, I will try again. You can do it, too, might be easier, just build my update_openssl1.1.a-v10.x branch.

sam-github commented 5 years ago

There don't seem to be any blockers, or activity, so at @mhdawson's suggestion, I have PRed the update to OpenSSL 1.1.1a: https://github.com/nodejs/node/pull/25381

Also, I have a branch on top of the basic openssl1.1.1a for TLSv1.2 (and below) where I continue to work through the test suites getting them to either pass when run with TLSv1.3, or identify outstanding work. I'm making progress. I think I've identified how to resolve the main issues, but it's hard to know when one of the tests will reveal a nasty new problem, so I will wait until I'm through them before feeling too confident.

sam-github commented 5 years ago

@shigeki @rvagg TLS1.3 update. I've run through all the test-tls- tests, and found the problems with TLS1.3, as well as read the OpenSSL release notes. I've prototyped solutions to all the problems, except the problems with the info callback. For that, I'm trying to change TLSWrap to use SSL_do_handshake() until the handshake completes. I have had some success with simple connections, but there are still problems, and I'm deep in the weeds of node C++ streams, and reentrant C++/JS.

If either of you have any cred with the OpenSSL team, please, please jump in on this thread: https://mta.openssl.org/pipermail/openssl-project/2019-January/thread.html#1204

It looks like nginx and haproxy are also affected by the unnecessary API breakage between openssl 1.1.0 and 1.1.1. I can't even post to the relevant mailing list (Matt Caswell claims it's moderated, but AFAICT, my emails are just bounced), maybe you two have access?

If the changes made in that thread occur, then the TLS1.3 handshake problem melts away, and I think I can get TLS1.3 working in relatively short order, just a couple new APIs, some docs, etc.

sam-github commented 5 years ago

@nodejs/tsc I've no idea how the OpenSSL core team works, but :point_up: , perhaps an official request from node is in order? I (and others on that thread) believe this to be an API breakage between releases of OpenSSL that are not supposed to have API breakage. There was a judgement call made, but with implementation experience available, I think they made it the wrong way.

noloader commented 5 years ago

@sam-github,

Since you are working in C++ already, maybe Jack Lloyd's Botan would be a better fit for Node. Botan is a C++ TLS library. It is a cross-platform library and runs on the major platforms, including BSDs, AIX, Linux, OS X and Windows.

Development is active and the code is very clean. Jack also takes a more disciplined approach to the software engineering process (than many other projects found on the web).

One thing it is missing is Andy Polyakov's hand tuned assembly language routines. But Botan uses builtins and intrinsics so performance is not off by much. He uses builtins because of mostly universal compiler support. You can write once, run anywhere.

Something to think about in the future...

addaleax commented 5 years ago

@noloader Thanks for weighing in, but switching libraries is off-topic for this thread, as well as not easily done (we need support for more than the major platforms, and we provide the OpenSSL API to native add-ons, so changing libraries is a significant breaking change).

sam-github commented 5 years ago

TLS1.3 draft PR: https://github.com/nodejs/node/pull/26209

kapouer commented 5 years ago

@sam-github openssl 1.1.1b was released with the api fix. Maybe that means node 10 could upgrade to openssl 1.1.1b without breaking abi ?

sam-github commented 5 years ago

node 10 could be upgraded to openssl 1.1.1a without breaking ABI or API, and it likely will: https://github.com/nodejs/node/pull/26270

But openssl1.1.1b means I should be able to remove a work-around from Node.js, yes (though the work-around does no harm: it basically just removes TLS1.3 cipher suites, but TLS1.3 isn't supported yet anyhow, and the work-around goes away in the TLS1.3 PR).
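
Since the work-around amounts to removing the TLS1.3 cipher suites, one rough way to see whether a given Node build exposes any is to filter the output of `tls.getCiphers()`. This is a sketch rather than the check used in any of the PRs: it relies on the convention, stated in the documentation of later Node releases, that lower-cased cipher names starting with `tls_` are the TLS1.3 ones, and the `tls13CipherSuites` helper name is made up for illustration.

```javascript
const tls = require('node:tls');

// tls.getCiphers() returns lower-cased cipher names. Treating the "tls_"
// prefix as the marker for TLS 1.3 suites follows OpenSSL's naming
// (e.g. TLS_AES_128_GCM_SHA256) and is an assumption for older builds.
function tls13CipherSuites() {
  return tls.getCiphers().filter((name) => name.startsWith('tls_'));
}

console.log('TLS 1.3 suites known to this build:', tls13CipherSuites());
```

An empty result would be consistent with a build where those suites have been stripped or where the linked OpenSSL predates 1.1.1.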

We'll see when https://github.com/nodejs/node/pull/26327 lands if it does so in time to make the next 10.x. It might not, because it would need to be in an 11.x release for a while first, as I understand it. That's OK, once 10.x is using openssl 1.1.1a, then updating to 1.1.1b can be done in a patch release.

/cc @MylesBorins @rvagg

sam-github commented 5 years ago

Actually, I can't remove the work-around from Node.js even with openssl1.1.1b, because it's possible to link node.js against a system openssl, and it could be openssl1.1.1a. That's OK. Like I said, the work-around disappears in https://github.com/nodejs/node/pull/26209
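
Because the OpenSSL in use can differ from the bundled one, code that cares about the 1.1.1a versus 1.1.1b distinction has to inspect the version itself. A minimal sketch from the JavaScript side, assuming `process.versions.openssl` has the usual `major.minor.patch` plus optional letter form; `parseOpenSSLVersion` and `hasApiFix` are hypothetical helpers, and whether the reported string reflects the build-time headers or the library loaded at run time should be verified for the Node version in question.

```javascript
// Parse a version string such as "1.1.1a" or "3.0.13+quic" into comparable parts.
function parseOpenSSLVersion(version) {
  const match = /^(\d+)\.(\d+)\.(\d+)([a-z]*)/.exec(version);
  if (!match) return null;
  const [, major, minor, patch, letter] = match;
  return { major: Number(major), minor: Number(minor), patch: Number(patch), letter };
}

// True when `version` is at least 1.1.1b, the release reported in this
// thread to contain the API fix.
function hasApiFix(version) {
  const v = parseOpenSSLVersion(version);
  if (!v) return false; // unknown format: assume the fix is absent
  const numeric = [v.major, v.minor, v.patch];
  const wanted = [1, 1, 1];
  for (let i = 0; i < 3; i++) {
    if (numeric[i] !== wanted[i]) return numeric[i] > wanted[i];
  }
  // Same numeric version: compare the patch letter ("" < "a" < "b").
  return v.letter >= 'b';
}

console.log(process.versions.openssl, hasApiFix(process.versions.openssl));
```

Defaulting to `false` for an unparseable version mirrors the reasoning above: when in doubt, keep the work-around.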

sam-github commented 5 years ago

We're up to openssl-1.1.1b in master, https://github.com/nodejs/node/pull/26327, and TLS1.3 has landed, https://github.com/nodejs/node/pull/26209. Thanks to all the people who helped with this.

I think this issue can be closed now, but if someone feels differently, please reopen.

And that's not to imply that help isn't still wanted for tls/crypto support. More people working on it would be very much appreciated!

panva commented 5 years ago

@volschin

@sam-github I'm not sure what's working automatically. getcurves should show up with X25519 and X448. Both curves should be usable inside the ECDH class, e.g. the Alice and Bob example from the docs. The complete test changes for X448 in openssl you can see here rhuijben/openssl@fe93b01

I've opened up #26626 for that particular feature request.
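
For illustration, here is what an Alice and Bob exchange over these curves can look like. Note the assumptions: the sketch does not use the ECDH class the request refers to; it uses key objects with `crypto.generateKeyPairSync()` and `crypto.diffieHellman()`, which exist in later Node releases, and the `agree` helper is made up for the example. Whether X25519 and X448 appear in `crypto.getCurves()` depends on the Node and OpenSSL build, so the code only reports that rather than relying on it.

```javascript
const crypto = require('node:crypto');

// Whether these names are listed depends on the Node/OpenSSL build.
const curves = crypto.getCurves();
console.log('X25519 listed:', curves.includes('X25519'));
console.log('X448 listed:', curves.includes('X448'));

// Alice-and-Bob key agreement using key objects.
// `type` is 'x25519' or 'x448'; requires a Node release that
// provides crypto.diffieHellman().
function agree(type) {
  const alice = crypto.generateKeyPairSync(type);
  const bob = crypto.generateKeyPairSync(type);
  // Each side combines its own private key with the other's public key.
  const aliceSecret = crypto.diffieHellman({
    privateKey: alice.privateKey,
    publicKey: bob.publicKey,
  });
  const bobSecret = crypto.diffieHellman({
    privateKey: bob.privateKey,
    publicKey: alice.publicKey,
  });
  return { aliceSecret, bobSecret };
}

const { aliceSecret, bobSecret } = agree('x25519');
console.log('secrets match:', aliceSecret.equals(bobSecret));
```

Passing `'x448'` to `agree` exercises the second curve in the same way.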