Open geofft opened 1 year ago
OpenSSL 1.1.x is EOL on 2023-09-11.
@xunnanxu could you take a look please?
Seems pretty reasonable to me. That said, I'm probably not the right person to review this. Maybe consider opening a linked issue in the PyTorch codebase and tagging it with `oncall: distributed` to make sure this gets properly reviewed?
OpenSSL 1.x reaches end-of-life in September, and recent distros like Ubuntu 22.04+ (last year) and Debian 12+ (next month) ship only OpenSSL 3.
I have gloo (inside PyTorch) working with OpenSSL 3.x, and as far as I can tell everything works fine. The APIs it uses are both API- and ABI-compatible between 1.1 and 3.x. (This is important because PyTorch configures gloo with `USE_TCP_OPENSSL_LOAD`, i.e., it dlopens the library instead of compiling against it.) But there are a few things to adjust:

- The CMake build calls `find_package(OpenSSL 1.1 REQUIRED EXACT)`, which fails on 3.0. Something like `find_package(OpenSSL 1.1...<4.0 REQUIRED)` would be better. Alternatively, perhaps this shouldn't be invoked at all in the `USE_TCP_OPENSSL_LOAD` case, since OpenSSL isn't needed at build time then?
- `gloo/transport/tcp/tls/openssl.cc` attempts to dlopen `libssl.so` if present, else `libssl.so.1.1`. The first library is only available if the development package for OpenSSL is installed, and the development package can be any version (3.x, 4.x, etc.). It's probably safer to make this `libssl.so.1.1` + `libssl.so.3` (all 3.x releases use the same soname).

If a PR is helpful I can do the CLA dance, but hopefully this is simple enough that the more interesting thing is agreeing on what the change is.
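
To make the dlopen suggestion concrete, here is a rough sketch of a loader that tries only versioned sonames. This is not gloo's actual code: `Opener`, `dlopenOpener`, `libsslCandidates`, and `openFirstAvailable` are hypothetical names, and trying `libssl.so.3` before `libssl.so.1.1` is my assumption; the order is open for discussion.

```cpp
#include <dlfcn.h>

#include <functional>
#include <string>
#include <vector>

// Hypothetical helper type, not part of gloo: a callable that tries to load
// one library name and returns a handle, or nullptr on failure.
using Opener = std::function<void*(const std::string&)>;

// Default opener: a thin wrapper around dlopen(3).
inline void* dlopenOpener(const std::string& soname) {
  return dlopen(soname.c_str(), RTLD_NOW | RTLD_LOCAL);
}

// Versioned sonames only. The unversioned "libssl.so" symlink comes from the
// development package and may point at any major version, so it is left out.
// Assumption: prefer 3.x, then fall back to 1.1.
inline const std::vector<std::string>& libsslCandidates() {
  static const std::vector<std::string> names = {"libssl.so.3",
                                                 "libssl.so.1.1"};
  return names;
}

// Try each candidate in order and return the first handle that loads.
// On success, *loaded (if non-null) is set to the soname that was opened.
inline void* openFirstAvailable(const std::vector<std::string>& candidates,
                                std::string* loaded = nullptr,
                                const Opener& opener = dlopenOpener) {
  for (const auto& name : candidates) {
    if (void* handle = opener(name)) {
      if (loaded != nullptr) {
        *loaded = name;
      }
      return handle;
    }
  }
  return nullptr;  // none of the candidates could be loaded
}
```

A caller would use something like `void* h = openFirstAvailable(libsslCandidates());` and then resolve the needed symbols from `h` with `dlsym`. The injectable opener is only there so the fallback order can be exercised without a real libssl installed.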