Open retinio opened 6 months ago
I have enabled debug logging in tlshd.conf:
[debug]
loglevel=1
tls=1
nl=1
and now I get extended logs:
λ kubectl -n linstor logs linstor-satellite.worker-01-7wgmg -c ktls-utils
tlshd[7]: Built from ktls-utils 0.10 on Oct 4 2023 07:26:06
tlshd[7]: x.509 priority string: SECURE256:+SECURE128:-COMP-ALL:-VERS-ALL:+VERS-TLS1.3:%NO_TICKETS:-CIPHER-ALL:+AES-256-GCM:+CHACHA20-POLY1305:+AES-128-GCM:+AES-128-CCM
tlshd[7]: PSK priority string: SECURE256:+SECURE128:-COMP-ALL:-VERS-ALL:+VERS-TLS1.3:%NO_TICKETS:-CIPHER-ALL:+AES-256-GCM:+CHACHA20-POLY1305:+AES-128-GCM:+AES-128-CCM:+PSK:+DHE-PSK:+ECDHE-PSK
tlshd[9]: Querying the handshake service
tlshd[8]: Querying the handshake service
tlshd[9]: Parsing a valid netlink message
tlshd[9]: No peer identities found
tlshd[9]: No certificates found
tlshd[9]: System config file: /etc/gnutls/config
tlshd[8]: Parsing a valid netlink message
tlshd[9]: Client x.509 truststore is /etc/tlshd.d/ca.crt
tlshd[8]: No peer identities found
tlshd[8]: No certificates found
tlshd[8]: System config file: /etc/gnutls/config
tlshd[8]: Client x.509 truststore is /etc/tlshd.d/ca.crt
tlshd[11]: Querying the handshake service
tlshd[11]: Parsing a valid netlink message
tlshd[8]: System trust: Loaded 1 certificate(s).
tlshd[11]: No peer identities found
tlshd[11]: No certificates found
tlshd[11]: System config file: /etc/gnutls/config
tlshd[11]: Server x.509 truststore is /etc/tlshd.d/ca.crt
tlshd[8]: Retrieved x.509 certificate from /etc/tlshd.d/tls.crt
tlshd[11]: System trust: Loaded 1 certificate(s).
tlshd[8]: Retrieved private key from /etc/tlshd.d/tls.key
tlshd[11]: Retrieved x.509 certificate from /etc/tlshd.d/tls.crt
tlshd[10]: Querying the handshake service
tlshd[10]: Parsing a valid netlink message
tlshd[10]: No peer identities found
tlshd[10]: No certificates found
tlshd[10]: System config file: /etc/gnutls/config
tlshd[10]: Server x.509 truststore is /etc/tlshd.d/ca.crt
tlshd[10]: System trust: Loaded 1 certificate(s).
tlshd[10]: Retrieved x.509 certificate from /etc/tlshd.d/tls.crt
tlshd[10]: Retrieved private key from /etc/tlshd.d/tls.key
tlshd[8]: Server's trusted authorities:
tlshd[9]: System trust: Loaded 1 certificate(s).
tlshd[8]: [0]: CN=linstor-internal-ca
tlshd[11]: Retrieved private key from /etc/tlshd.d/tls.key
tlshd[8]: The certificate is NOT trusted. The name in the certificate does not match the expected.
tlshd[8]: gnutls: Error in the certificate. (-43)
tlshd[8]: Handshake with 'worker-03' (192.168.160.22) failed
DBG<1>././lib/cache_mngt.c:302 nl_cache_mngt_unregister: Unregistered cache operations genl/family
tlshd[9]: Retrieved x.509 certificate from /etc/tlshd.d/tls.crt
tlshd[9]: Retrieved private key from /etc/tlshd.d/tls.key
tlshd[9]: Server's trusted authorities:
tlshd[9]: [0]: CN=linstor-internal-ca
tlshd[9]: The certificate is NOT trusted. The name in the certificate does not match the expected.
tlshd[9]: gnutls: Error in the certificate. (-43)
tlshd[9]: Handshake with 'worker-02' (192.168.160.21) failed
DBG<1>././lib/cache_mngt.c:302 nl_cache_mngt_unregister: Unregistered cache operations genl/family
tlshd[10]: gnutls: The TLS connection was non-properly terminated. (-110)
tlshd[11]: gnutls: The TLS connection was non-properly terminated. (-110)
tlshd[10]: Handshake with 'worker-03' (192.168.160.22) failed
tlshd[11]: Handshake with 'worker-02' (192.168.160.21) failed
DBG<1>././lib/cache_mngt.c:302 nl_cache_mngt_unregister: Unregistered cache operations genl/family
DBG<1>././lib/cache_mngt.c:302 nl_cache_mngt_unregister: Unregistered cache operations genl/family
λ kubectl -n linstor get secret linstor-satellite-internal-tls -o jsonpath="{.data['tls.crt']}" | base64 -d > tls.crt
λ kubectl -n linstor get secret linstor-satellite-internal-tls -o jsonpath="{.data['ca.crt']}" | base64 -d > ca.crt
λ openssl verify -CAfile ca.crt tls.crt
tls.crt: OK
Looks like you used the "openssl" method from here to create those certificates?
If so, the issue is that those certificates only set a generic common name:
openssl req -new -sha256 -key satellite.key -subj "/CN=linstor-satellite" -out satellite.csr
So with strict validation, this certificate is only valid for some entity named linstor-satellite. For LINSTOR itself this is fine, as we don't do strict hostname validation there, but for tlshd, it means that when it sees a DRBD connection for worker-01, but gets a certificate for linstor-satellite, it simply fails the validation.
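This is also why the plain `openssl verify -CAfile ca.crt tls.crt` check earlier in the thread reports OK: it validates the chain to the CA, but not the name. A minimal sketch of a name-aware check, assuming the `ca.crt` and `tls.crt` files extracted from the secret and an OpenSSL build whose `verify` command supports `-verify_hostname`. The helper name `check_cert_name` is made up for illustration, and OpenSSL's name matching only approximates what tlshd (GnuTLS) does:

```shell
# check_cert_name CA_FILE CERT_FILE NODE_NAME
# Hypothetical helper: verifies the chain to CA_FILE and, in addition,
# that CERT_FILE is valid for NODE_NAME. Returns non-zero if either fails.
check_cert_name() {
    openssl verify -CAfile "$1" -verify_hostname "$3" "$2"
}

# Example (files as extracted from the secret earlier in the thread):
#   check_cert_name ca.crt tls.crt worker-01
```

For a certificate that only carries CN=linstor-satellite, this check is expected to fail for worker-01 even though the chain itself is fine.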
You either need to manually add all the node names to the alternative names in the certificates:
openssl req -new -sha256 -key satellite.key -subj "/CN=linstor-satellite" -out satellite.csr -addext "subjectAltName = DNS:linstor-satellite,DNS:worker-01,DNS:worker-02,DNS:worker-03"
openssl x509 -req -in satellite.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out satellite.crt -days 3650 -sha256 -copy_extensions copy
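To confirm that the re-issued certificate really carries the node names, the subject alternative names can be inspected. A small sketch, assuming OpenSSL 1.1.1 or newer (for `x509 -ext`) and the `satellite.crt` file produced by the command above; `cert_has_name` is a hypothetical helper name:

```shell
# cert_has_name CERT_FILE NODE_NAME
# Hypothetical helper: succeeds if CERT_FILE lists DNS:NODE_NAME among its
# subject alternative names, fails otherwise (including when there is no SAN).
cert_has_name() {
    # Print the SAN extension, put one entry per line, trim whitespace,
    # then look for an exact, literal match.
    openssl x509 -in "$1" -noout -ext subjectAltName 2>/dev/null |
        tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        grep -Fqx "DNS:$2"
}

# Example: check every node name from this thread.
#   for node in worker-01 worker-02 worker-03; do
#       cert_has_name satellite.crt "$node" || echo "$node is missing"
#   done
```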
Or you use cert-manager and get all of that automatically :smile:
@WanzenBug Thank you so much! Everything worked out.
You might be interested in this: if I use the newest version of ktls-utils (0.10-6), the connection error still persists.
λ kubectl -n linstor logs -l app.kubernetes.io/component=linstor-satellite -c ktls-utils
tlshd[12]: No peer identities found
tlshd[12]: No certificates found
tlshd[12]: System config file: /etc/gnutls/config
tlshd[12]: Server x.509 truststore is /etc/tlshd.d/ca.crt
tlshd[12]: System trust: Loaded 1 certificate(s).
tlshd[12]: Retrieved x.509 certificate from /etc/tlshd.d/tls.crt
tlshd[12]: Retrieved private key from /etc/tlshd.d/tls.key
tlshd[11]: System trust: Loaded 140 certificate(s).
tlshd[11]: Handshake with 'worker-02' (10.0.4.171) failed
DBG<1>././lib/cache_mngt.c:302 nl_cache_mngt_unregister: Unregistered cache operations genl/family
tlshd[11]: Server x.509 truststore is /etc/tlshd.d/ca.crt
tlshd[11]: System trust: Loaded 1 certificate(s).
tlshd[11]: Retrieved x.509 certificate from /etc/tlshd.d/tls.crt
tlshd[11]: Retrieved private key from /etc/tlshd.d/tls.key
tlshd[9]: System trust: Loaded 140 certificate(s).
tlshd[9]: Handshake with 'worker-01' (10.0.3.154) failed
DBG<1>././lib/cache_mngt.c:302 nl_cache_mngt_unregister: Unregistered cache operations genl/family
tlshd[10]: System trust: Loaded 140 certificate(s).
tlshd[10]: Handshake with 'worker-03' (10.0.5.212) failed
DBG<1>././lib/cache_mngt.c:302 nl_cache_mngt_unregister: Unregistered cache operations genl/family
tlshd[10]: Retrieved x.509 certificate from /etc/tlshd.d/tls.crt
tlshd[10]: Retrieved private key from /etc/tlshd.d/tls.key
tlshd[11]: Querying the handshake service
tlshd[11]: Parsing a valid netlink message
tlshd[11]: No peer identities found
tlshd[11]: No certificates found
tlshd[11]: System config file: /etc/gnutls/config
tlshd[11]: System trust: Loaded 140 certificate(s).
tlshd[11]: Handshake with 'worker-02' (10.0.4.171) failed
DBG<1>././lib/cache_mngt.c:302 nl_cache_mngt_unregister: Unregistered cache operations genl/family
I'm wondering why it would try to load the system trust store:
tlshd[11]: System trust: Loaded 140 certificate(s).
But sometimes it loads the right certificates instead:
tlshd[12]: Server x.509 truststore is /etc/tlshd.d/ca.crt
tlshd[12]: System trust: Loaded 1 certificate(s).
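Several tlshd handshake processes log at the same time, so their lines are interleaved, which makes it hard to tell which truststore message belongs to which handshake. A minimal sketch for grouping the output per process ID, assuming the `tlshd[PID]: ...` line format seen in the logs above; `tlshd_by_pid` is a hypothetical helper name. PIDs appear to be reused across handshakes, and `kubectl logs -l` may combine several pods, so one group can cover more than one attempt:

```shell
# Group tlshd log lines by PID, keeping the original order within each PID.
tlshd_by_pid() {
    # Keep only tlshd lines (this drops the libnl DBG lines), then do a
    # stable numeric sort on the PID, i.e. the text after the first '['.
    grep '^tlshd\[' | sort -s -t '[' -k 2,2n
}

# Example:
#   kubectl -n linstor logs linstor-satellite.worker-01-7wgmg -c ktls-utils | tlshd_by_pid
```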
Hi! I am trying to configure TLS for DRBD by following this manual. TLS for internal traffic is enabled:
λ kubectl exec -n linstor deploy/linstor-controller -- linstor node list
+------------------------------------------------------------+
| Node      | NodeType  | Addresses                 | State  |
|============================================================|
| worker-01 | SATELLITE | 192.168.160.20:3367 (SSL) | Online |
| worker-02 | SATELLITE | 192.168.160.21:3367 (SSL) | Online |
| worker-03 | SATELLITE | 192.168.160.22:3367 (SSL) | Online |
+------------------------------------------------------------+
But the DRBD resources don't connect to each other:
λ kubectl exec -n linstor deploy/linstor-controller -- linstor r l
+---------------------------------------------------------------------------------------------------------------------+
| ResourceName                             | Node      | Port | Usage  | Conns                           | State      |
|=====================================================================================================================|
| pvc-4973e04e-44cf-49fe-9094-98dfbfda10d5 | worker-01 | 7000 | Unused | StandAlone(worker-03,worker-02) | UpToDate   |
| pvc-4973e04e-44cf-49fe-9094-98dfbfda10d5 | worker-02 | 7000 | Unused | StandAlone(worker-03,worker-01) | TieBreaker |
| pvc-4973e04e-44cf-49fe-9094-98dfbfda10d5 | worker-03 | 7000 | InUse  | StandAlone(worker-01,worker-02) | UpToDate   |
+---------------------------------------------------------------------------------------------------------------------+
The ktls-utils containers have errors:
λ kubectl -n linstor logs -l app.kubernetes.io/component=linstor-satellite -c ktls-utils
Piraeus Operator: 2.4.0
Host operating system: AlmaLinux 9 5.14.0-362.18.1.el9_3.x86_64
DRBD: version: 9.2.7 (api:2/proto:86-122)