Open kEu3lWPoZ8XRv1O3hEt5xSQQJM0TvH2k opened 2 months ago
Thanks for the report! Any clue why that key installation is failing in the first place? Can you share a longer log including lines since the time the client connected to the server?
My guess is that DCO is not finding the peer to install the key for. If this is the case, then there is a bug before the key installation and the failed key installation is just a symptom.
note: I am deleting your post in the ovpn-dco ticket to avoid confusion.
note2: I changed the title as this is not an actual crash, but just a fatal condition leading to unexpected (but graceful) exit
I would say this is issue is two-fold:
Any clue why that key installation is failing in the first place?
@ordex As far as I understand, the error when initializing the key context is primary, since nothing unusual happens in the dmesg
logs. For example, the ovpn-dco
module is not being unloaded. There are just some errors related to data encryption. The verification was carried out both in the normal operation mode of the DCO module and with the module compiled in debugging mode. No changes were detected in dmesg
, the module is not being unloaded.
The OpenVPN process itself is being completed, the operation of the ovpn-dco
module remains unchanged.
The issue contains logs from dmesg
and OpenVPN logs during the service crash. The time is the same, you can make sure that the ovpn-dco
module does not really unload during the crash:
dmesg
logs before fixing:
2024-04-17T16:05:27.606710+03:00 openvpn kernel: [69250.222529] ovpn_udp_encap_recv: received data from unknown peer (id: 465)
2024-04-17T16:05:27.950720+03:00 openvpn kernel: [69250.565701] ovpn_udp_encap_recv: received data from unknown peer (id: 465)
2024-04-17T16:05:27.950858+03:00 openvpn kernel: [69250.566003] ovpn_udp_encap_recv: received data from unknown peer (id: 465)
2024-04-17T16:05:28.606691+03:00 openvpn kernel: [69251.220475] ovpn_udp_encap_recv: received data from unknown peer (id: 465)
2024-04-17T16:05:28.606723+03:00 openvpn kernel: [69251.220713] ovpn_udp_encap_recv: received data from unknown peer (id: 465)
2024-04-17T16:05:30.850734+03:00 openvpn kernel: [69253.463873] tun0: deleting peer with id 670, reason 1
2024-04-17T16:05:32.674766+03:00 openvpn kernel: [69255.289142] net_ratelimit: 56 callbacks suppressed
2024-04-17T16:05:32.674919+03:00 openvpn kernel: [69255.289150] ovpn_aead_decrypt: decrypt failed: -74
2024-04-17T16:05:32.674943+03:00 openvpn kernel: [69255.289499] ovpn_decrypt_one: error during decryption for peer 611, key-id 0: -74
2024-04-17T16:05:32.694773+03:00 openvpn kernel: [69255.308232] ovpn_aead_decrypt: decrypt failed: -74
2024-04-17T16:05:32.695002+03:00 openvpn kernel: [69255.308469] ovpn_decrypt_one: error during decryption for peer 611, key-id 0: -74
2024-04-17T16:05:32.695071+03:00 openvpn kernel: [69255.308720] ovpn_aead_decrypt: decrypt failed: -74
2024-04-17T16:05:32.695118+03:00 openvpn kernel: [69255.308974] ovpn_decrypt_one: error during decryption for peer 611, key-id 0: -74
2024-04-17T16:05:32.695233+03:00 openvpn kernel: [69255.309238] ovpn_aead_decrypt: decrypt failed: -74
2024-04-17T16:05:32.695308+03:00 openvpn kernel: [69255.309506] ovpn_decrypt_one: error during decryption for peer 611, key-id 0: -74
2024-04-17T16:05:32.695348+03:00 openvpn kernel: [69255.309820] ovpn_aead_decrypt: decrypt failed: -74
2024-04-17T16:05:32.695397+03:00 openvpn kernel: [69255.310104] ovpn_decrypt_one: error during decryption for peer 611, key-id 0: -74
2024-04-17T16:05:34.190782+03:00 openvpn kernel: [69256.805266] tun0: deleting peer with id 433, reason 0
2024-04-17T16:05:34.190799+03:00 openvpn kernel: [69256.805276] tun0: deleting peer with id 447, reason 0
2024-04-17T16:05:34.190801+03:00 openvpn kernel: [69256.805286] tun0: deleting peer with id 38, reason 0
2024-04-17T16:05:34.190802+03:00 openvpn kernel: [69256.805290] tun0: deleting peer with id 463, reason 0
2024-04-17T16:05:34.190805+03:00 openvpn kernel: [69256.805297] tun0: deleting peer with id 553, reason 0
2024-04-17T16:05:34.190808+03:00 openvpn kernel: [69256.805302] tun0: deleting peer with id 530, reason 0
2024-04-17T16:05:34.190809+03:00 openvpn kernel: [69256.805305] tun0: deleting peer with id 398, reason 0
2024-04-17T16:05:34.190811+03:00 openvpn kernel: [69256.805309] tun0: deleting peer with id 389, reason 0
2024-04-17T16:05:34.190812+03:00 openvpn kernel: [69256.805315] tun0: deleting peer with id 272, reason 0
2024-04-17T16:05:34.190814+03:00 openvpn kernel: [69256.805324] tun0: deleting peer with id 569, reason 0
2024-04-17T16:05:34.190815+03:00 openvpn kernel: [69256.805331] tun0: deleting peer with id 86, reason 0
2024-04-17T16:05:34.190816+03:00 openvpn kernel: [69256.805339] tun0: deleting peer with id 202, reason 0
2024-04-17T16:05:34.190818+03:00 openvpn kernel: [69256.805345] tun0: deleting peer with id 236, reason 0
2024-04-17T16:05:34.190820+03:00 openvpn kernel: [69256.805352] tun0: deleting peer with id 391, reason 0
2024-04-17T16:05:34.190821+03:00 openvpn kernel: [69256.805356] tun0: deleting peer with id 368, reason 0
2024-04-17T16:05:34.190822+03:00 openvpn kernel: [69256.805360] tun0: deleting peer with id 208, reason 0
2024-04-17T16:05:34.190823+03:00 openvpn kernel: [69256.805369] tun0: deleting peer with id 642, reason 0
2024-04-17T16:05:34.190824+03:00 openvpn kernel: [69256.805379] tun0 (unregistering): deleting peer with id 352, reason 0
2024-04-17T16:05:34.190825+03:00 openvpn kernel: [69256.805448] tun0 (unregistering): deleting peer with id 687, reason 0
2024-04-17T16:05:34.190826+03:00 openvpn kernel: [69256.805454] tun0 (unregistering): deleting peer with id 575, reason 0
Example of an OpenVPN logs during a crash (#2):
2024-04-17 16:05:34 us=166246 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 peer info: IV_SSL=OpenSSL_3.XXX
2024-04-17 16:05:34 us=166386 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 UDPv4 WRITE [311] to [AF_INET]client_source_ip_v4_address:client_source_port: P_CONTROL_V1 kid=1 [ ] pid=1382 DATA len=297
2024-04-17 16:05:34 us=171898 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 UDPv4 READ [74] from [AF_INET]client_source_ip_v4_address:client_source_port: P_ACK_V1 kid=1 [ ] DATA len=64
2024-04-17 16:05:34 us=172148 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 UDPv4 READ [74] from [AF_INET]client_source_ip_v4_address:client_source_port: P_ACK_V1 kid=1 [ ] DATA len=64
2024-04-17 16:05:34 us=172201 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 Control Channel: TLSv1.3, cipher TLSv1.3 TLS_AES_256_GCM_SHA384, peer certificate: 256 bit ED25519, signature: ED25519
2024-04-17 16:05:34 us=172294 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 dco_install_key: peer_id=465 keyid=1, currently 1 keys installed
2024-04-17 16:05:34 us=172320 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 dco_new_key: slot 1, key-id 1, peer-id 465, cipher AES-256-GCM
2024-04-17 16:05:34 us=172384 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 dco_new_key: netlink reports object not found, ovpn-dco unloaded?
2024-04-17 16:05:34 us=172408 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 dco_new_key: failed to send netlink message: No such file or directory (-2)
2024-04-17 16:05:34 us=172432 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 Impossible to install key material in DCO: No such file or directory
2024-04-17 16:05:34 us=172450 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 Exiting due to fatal error
2024-04-17 16:05:34 us=172513 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 Closing DCO interface
2024-04-17 16:05:34 us=172536 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 net_addr_v4_del: ip_v4_address dev tun0
2024-04-17 16:05:34 us=173286 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 sitnl_send: checking for received messages
2024-04-17 16:05:34 us=173351 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 sitnl_send: rtnl: received 36 bytes
2024-04-17 16:05:34 us=173384 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 net_addr_v6_del: ip_v6_address/112 dev tun0
2024-04-17 16:05:34 us=173510 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 sitnl_send: checking for received messages
2024-04-17 16:05:34 us=173532 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 sitnl_send: rtnl: received 36 bytes
2024-04-17 16:05:34 us=173551 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 close_tun_dco
2024-04-17 16:05:34 us=173581 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 net_iface_del: delete tun0
2024-04-17 16:05:34 us=370712 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 sitnl_send: checking for received messages
2024-04-17 16:05:34 us=370770 client.name123/client_source_ip_v4_address:client_source_port peer-id=465 sitnl_send: rtnl: received 36 bytes
--- OpenVPN process restart goes here
@kEu3lWPoZ8XRv1O3hEt5xSQQJM0TvH2k I was hoping to get an earlier log from OpenVPN (not kernel/ovpn-dco log) that shows the full connection of the client (when also the dco_new_peer is invoked). I also don't expect to see other errors, but I just want to see the sequence of events.
as you can see at the very beginning of the dmesg log:
2024-04-17T16:05:28.606723+03:00 openvpn kernel: [69251.220713] ovpn_udp_encap_recv: received data from unknown peer (id: 465)
peer 465 was already unknown.
This means this peer was removed at some point earlier on or this peer was never added at all. But can't say which one is true with the log starting there.
I hope I managed to convey my point
edit: the message above is printed at 16:05:28, which is earlier than the error showing up in openvpn at 16:05:34
@ordex I will try to provide all the necessary information as soon as possible.
Good afternoon, @ordex !
Thank you for waiting. 🤝
Here are the complete logs from the moment the client connects before the OpenVPN process ends. I can assume that you may be interested in earlier logs using such a peer id. If so, please write. I will try to provide all the necessary information.
2024-04-17 16:05:34 us=163397 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 UDPv4 READ [66] from [AF_INET][client_source_ip_v4_address123:client_source_port123]: P_ACK_V1 kid=1 [ ] DATA len=56
2024-04-17 16:05:34 us=163495 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 UDPv4 READ [1324] from [AF_INET][client_source_ip_v4_address123:client_source_port123]: P_CONTROL_V1 kid=1 [ ] pid=870 DATA len=1310
2024-04-17 16:05:34 us=164504 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 VERIFY OK: depth=1, CN=Easy-RSA CA
2024-04-17 16:05:34 us=165057 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 VERIFY KU OK
2024-04-17 16:05:34 us=165094 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 Validating certificate extended key usage
2024-04-17 16:05:34 us=165115 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 ++ Certificate has EKU (str) TLS Web Client Authentication, expects TLS Web Client Authentication
2024-04-17 16:05:34 us=165133 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 VERIFY EKU OK
2024-04-17 16:05:34 us=165150 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 VERIFY OK: depth=0, CN=[client.name123]
2024-04-17 16:05:34 us=165815 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 UDPv4 WRITE [228] to [AF_INET][client_source_ip_v4_address123:client_source_port123]: P_CONTROL_V1 kid=1 [ ] pid=1126 DATA len=214
2024-04-17 16:05:34 us=165907 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 UDPv4 READ [483] from [AF_INET][client_source_ip_v4_address123:client_source_port123]: P_CONTROL_V1 kid=1 [ ] pid=1126 DATA len=469
2024-04-17 16:05:34 us=165976 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 peer info: IV_VER=3.8.2connect3
2024-04-17 16:05:34 us=166005 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 peer info: IV_PLAT=win
2024-04-17 16:05:34 us=166025 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 peer info: IV_NCP=2
2024-04-17 16:05:34 us=166044 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 peer info: IV_TCPNL=1
2024-04-17 16:05:34 us=166066 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 peer info: IV_PROTO=990
2024-04-17 16:05:34 us=166085 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 peer info: IV_MTU=1600
2024-04-17 16:05:34 us=166114 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 peer info: IV_CIPHERS=AES-128-CBC:AES-192-CBC:AES-256-CBC:AES-128-GCM:AES-192-GCM:AES-256-GCM:CHACHA20-POLY1305
2024-04-17 16:05:34 us=166134 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 peer info: IV_AUTO_SESS=1
2024-04-17 16:05:34 us=166153 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 peer info: UV_ASCLI_VER=3.4.3-3337
2024-04-17 16:05:34 us=166171 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 peer info: UV_UUID=[client_uv_uuid123]
2024-04-17 16:05:34 us=166190 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 peer info: IV_GUI_VER=OCWindows_3.4.3-3337
2024-04-17 16:05:34 us=166209 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 peer info: IV_SSO=webauth,crtext
2024-04-17 16:05:34 us=166228 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 peer info: IV_HWADDR=[client_hwaddr_123]
2024-04-17 16:05:34 us=166246 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 peer info: IV_SSL=OpenSSL_3.1.4_24_Oct_2023
2024-04-17 16:05:34 us=166386 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 UDPv4 WRITE [311] to [AF_INET][client_source_ip_v4_address123:client_source_port123]: P_CONTROL_V1 kid=1 [ ] pid=1382 DATA len=297
2024-04-17 16:05:34 us=166468 [client.name456]/[client_source_ip_v4_address456:client_source_port456] peer-id=79 UDPv4 READ [74] from [AF_INET][client_source_ip_v4_address456:client_source_port456]: P_ACK_V1 kid=0 [ ] DATA len=64
2024-04-17 16:05:34 us=171898 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 UDPv4 READ [74] from [AF_INET][client_source_ip_v4_address123:client_source_port123]: P_ACK_V1 kid=1 [ ] DATA len=64
2024-04-17 16:05:34 us=172148 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 UDPv4 READ [74] from [AF_INET][client_source_ip_v4_address123:client_source_port123]: P_ACK_V1 kid=1 [ ] DATA len=64
2024-04-17 16:05:34 us=172201 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 Control Channel: TLSv1.3, cipher TLSv1.3 TLS_AES_256_GCM_SHA384, peer certificate: 256 bit ED25519, signature: ED25519
2024-04-17 16:05:34 us=172294 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 dco_install_key: peer_id=465 keyid=1, currently 1 keys installed
2024-04-17 16:05:34 us=172320 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 dco_new_key: slot 1, key-id 1, peer-id 465, cipher AES-256-GCM
2024-04-17 16:05:34 us=172384 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 dco_new_key: netlink reports object not found, ovpn-dco unloaded?
2024-04-17 16:05:34 us=172408 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 dco_new_key: failed to send netlink message: No such file or directory (-2)
2024-04-17 16:05:34 us=172432 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 Impossible to install key material in DCO: No such file or directory
2024-04-17 16:05:34 us=172450 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 Exiting due to fatal error
2024-04-17 16:05:34 us=172513 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 Closing DCO interface
2024-04-17 16:05:34 us=172536 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 net_addr_v4_del: ip_v4_address dev tun0
2024-04-17 16:05:34 us=173286 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 sitnl_send: checking for received messages
2024-04-17 16:05:34 us=173351 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 sitnl_send: rtnl: received 36 bytes
2024-04-17 16:05:34 us=173384 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 net_addr_v6_del: ip_v6_address/112 dev tun0
2024-04-17 16:05:34 us=173510 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 sitnl_send: checking for received messages
2024-04-17 16:05:34 us=173532 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 sitnl_send: rtnl: received 36 bytes
2024-04-17 16:05:34 us=173551 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 close_tun_dco
2024-04-17 16:05:34 us=173581 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 net_iface_del: delete tun0
2024-04-17 16:05:34 us=370712 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 sitnl_send: checking for received messages
2024-04-17 16:05:34 us=370770 [client.name123]/[client_source_ip_v4_address123:client_source_port123] peer-id=465 sitnl_send: rtnl: received 36 bytes
--- OpenVPN process restart goes here
Yeah, I wanted to find the related dco_new_peer call, normally happening after connection. It seems we have none here, which indicates that the peer was not created, thus we have a failing new_key call.
Indeed it'd be interesting to see any past log related to peer-id=465, possibly also from dmesg.
Thanks a lot for all this!
Thanks for the reply, @ordex.
I will try to send you the logs tomorrow (April 25th).
Good afternoon, @ordex!
Thanks for waiting. I've prepared the information you requested. There are filtered samples from openvpn.log (OpenVPN service logs) and kern.log (kernel logs).
Filtering was performed by the peer ID starting at midnight and ending with the fall of OpenVPN. I have tried to keep as much relevant information as possible.
Here they are: kern.log openvpn.log
Is there anything else I can do to help?
Thanks a lot! This looks interesting. Allow me some time to dig into the text and I'll report back
Good afternoon, @ordex!
Can you tell me, please, if there is any news?
Thanks!
Sorry, not yet. I'm quite busy with the upstream submission and I haven't had time to dig deep enough. I should be able to jump on this next week.
Describe the bug
Good afternoon!
We encountered such a problem on our installation of an OpenVPN server using the DCO core module. The problem causes the OpenVPN process to crash with disconnecting of all clients connected to OpenVPN. The problem is critical in the conditions of production.
Initial issue's comment inside the ovpn-dco repository: https://github.com/OpenVPN/ovpn-dco/issues/50#issuecomment-2071953786
To Reproduce
Expected behavior The OpenVPN process does not crash due to a single client's key context initialization error.
Version information (please complete the following information):
2.6.3-1 (deb12u2)
.Debian Bookworm (v12.5)
.master
from 2024.04.16).Additional context
Example of an OpenVPN logs during a crash (#1):
Example of an OpenVPN logs during a crash (#2):
Crash reason:
https://github.com/OpenVPN/openvpn/blob/32e6586687a548174b88b64fe54bfae6c74d4c19/src/openvpn/ssl.c#L1361 ^
M_FATAL
causes the process to crash after an unsuccessful attempt to install key material in the DCO module.Confirmation:
https://github.com/OpenVPN/openvpn/blob/master/src/openvpn/error.h#L142 ^ Calling the
msg()
procedure with theM_FATAL
flag results in an emergency (code1
) shutdown of the process.Problem fixing scheme:
M_FATAL
flag withM_WARN
in the problem source code locationinit_key_contexts
procedure into a function that returnsbool
false
insideinit_key_context
in case of failure, otherwise returntrue
init_key_context
, the case of returningfalse
(failed initialization of the key context)Patch with fix (spot): diff.patch
Patch with fix (full): full_diff.patch
dmesg
logs before fixing:OpenVPN logs after correction:
dmesg
logs after fixing:Please pay attention to this problem. I will monitor this issue and respond if necessary. Thanks!