sni / lmd

Livestatus Multitool Daemon - Create livestatus federation from multiple sources
https://labs.consol.de/omd/packages/lmd/
GNU General Public License v3.0

Livestatus COMMANDs get stuck in TLS #141

Closed: Bu-Ble closed this issue 6 months ago

Bu-Ble commented 7 months ago

Setup

Thruk -> (tcp:localhost) -> LMD1 -> (tls:remote) -> LMD2 -> (unix socket) -> Icinga2

Problem

When a user triggers a command in Thruk, the command shows up in the logs of LMD1 and LMD2, but it is not being sent to Icinga2.

Analysis

After some other debugging attempts, I switched the LMD1 -> LMD2 connection from TLS to TCP so that I could use tcpdump to see what was going on, but the problem was not reproducible over cleartext TCP. So it seems to be related to the TLS communication. Maybe some internal buffering issue?
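One way to narrow this down further would be to send the same command to each hop directly, once over TCP and once over TLS, and compare what shows up in the logs. The following is only a minimal sketch: `build_command` and `send_command` are hypothetical helper names, and it assumes the endpoint accepts a single newline-terminated `COMMAND` line, as in the log excerpts in this issue.

```python
import socket
import ssl
import time


def build_command(name, args, timestamp=None):
    """Format one Livestatus external command line (hypothetical helper)."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    payload = ";".join([name, *map(str, args)])
    return f"COMMAND [{ts}] {payload}\n"


def send_command(host, port, line, use_tls=False, verify=True, timeout=10.0):
    """Send one command line to a Livestatus endpoint over TCP or TLS."""
    raw = socket.create_connection((host, port), timeout=timeout)
    if use_tls:
        ctx = ssl.create_default_context()
        if not verify:
            # Assumption: lab setup with a self-signed certificate.
            ctx.check_hostname = False
            ctx.verify_mode = ssl.CERT_NONE
        try:
            conn = ctx.wrap_socket(raw, server_hostname=host)
        except Exception:
            raw.close()
            raise
    else:
        conn = raw
    with conn:
        # Commands produce no response, so write the line and close.
        conn.sendall(line.encode("utf-8"))
```

For example, `send_command("lmd2.example.org", 8034, build_command("DEL_SVC_DOWNTIME", [999]), use_tls=True, verify=False)` would talk to LMD2 directly. The host name is a placeholder; in the logs LMD2 received the TLS request on port 8034 and the cleartext one on port 8033.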

Logs

TLS connection (command not forwarded):

lmd1.log:

[2024-01-18 14:12:22.669][Debug][pid:14210][request:303] [127.0.0.1:24574->127.0.0.1:3333][r:6f18d2] request: COMMAND [1705412996] DEL_SVC_DOWNTIME;999
[ almost 2 minutes delay ]
[2024-01-18 14:14:18.565][Info][pid:14210][peer:2429] [to-lmd-2][127.0.0.1:24574->127.0.0.1:3333][r:e6dc10] send 1 commands successfully.

lmd2.log:

[2024-01-18 14:12:22.677][Debug][pid:26248][request:303] [X.X.X.X:51522->Y.Y.Y.Y:8034][r:501b48] request: COMMAND [1705412996] DEL_SVC_DOWNTIME;999
[ no "send 1 commands successfully" ]

TCP connection (no error):

lmd1.log:

[2024-01-18 14:21:27.256][Debug][pid:6309][request:303] [127.0.0.1:22806->127.0.0.1:3333][r:7a97ae] request: COMMAND [1705412996] DEL_SVC_DOWNTIME;999
[2024-01-18 14:21:27.260][Info][pid:6309][peer:2429] [to-lmd-2][127.0.0.1:22806->127.0.0.1:3333][r:b69728] send 1 commands successfully.

lmd2.log:

[2024-01-18 14:21:27.264][Debug][pid:26248][request:303] [X.X.X.X:61272->Y.Y.Y.Y:8033][r:88cea2] request: COMMAND [1705412996] DEL_SVC_DOWNTIME;999
[2024-01-18 14:21:27.264][Info][pid:26248][peer:2429] [icinga][Y.Y.Y.Y:61272->X.X.X.X:8033][r:c935b6] send 1 commands successfully.
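To quantify the difference between the two cases, the delay between the `request: COMMAND` line and the following `send ... commands successfully` line can be computed from the timestamps. This is a small sketch that assumes the log format shown above; `forwarding_delay` is a hypothetical helper, not part of LMD.

```python
import re
from datetime import datetime

# Leading "[YYYY-MM-DD HH:MM:SS.mmm]" timestamp as seen in the excerpts above.
STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]")


def parse_stamp(line):
    """Return the leading timestamp of a log line, or None if there is none."""
    match = STAMP.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")


def forwarding_delay(lines):
    """Seconds from the first 'request: COMMAND' line to the next
    'commands successfully' line; None if either one is missing."""
    requested = None
    for line in lines:
        stamp = parse_stamp(line)
        if stamp is None:
            continue  # skip annotations such as "[ almost 2 minutes delay ]"
        if requested is None and "request: COMMAND" in line:
            requested = stamp
        elif requested is not None and "commands successfully" in line:
            return (stamp - requested).total_seconds()
    return None
```

Applied to the lmd1.log excerpts, this gives roughly 115.9 seconds for the TLS case and 0.004 seconds for the TCP case. For the TLS lmd2.log excerpt it returns `None`, because the success line never appears.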

Version

lmd - version 2.1.7 (Build: , go1.20.7)

sni commented 7 months ago

Tbh, I never tried to connect multiple LMDs in a row. But even if this is probably a rare use case, it should work. Some ideas for a workaround: instead of LMD2, an SSH tunnel or a socat tunnel might improve things. Another option would be to let LMD1 access the Icinga 2 backend directly. But I guess there are reasons for this setup.

What I have already seen in the wild is something like this:

Thruk -> (unix socket) -> LMD1 -> (https:remote) -> Thruk -> (unix socket) -> Icinga2

But I have no idea whether commands work properly in such a scenario. There might even be a second LMD between the remote Thruk and Icinga2.

Bu-Ble commented 7 months ago

Hi Sven, thanks for the hint. Replacing LMD2 with socat (OPENSSL-LISTEN + UNIX-CONNECT) solved the problem. 👍
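For anyone who lands here with the same setup, the socat replacement could look roughly like the sketch below. The exact options used are not given in this thread, so the option set, the file paths, and the `socat_argv` and `run_relay` helper names are all assumptions; only `OPENSSL-LISTEN` and `UNIX-CONNECT` come from the comment above.

```python
import subprocess


def socat_argv(listen_port, cert_file, key_file, unix_socket, ca_file=None):
    """Build a socat command line that terminates TLS and forwards each
    connection to a local unix socket (hypothetical helper)."""
    options = ["reuseaddr", "fork", f"cert={cert_file}", f"key={key_file}"]
    if ca_file is not None:
        # Require and verify client certificates against this CA.
        options += [f"cafile={ca_file}", "verify=1"]
    else:
        # Assumption: no client certificate verification.
        options.append("verify=0")
    listen = f"OPENSSL-LISTEN:{listen_port}," + ",".join(options)
    return ["socat", listen, f"UNIX-CONNECT:{unix_socket}"]


def run_relay(*args, **kwargs):
    """Start socat in the background and return the process handle."""
    return subprocess.Popen(socat_argv(*args, **kwargs))
```

With this in place, LMD1 keeps its `tls:` connection to the remote host, and the relay hands the decrypted stream straight to the Icinga2 livestatus socket, so no second LMD sits in the command path.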