Closed tmuncks closed 9 months ago
Okay. While this may be a start, other things need work for this to fully function with CheckMK. I get errors in certain situations, which I guess may be due to some datatype issue?
In lmd:
lmd | [2022-10-31 18:14:15.868][Warn][pid:1][filter:811] not implemented stringlist op: <
In CheckMK:
TypeError ('>' not supported between instances of 'str' and 'int')
I will try to see if I can figure this out, but really don't have a clue where to start at this point.
Well, I have no objections to merging this. But yes, the python outputformat is in fact simple json. I am not familiar enough with python to decide whether that's OK or not.
What does the `<` operator do in list context?
This patch does not fully solve the issue at this point. I was hopeful because all my devices showed up, but strange stuff is happening once CheckMK starts performing its queries.
In the case of the `<` operator, it appears as if a `state` entry in the result is returned as an empty string for whatever reason. I'm trying to figure out how to catch this and force-convert it into an int, but so far no luck (still figuring out golang and lmd).
As you mention, the `python3` outputformat should just be `json`, so what we need is basically a `python3` alias for the `json` outputformat. But on top of that, it appears I need a way to filter or force-convert certain fields.
If I could figure out a way to make this work, we would learn exactly what it takes to be compatible with CheckMK. And then we will probably better be able to address it properly.
Okay, so while I'm no closer to actually solving any of the remaining issues with CheckMK here, I believe I see what causes the problems.
For whatever reason, Livestatus data returned from LMD is sometimes the wrong type. In the case of the `<` operator error, LMD returns an integer field as an empty string, which causes the CheckMK code to fail. CheckMK will handle a missing value just fine, but it does not expect the `int` field to contain a `string`.
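The failure mode can be reproduced in a few lines of Python. The sketch below is illustrative only: `to_int` and `count_in_downtime` are hypothetical helpers, not CheckMK or LMD code, and the sample values are invented. It shows why an empty string in an integer column breaks a comparison, and one way a consumer could coerce such cells.

```python
def to_int(value, default=None):
    """Coerce a result cell to int; treat '' and None as missing."""
    if value is None or value == "":
        return default
    return int(value)


def count_in_downtime(depths):
    """Count cells with a downtime depth above zero, skipping missing ones."""
    count = 0
    for raw in depths:
        depth = to_int(raw)
        if depth is not None and depth > 0:
            count += 1
    return count


# Comparing the raw empty string with an int raises the TypeError
# reported in the first comment of this thread.
try:
    result = "" > 0
except TypeError as exc:
    print(exc)

# With coercion, the empty string is simply treated as a missing value.
print(count_in_downtime([0, "", 2, "1", None]))
```

This only papers over the symptom on the consumer side; the real fix would still be for the integer column to come back with an integer type.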
I have run a bunch of tests to try to figure out what is happening. The closest I've gotten is that `BuildLocalResponseData` perhaps converts certain empty data structures to an empty string?
For anyone willing to have a look, I have set up a dual CheckMK + LMD test system.
- `cmk1` is a standard CheckMK system that monitors `8.8.8.8` (and nothing else)
- `lmd1` is the LMD instance, configured to connect to livestatus on `cmk1`
- `cmk2` is a monitor-only CheckMK (no special config) that just pulls data via livestatus from `cmk1`, and it does so twice:
  - `cmk1`
  - `lmd1`
I can provide access to these systems as needed.
LMD is strongly typed, but of course only for known columns. The empty value is defined here: https://github.com/sni/lmd/blob/9cfc2f589c8a31cd7f7238d5d196a8f2cb0b5fd6/lmd/column.go#L303. So for string columns, the "empty" value is an empty string and for integers it is -1. If lmd does not know anything about a column, it will assume a string column, which has "" as its empty value. So I'd assume cmk requests an unknown integer column and gets confused by the response.
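The rule described above can be sketched as a small lookup. This is only an illustration of the described behaviour written in Python, not LMD's actual Go code; the `KNOWN_COLUMNS` table and the column names in it are hypothetical.

```python
# Empty value per column type, as described in the comment above.
EMPTY_BY_TYPE = {"string": "", "int": -1}

# Hypothetical table of columns the proxy knows about.
KNOWN_COLUMNS = {"name": "string", "state": "int"}


def empty_value(column):
    """Return the empty value for a column; unknown columns count as strings."""
    column_type = KNOWN_COLUMNS.get(column, "string")
    return EMPTY_BY_TYPE[column_type]


print(repr(empty_value("state")))        # known int column
print(repr(empty_value("some_new_col"))) # unknown, so treated as a string
```

Under this model, registering a previously unknown integer column changes its empty value from `""` to `-1`, which is a type-correct placeholder but still not the value the source site reports.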
Cool, thanks... I will try to figure out where these known columns are defined, and see if I can generate something that works well with CheckMK...
@sni I have found the places to add the supported options, and I can see the defaults changing, which is awesome. Some of the fields do not seem to be picked up properly from the source site, however, and I was wondering why that might be?
Previously, for an unknown field where the source livestatus returns `0`, LMD would return `""`.
After adding that particular field, the source livestatus still returns `0`, but now LMD returns `-1`. What might be the reason for this?
It obviously needs to reflect the value from the source site. Ideas?
CheckMK appears to do some filtering that requires that `<` string operator after all. I have no idea how to implement that.
lmd | [2022-11-24 17:39:11.315][Debug][pid:1][request:306] [172.18.10.1:40208->172.18.10.2:3333][r:1b54b6] request: GET hosts
lmd | [2022-11-24 17:39:11.315][Debug][pid:1][request:327] [172.18.10.1:40208->172.18.10.2:3333][r:1b54b6] request: Stats: state = 0
lmd | [2022-11-24 17:39:11.315][Debug][pid:1][request:327] [172.18.10.1:40208->172.18.10.2:3333][r:1b54b6] request: Stats: scheduled_downtime_depth = 0
lmd | [2022-11-24 17:39:11.315][Debug][pid:1][request:327] [172.18.10.1:40208->172.18.10.2:3333][r:1b54b6] request: StatsAnd: 2
lmd | [2022-11-24 17:39:11.315][Debug][pid:1][request:327] [172.18.10.1:40208->172.18.10.2:3333][r:1b54b6] request: Stats: scheduled_downtime_depth > 0
lmd | [2022-11-24 17:39:11.315][Debug][pid:1][request:327] [172.18.10.1:40208->172.18.10.2:3333][r:1b54b6] request: Stats: state = 2
lmd | [2022-11-24 17:39:11.315][Debug][pid:1][request:327] [172.18.10.1:40208->172.18.10.2:3333][r:1b54b6] request: Stats: scheduled_downtime_depth = 0
lmd | [2022-11-24 17:39:11.315][Debug][pid:1][request:327] [172.18.10.1:40208->172.18.10.2:3333][r:1b54b6] request: StatsAnd: 2
lmd | [2022-11-24 17:39:11.315][Debug][pid:1][request:327] [172.18.10.1:40208->172.18.10.2:3333][r:1b54b6] request: Stats: state = 1
lmd | [2022-11-24 17:39:11.315][Debug][pid:1][request:327] [172.18.10.1:40208->172.18.10.2:3333][r:1b54b6] request: Stats: scheduled_downtime_depth = 0
lmd | [2022-11-24 17:39:11.315][Debug][pid:1][request:327] [172.18.10.1:40208->172.18.10.2:3333][r:1b54b6] request: StatsAnd: 2
lmd | [2022-11-24 17:39:11.315][Debug][pid:1][request:327] [172.18.10.1:40208->172.18.10.2:3333][r:1b54b6] request: Filter: custom_variable_names < _REALNAME
lmd | [2022-11-24 17:39:11.315][Debug][pid:1][request:327] [172.18.10.1:40208->172.18.10.2:3333][r:1b54b6] request: Localtime: 1669311551
lmd | [2022-11-24 17:39:11.315][Debug][pid:1][request:779] [r:1b54b6] Ignoring Localtime as LMD works on unix timestamps only.
lmd | [2022-11-24 17:39:11.316][Debug][pid:1][request:327] [172.18.10.1:40208->172.18.10.2:3333][r:1b54b6] request: OutputFormat: python3
lmd | [2022-11-24 17:39:11.316][Debug][pid:1][request:327] [172.18.10.1:40208->172.18.10.2:3333][r:1b54b6] request: KeepAlive: on
lmd | [2022-11-24 17:39:11.316][Debug][pid:1][request:327] [172.18.10.1:40208->172.18.10.2:3333][r:1b54b6] request: ResponseHeader: fixed16
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
lmd | [2022-11-24 17:39:11.316][Warn][pid:1][filter:811] not implemented stringlist op: <
The warning is about string lists: https://github.com/sni/lmd/blob/master/lmd/filter.go#L770 Basically 'column < string' is not defined for string lists. What should that operator do? If it's an alias for "contains not", then it could simply be added here: https://github.com/sni/lmd/blob/master/lmd/filter.go#L787
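If `<` on a string list does mean "does not contain", the matching logic would be roughly the following. This is a sketch in Python under that assumption, not the LMD implementation; `match_string_list` is a hypothetical name, and treating `>=` as "contains" is likewise an assumption taken from how Livestatus list operators are usually described.

```python
def match_string_list(values, op, needle):
    """Evaluate a list filter such as 'custom_variable_names < _REALNAME'."""
    if op == ">=":
        # Assumed meaning: the list contains the value.
        return needle in values
    if op == "<":
        # Assumed meaning: the list does not contain the value.
        return needle not in values
    raise NotImplementedError(f"not implemented stringlist op: {op}")


# The filter from the log above, applied to two made-up hosts.
print(match_string_list(["_TAGS"], "<", "_REALNAME"))
print(match_string_list(["_REALNAME", "_TAGS"], "<", "_REALNAME"))
```

With that reading, the query in the log would select hosts that have no `_REALNAME` custom variable.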
This pull request is still open, but the discussion has drifted to another topic. @sni will you accept the pull request? I would strongly support that.
sure, sorry for the delay.
I needed to make a few minor changes to allow CheckMK to use lmd as a livestatus source.
CheckMK uses `OutputFormat: python3` and these tiny changes seemed to do the trick. It seems to work with these changes applied. That said, I am obviously slightly curious as to why I didn't need to perform any actual modifications to the data, but there you have it.
Please let me know if I'm missing something glaringly obvious. :)