Griesbacher / nagflux

A connector which copies performancedata from Nagios / Icinga(2) / Naemon to InfluxDB
GNU General Public License v2.0

Issue with sending data to InfluxDB #30

Closed misternobody closed 7 years ago

misternobody commented 7 years ago

Hey Philip,

I'm not sure whether this is a bug or expected behavior, but I see the following lines in the log:

2017-03-24 16:29:23 Critical: Connection type is unknown, options are: tcp, file. Input:
2017-03-24 16:29:23 Critical: Connection type is unknown, options are: tcp, file. Input:
2017-03-24 16:29:23 Critical: Connection type is unknown, options are: tcp, file. Input:

I've configured Nagflux to send data from Nagios to InfluxDB. The database was created, but it's empty. My main question is: do I need to have Livestatus installed, or can Nagflux work without it?

My current configuration is:

nagios-4.3.1-2.el7.centos.x86_64
influxdb-1.2.2-1.x86_64
Nagflux by Philip Griesbacher v0.4.0

Nagflux config is:

[main]
NagiosSpoolfileFolder = "/opt/nagios/var/spool/nagfluxperfdata"
NagiosSpoolfileWorker = 10
InfluxWorker = 10
MaxInfluxWorker = 50
FileBufferSize = 65536
DumpFile = "/opt/nagios/var/log/nagflux/nagflux.dump"
NagfluxSpoolfileFolder = "/opt/nagios/var/nagflux"
FieldSeparator = "&"
BufferSize = 1000
DefaultTarget = "all"

[Log]
LogFile = "/opt/logs/nagios/nagflux.log"
MinSeverity = "DEBUG"

[InfluxDBGlobal]
    CreateDatabaseIfNotExists = true
    NastyString = ""
    NastyStringToReplace = ""
    HostcheckAlias = "hostcheck"

[InfluxDB "nagios"]
    Enabled = true
    Version = 1.2
    Address = "http://127.0.0.1:8086"
    Arguments = "precision=ms&u=root&p=root&db=nagios"
    StopPullingDataIfDown = true

Thanks, Andrew

Griesbacher commented 7 years ago

Hi Andrew,

No, it is not a problem to run Nagflux without Livestatus; you just won't get information about downtimes, notifications and the like. But it's true that at the moment you'll receive a lot of error messages, because it tries to parse your config and there is no clean way to disable Livestatus, since I thought nobody would want that. You can add these lines to your config to satisfy Nagflux:

[Livestatus]
    Type = "tcp"
    Version = "Icinga2"

I'll add a switch to the config to disable Livestatus, so it's not necessary to add such hacks.

Best Regards, Philip
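
For anyone who wants to check a config before starting Nagflux, here is a minimal sketch of the workaround described above. It assumes, based only on the log message in this thread, that Nagflux complains whenever the Livestatus `Type` is anything other than `tcp` or `file`. The function names `parse_sections` and `livestatus_type_problem` are made up for this example and are not part of Nagflux, and the parser handles only the simple `[Section]` / `Key = "value"` layout shown here.

```python
import re

# Matches "[main]" as well as '[InfluxDB "nagios"]'.
SECTION_RE = re.compile(r'^\[(?P<name>[^\]"\s]+)(?:\s+"(?P<sub>[^"]*)")?\]$')
KEY_RE = re.compile(r'^(?P<key>\w+)\s*=\s*(?P<value>.*)$')


def parse_sections(text):
    """Parse the simple section/key layout used in this thread into nested dicts."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        match = SECTION_RE.match(line)
        if match:
            name = match.group("name")
            if match.group("sub") is not None:
                name = f'{name} "{match.group("sub")}"'
            current = sections.setdefault(name, {})
            continue
        match = KEY_RE.match(line)
        if match and current is not None:
            current[match.group("key")] = match.group("value").strip().strip('"')
    return sections


def livestatus_type_problem(text):
    """Return a warning string if the Livestatus Type is missing or unknown, else None.

    Assumption: only "tcp" and "file" are accepted, as the log message suggests.
    """
    livestatus = parse_sections(text).get("Livestatus", {})
    conn_type = livestatus.get("Type", "")
    if conn_type not in ("tcp", "file"):
        return f"Connection type is unknown, options are: tcp, file. Input: {conn_type}"
    return None
```

Run against a config without a `[Livestatus]` section, `livestatus_type_problem` returns the warning with an empty input, which matches the shape of the log lines in the first post; with the two workaround lines added it returns `None`.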

misternobody commented 7 years ago

Hi Philip,

Cool, thanks. But I still don't see any data in the "nagios" DB in InfluxDB.

Looks like Nagflux is reading perf files:

2017-03-27 10:43:46 Debug: Reading Directory: /opt/nagios/var/spool/nagfluxperfdata
2017-03-27 10:43:46 Debug: Reading file: /opt/nagios/var/spool/nagfluxperfdata/1490607824.perfdata.service
2017-03-27 10:43:51 Debug: Reading Directory: /opt/nagios/var/spool/nagfluxperfdata
2017-03-27 10:43:56 Debug: Reading Directory: /opt/nagios/var/spool/nagfluxperfdata

But there is nothing in the InfluxDB logs or in the DB.

Best Regards, Andrew

misternobody commented 7 years ago

Under /var/lib/influxdb/data/ I can only see "_internal". As I understand it, a "nagios" dir should be there as well.
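
Besides looking at the data directory, the database can be asked directly whether anything has arrived. The sketch below is only an illustration: it uses the `/query` endpoint of the InfluxDB 1.x HTTP API with its `db` and `q` parameters, takes the address and database name from the config posted above, and assumes authentication is disabled (otherwise `u` and `p` would have to be added to the parameters). The helper names are invented for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, database, query):
    """Build an InfluxDB 1.x /query URL for the given database and InfluxQL statement."""
    params = urlencode({"db": database, "q": query})
    return f"{base_url.rstrip('/')}/query?{params}"


def extract_measurements(payload):
    """Pull measurement names out of a decoded SHOW MEASUREMENTS response."""
    names = []
    for result in payload.get("results", []):
        for series in result.get("series", []):
            names.extend(row[0] for row in series.get("values", []))
    return names


def list_measurements(base_url, database):
    """Query a running InfluxDB and return its measurement names (needs network access)."""
    url = build_query_url(base_url, database, "SHOW MEASUREMENTS")
    with urlopen(url, timeout=5) as response:
        return extract_measurements(json.load(response))


if __name__ == "__main__":
    # Address and database name as used in the config earlier in this thread.
    print(list_measurements("http://127.0.0.1:8086", "nagios"))
```

An empty list here means the database exists but holds no measurements, which is the same thing an empty `SHOW MEASUREMENTS` in the `influx` shell would show.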

Griesbacher commented 7 years ago

Yes, there should be another folder; it seems like the database wasn't even created. This should happen even if there is no perfdata. You could do the following:

aj-jester commented 7 years ago

@Griesbacher I can verify I'm seeing this same issue on Nagios 4 and Nagflux. The database gets created and it keeps reading the perfdata but nothing actually gets into Influx database. Let me know if you require any further info than what I provided below and I'll be glad to send it your way. Thanks for your help.

[root@nagios-test-box nagios]# influx
Connected to http://localhost:8086 version 1.2.1
InfluxDB shell version: 1.2.1
> use nagflux
Using database nagflux
> SHOW MEASUREMENTS
>

CentOS 6 nagios-4.2.2-4.el6 nagflux-0.4.0

[main]
NagiosSpoolfileFolder = "/var/spool/nagfluxperfdata"
NagiosSpoolfileWorker = 10
InfluxWorker = 5
MaxInfluxWorker = 10
FileBufferSize = 65536
DumpFile = "/var/spool/nagflux/nagflux.dump"
NagfluxSpoolfileFolder = "/var/spool/nagflux"
FieldSeparator = "&"
BufferSize = 100000
DefaultTarget = "all"

[Log]
#leave empty for stdout
LogFile = ""
#List of Severities https://godoc.org/github.com/kdar/factorlog#Severity
MinSeverity = "DEBUG"

[Livestatus]
    # tcp or file
    Type = "file"
    # tcp: 127.0.0.1:6557 or file /var/run/live
    Address = "/var/lib/nagios/rw/live"
    # The number of minutes to wait for livestatus to come up, if set to 0 the detection is disabled
    MinutesToWait = 2
    # Set the Version of Livestatus. Allowed are Nagios, Icinga2, Naemon.
    # If left empty Nagflux will try to detect it on its own, which will not always work.
    Version = "Nagios"

[InfluxDBGlobal]
    CreateDatabaseIfNotExists = true
    NastyString = ""
    NastyStringToReplace = ""
    HostcheckAlias = "hostcheck"

[InfluxDB "nagflux"]
    Enabled = true
    Version = 1.2.1
    Address = "http://127.0.0.1:8086"
    Arguments = "precision=ms&db=nagflux"
    StopPullingDataIfDown = true

2017-03-28 00:01:04 Debug: Reading file: /var/spool/nagfluxperfdata/1490659261-host-perfdata
2017-03-28 00:01:09 Debug: Reading Directory: /var/spool/nagfluxperfdata
2017-03-28 00:01:14 Debug: Reading Directory: /var/spool/nagfluxperfdata
2017-03-28 00:01:14 Debug: Reading file: /var/spool/nagfluxperfdata/1490659271-service-perfdata
2017-03-28 00:01:14 Debug: Reading file: /var/spool/nagfluxperfdata/1490659271-host-perfdata
2017-03-28 00:01:19 Debug: Reading Directory: /var/spool/nagfluxperfdata
2017-03-28 00:01:24 Debug: Reading Directory: /var/spool/nagfluxperfdata
2017-03-28 00:01:24 Debug: Reading file: /var/spool/nagfluxperfdata/1490659281-service-perfdata
2017-03-28 00:01:24 Debug: Reading file: /var/spool/nagfluxperfdata/1490659281-host-perfdata
2017-03-28 00:01:29 Debug: Reading Directory: /var/spool/nagfluxperfdata
2017-03-28 00:01:34 Debug: Reading Directory: /var/spool/nagfluxperfdata
2017-03-28 00:01:34 Debug: Reading file: /var/spool/nagfluxperfdata/1490659291-service-perfdata
2017-03-28 00:01:34 Debug: Reading file: /var/spool/nagfluxperfdata/1490659291-host-perfdata
2017-03-28 00:01:39 Debug: Reading Directory: /var/spool/nagfluxperfdata
2017-03-28 00:01:44 Debug: Reading Directory: /var/spool/nagfluxperfdata
2017-03-28 00:01:44 Debug: Reading file: /var/spool/nagfluxperfdata/1490659301-service-perfdata
2017-03-28 00:01:44 Debug: Reading file: /var/spool/nagfluxperfdata/1490659301-host-perfdata

@misternobody The database gets created fine, check via influx cli.

[root@nagios-test-box nagios]# ls -l /var/lib/influxdb/data/
total 4
drwx------. 3 influxdb influxdb 4096 Mar 27 23:05 _internal

[root@nagios-test-box nagios]# influx
Connected to http://localhost:8086 version 1.2.1
InfluxDB shell version: 1.2.1
> show databases
name: databases
name
----
_internal
nagflux

>

Griesbacher commented 7 years ago

@aj-jester Did it work with your setup with an older version of Nagflux?

misternobody commented 7 years ago

@Griesbacher @aj-jester Yep, the database was created successfully. I checked the InfluxDB logs and saw requests from the Admin interface, but nothing from Nagflux. So it looks like Nagflux is sending data to InfluxDB incorrectly.

I've tried 3 different versions of InfluxDB: 0.9.5, 0.9.6 and 1.2.1, but without any success.

Griesbacher commented 7 years ago

It should basically work; at least here is some proof ;) https://circleci.com/gh/Griesbacher/nagflux/245 I'm still guessing there is a misconfiguration which I'm not seeing at the moment.

aj-jester commented 7 years ago

@Griesbacher I tried it with v0.3.1 and still got no data. I'm curious though: in your tests, is the example data in a format like this? This is the default format Nagios uses, btw.

[HOSTPERFDATA]  1490705601      localhost       4.019   PING OK - Packet loss = 0%, RTA = 0.04 ms    rta=0.041000ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0
[HOSTPERFDATA]  1490705901      localhost       4.011   PING OK - Packet loss = 0%, RTA = 0.03 ms    rta=0.032000ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0
[HOSTPERFDATA]  1490706201      localhost       4.005   PING OK - Packet loss = 0%, RTA = 0.03 ms    rta=0.032000ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0
[HOSTPERFDATA]  1490706501      localhost       4.004   PING OK - Packet loss = 0%, RTA = 0.03 ms    rta=0.030000ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0

# HOST AND SERVICE PERFORMANCE DATA FILE TEMPLATES
# These options determine what data is written (and how) to the
# performance data files.  The templates may contain macros, special
# characters (\t for tab, \r for carriage return, \n for newline)
# and plain text.  A newline is automatically added after each write
# to the performance data file.  Some examples of what you can do are
# shown below.

#host_perfdata_file_template=[HOSTPERFDATA]\t$TIMET$\t$HOSTNAME$\t$HOSTEXECUTIONTIME$\t$HOSTOUTPUT$\t$HOSTPERFDATA$
#service_perfdata_file_template=[SERVICEPERFDATA]\t$TIMET$\t$HOSTNAME$\t$SERVICEDESC$\t$SERVICEEXECUTIONTIME$\t$SERVICELATENCY$\t$SERVICEOUTPUT$\t$SERVICEPERFDATA$

aj-jester commented 7 years ago

@Griesbacher I went through your tests and figured out it's a formatting issue. It seems Nagflux is expecting Icinga's format 😄 https://docs.icinga.com/icinga2/latest/doc/module/icinga2/chapter/addons

2017-03-28 13:53:53 Debug: metrics,host=localhost,service=Swap\ Usage,command=check_local_swap,performanceLabel=swap,warn-fill=none,crit-fill=none,unit=MB crit=0.0,min=0.0,max=991.0,value=981.0,warn=0.0 1490708747000
metrics,host=localhost,service=Current\ Users,command=check_local_users,performanceLabel=users,crit-fill=none,warn-fill=none value=2.0,warn=20.0,crit=50.0,min=0.0 1490708860000
2017-03-28 13:53:53 Debug: 204 No Content
2017-03-28 13:53:53 Debug: 204 No Content
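
To make the debug output above easier to read, here is a rough sketch of how a Nagios performance data string could be turned into InfluxDB line protocol of a similar shape. It is not Nagflux's actual code: the regular expression covers only the common `label=value[unit];warn;crit;min;max` form from the Nagios plugin guidelines, the tag set is reduced to a few of the tags visible in the log, and the function names and the sample input are invented for this example. Timestamps are multiplied by 1000 to match `precision=ms` from the configs in this thread.

```python
import re

# label=value[unit][;warn][;crit][;min][;max], labels may be single-quoted.
PERF_RE = re.compile(
    r"(?P<label>'[^']+'|[^\s=]+)="
    r"(?P<value>-?\d+(?:\.\d+)?)"
    r"(?P<unit>[^\s;]*)"
    r"(?:;(?P<warn>[^\s;]*))?"
    r"(?:;(?P<crit>[^\s;]*))?"
    r"(?:;(?P<min>[^\s;]*))?"
    r"(?:;(?P<max>[^\s;]*))?"
)


def escape_tag(value):
    """Escape commas, equals signs and spaces, as line protocol requires for tags."""
    return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def to_line_protocol(host, service, perfdata, timestamp_s):
    """Return one line-protocol string per performance label in `perfdata`."""
    lines = []
    for match in PERF_RE.finditer(perfdata):
        tags = {
            "host": host,
            "service": service,
            "performanceLabel": match.group("label").strip("'"),
        }
        if match.group("unit"):
            tags["unit"] = match.group("unit")
        fields = {"value": float(match.group("value"))}
        for name in ("warn", "crit", "min", "max"):
            raw = match.group(name)
            if raw:
                try:
                    fields[name] = float(raw)
                except ValueError:
                    pass  # threshold ranges such as "10:20" are skipped in this sketch
        tag_str = ",".join(f"{key}={escape_tag(val)}" for key, val in tags.items())
        field_str = ",".join(f"{key}={val}" for key, val in fields.items())
        lines.append(f"metrics,{tag_str} {field_str} {timestamp_s * 1000}")
    return lines


if __name__ == "__main__":
    # Hypothetical perfdata string, chosen to resemble the swap values in the log above.
    for line in to_line_protocol("localhost", "Swap Usage", "swap=981MB;0;0;0;991", 1490708747):
        print(line)
```

The `204 No Content` lines in the log are what InfluxDB 1.x returns for a successful write, so seeing them after the `metrics,...` lines is a good sign that data is being accepted.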

If I make a PR with documentation on how to configure this on the Nagios end, would you accept it?

@misternobody In your nagios.cfg use the following settings and you should see data flow into influx.

host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$
service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$
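
The difference between the two templates can be illustrated with a small sketch. It assumes, based on the templates above, that each spool line is a tab-separated list of `KEY::VALUE` pairs; which keys Nagflux actually requires is not stated in this thread, so the `REQUIRED` tuple and the function names are guesses made for illustration only, and `check-host-alive` is just a placeholder command name.

```python
# Keys assumed to be needed for a usable record (an assumption, not taken from Nagflux).
REQUIRED = ("DATATYPE", "TIMET", "HOSTNAME")


def parse_spool_line(line):
    """Split a tab-separated perfdata line into a dict of KEY::VALUE pairs."""
    fields = {}
    for chunk in line.rstrip("\n").split("\t"):
        key, separator, value = chunk.partition("::")
        if separator:  # chunks without "::" carry no key and are ignored
            fields[key] = value
    return fields


def is_usable(fields):
    """Return True if every key in REQUIRED is present."""
    return all(key in fields for key in REQUIRED)


if __name__ == "__main__":
    keyed = (
        "DATATYPE::HOSTPERFDATA\tTIMET::1490705601\tHOSTNAME::localhost\t"
        "HOSTPERFDATA::rta=0.041000ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0\t"
        "HOSTCHECKCOMMAND::check-host-alive"
    )
    default = (
        "[HOSTPERFDATA]\t1490705601\tlocalhost\t4.019\t"
        "PING OK - Packet loss = 0%, RTA = 0.04 ms\t"
        "rta=0.041000ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0"
    )
    print(is_usable(parse_spool_line(keyed)))    # keyed template: all assumed keys found
    print(is_usable(parse_spool_line(default)))  # default Nagios template: no keys at all
```

A line written with the default Nagios template contains no `::` separators, so a key-based reader finds nothing to work with. That would be consistent with the symptom in this thread: files are read, but nothing is written to InfluxDB.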

misternobody commented 7 years ago

@aj-jester you are awesome! :) @Griesbacher thanks for your help with it. Now I can see data in InfluxDB and have finally integrated it with Grafana!

Griesbacher commented 7 years ago

@aj-jester If you would like to add a section about the core config to the README.md, that would be nice! I didn't think about this because we only use OMD, where this is "standardized". That's why I always recommend using it; we have put a lot of effort into avoiding such errors ;) But I'm glad everything could be solved.

aj-jester commented 7 years ago

@Griesbacher No problem at all! Done 👍