Griesbacher / nagflux

A connector which copies performancedata from Nagios / Icinga(2) / Naemon to InfluxDB
GNU General Public License v2.0

panic: runtime error: index out of range #14

Closed d-zalewski closed 7 years ago

d-zalewski commented 8 years ago

Nagflux crashes and fails to process host perfdata. Occasionally I'm getting:

doesn't contain all of these fields: [table time]

servmon1:root:/etc/nagflux> go version
go version go1.5.1 linux/amd64

Nagios 4.1.1

I've attached my nagflux config and the host perfdata file it failed to process.

config.gcfg.txt host-perfdata.1473250552.txt

Nagios perfdata definition and process command:

perfdata.cfg.txt

2016-09-07 13:15:59 Info: Nagios Spoolfile Folder: /var/spool
2016-09-07 13:15:59 Info: Nagflux Spoolfile Folder: /var/spool/nagflux
panic: runtime error: index out of range

goroutine 68 [running]:
github.com/griesbacher/nagflux/collector/nagflux.FileCollector.parseFile(0xc82023e070, 0xc8200d1b60, 0xc82000f4c0, 0x12, 0xc8200d1b30, 0x26, 0xc820166570, 0x22, 0x0, 0x0, ...)
        /root/gorepo/src/github.com/griesbacher/nagflux/collector/nagflux/nagfluxFileCollector.go:90 +0x149b
github.com/griesbacher/nagflux/collector/nagflux.FileCollector.run(0xc82023e070, 0xc8200d1b60, 0xc82000f4c0, 0x12, 0xc8200d1b30, 0x26)
        /root/gorepo/src/github.com/griesbacher/nagflux/collector/nagflux/nagfluxFileCollector.go:54 +0x23c
created by github.com/griesbacher/nagflux/collector/nagflux.NewNagfluxFileCollector
        /root/gorepo/src/github.com/griesbacher/nagflux/collector/nagflux/nagfluxFileCollector.go:34 +0x108

goroutine 1 [runnable]:
main.main()
        /root/gorepo/src/github.com/griesbacher/nagflux/main.go:128 +0x14c7

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1696 +0x1

goroutine 5 [syscall]:
os/signal.loop()
        /usr/lib/golang/src/os/signal/signal_unix.go:22 +0x18
created by os/signal.init.1
        /usr/lib/golang/src/os/signal/signal_unix.go:28 +0x37

goroutine 19 [select]:
github.com/griesbacher/nagflux/target/influx.Worker.run(0x1, 0xc820176180, 0xc8201682a0, 0xc8200db4a0, 0xc82006acd0, 0x41, 0xc82016c320, 0x13, 0x7fc7588742f0, 0xc82016c300, ...)
        /root/gorepo/src/github.com/griesbacher/nagflux/target/influx/Worker.go:84 +0x53a
created by github.com/griesbacher/nagflux/target/influx.WorkerGenerator.func1
        /root/gorepo/src/github.com/griesbacher/nagflux/target/influx/Worker.go:63 +0x4fa

goroutine 18 [select]:
github.com/griesbacher/nagflux/target/influx.Worker.run(0x0, 0xc820176120, 0xc820168230, 0xc8200db4a0, 0xc82006acd0, 0x41, 0xc82016c2e0, 0x13, 0x7fc7588742f0, 0xc82016c300, ...)
        /root/gorepo/src/github.com/griesbacher/nagflux/target/influx/Worker.go:84 +0x53a
created by github.com/griesbacher/nagflux/target/influx.WorkerGenerator.func1
        /root/gorepo/src/github.com/griesbacher/nagflux/target/influx/Worker.go:63 +0x4fa

goroutine 20 [chan receive]:
github.com/griesbacher/nagflux/target/influx.(*Connector).run(0xc820120000)
        /root/gorepo/src/github.com/griesbacher/nagflux/target/influx/Connector.go:140 +0x4c
created by github.com/griesbacher/nagflux/target/influx.ConnectorFactory
        /root/gorepo/src/github.com/griesbacher/nagflux/target/influx/Connector.go:86 +0xbeb

goroutine 35 [select]:
github.com/griesbacher/nagflux/collector/livestatus.Collector.run(0xc82015e0e0, 0xc8200d1b60, 0xc820150210, 0xc8200d1b30, 0x8d17c0, 0x73)
        /root/gorepo/src/github.com/griesbacher/nagflux/collector/livestatus/Collector.go:100 +0x1b9
created by github.com/griesbacher/nagflux/collector/livestatus.NewLivestatusCollector
        /root/gorepo/src/github.com/griesbacher/nagflux/collector/livestatus/Collector.go:85 +0x257

goroutine 36 [select]:
github.com/griesbacher/nagflux/collector/livestatus.(*CacheBuilder).run(0xc820150540, 0x6fc23ac00)
        /root/gorepo/src/github.com/griesbacher/nagflux/collector/livestatus/CacheBuilder.go:68 +0x228
created by github.com/griesbacher/nagflux/collector/livestatus.NewLivestatusCacheBuilder
        /root/gorepo/src/github.com/griesbacher/nagflux/collector/livestatus/CacheBuilder.go:50 +0x180

goroutine 66 [select]:
github.com/griesbacher/nagflux/collector/spoolfile.(*NagiosSpoolfileWorker).run(0xc820228080)
        /root/gorepo/src/github.com/griesbacher/nagflux/collector/spoolfile/nagiosSpoolfileWorker.go:72 +0xc95
created by github.com/griesbacher/nagflux/collector/spoolfile.NagiosSpoolfileWorkerGenerator.func1
        /root/gorepo/src/github.com/griesbacher/nagflux/collector/spoolfile/nagiosSpoolfileWorker.go:56 +0x79

goroutine 67 [syscall]:
syscall.Syscall(0xd9, 0x5, 0xc820245000, 0x1000, 0x10, 0x441bc5, 0x715c40)
        /usr/lib/golang/src/syscall/asm_linux_amd64.s:18 +0x5
syscall.Getdents(0x5, 0xc820245000, 0x1000, 0x1000, 0x64, 0x0, 0x0)
        /usr/lib/golang/src/syscall/zsyscall_linux_amd64.go:508 +0x5f
syscall.ReadDirent(0x5, 0xc820245000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
        /usr/lib/golang/src/syscall/syscall_linux.go:770 +0x4d
os.(*File).readdirnames(0xc820238008, 0xffffffffffffffff, 0xc820256000, 0x0, 0x64, 0x0, 0x0)
        /usr/lib/golang/src/os/dir_unix.go:39 +0x215
os.(*File).Readdirnames(0xc820238008, 0xffffffffffffffff, 0x0, 0x0, 0x0, 0x0, 0x0)
        /usr/lib/golang/src/os/doc.go:134 +0x85
os.(*File).readdir(0xc820238008, 0xffffffffffffffff, 0x0, 0x0, 0x0, 0x0, 0x0)
        /usr/lib/golang/src/os/file_unix.go:179 +0xb3
os.(*File).Readdir(0xc820238008, 0xffffffffffffffff, 0x0, 0x0, 0x0, 0x0, 0x0)
        /usr/lib/golang/src/os/doc.go:115 +0x85
io/ioutil.ReadDir(0xc8200b4db0, 0xa, 0x0, 0x0, 0x0, 0x0, 0x0)
        /usr/lib/golang/src/io/ioutil/ioutil.go:105 +0xcc
github.com/griesbacher/nagflux/collector/spoolfile.(*NagiosSpoolfileCollector).run(0xc820228040)
        /root/gorepo/src/github.com/griesbacher/nagflux/collector/spoolfile/nagiosSpoolfileCollector.go:61 +0x2d3
created by github.com/griesbacher/nagflux/collector/spoolfile.NagiosSpoolfileCollectorFactory
        /root/gorepo/src/github.com/griesbacher/nagflux/collector/spoolfile/nagiosSpoolfileCollector.go:38 +0x21b

goroutine 69 [select, locked to thread]:
runtime.gopark(0x8ce7e0, 0xc8201adf28, 0x825770, 0x6, 0x42f118, 0x2)
        /usr/lib/golang/src/runtime/proc.go:185 +0x163
runtime.selectgoImpl(0xc8201adf28, 0x0, 0x18)
        /usr/lib/golang/src/runtime/select.go:392 +0xa64
runtime.selectgo(0xc8201adf28)
        /usr/lib/golang/src/runtime/select.go:212 +0x12
runtime.ensureSigM.func1()
        /usr/lib/golang/src/runtime/signal1_unix.go:227 +0x353
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_amd64.s:1696 +0x1

goroutine 70 [chan receive]:
main.main.func2(0xc820212180, 0xc82000fec0, 0xc820150240, 0xc820150540, 0xc820228040, 0xc82022a0c0, 0xc8200d1b60)
        /root/gorepo/src/github.com/griesbacher/nagflux/main.go:119 +0x48
created by main.main
        /root/gorepo/src/github.com/griesbacher/nagflux/main.go:124 +0x13fe

Griesbacher commented 8 years ago

If the file referred to in the log message is the file you uploaded, it seems the perfdata file is being interpreted as a nagflux file. Maybe the file is stored in the wrong folder. You could try to separate the NagiosSpoolfileFolder and the NagfluxSpoolfileFolder, which are currently nested within each other.
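
The nesting described above can be checked mechanically. The following Go sketch is not part of nagflux; the helper names isInside and foldersOverlap are made up for illustration. It only compares the two paths from the startup log (/var/spool and /var/spool/nagflux) and does not resolve symlinks.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isInside reports whether child is the same directory as parent or lies
// somewhere below it. Both paths are cleaned first, so trailing slashes and
// "." segments do not matter. Symlinks are not resolved.
func isInside(parent, child string) bool {
	rel, err := filepath.Rel(filepath.Clean(parent), filepath.Clean(child))
	if err != nil {
		return false
	}
	if rel == "." {
		return true
	}
	// A relative path that starts with ".." leaves the parent directory.
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// foldersOverlap reports whether either folder contains the other.
func foldersOverlap(a, b string) bool {
	return isInside(a, b) || isInside(b, a)
}

func main() {
	nagiosSpool := "/var/spool"          // "Nagios Spoolfile Folder" from the log
	nagfluxSpool := "/var/spool/nagflux" // "Nagflux Spoolfile Folder" from the log
	if foldersOverlap(nagiosSpool, nagfluxSpool) {
		fmt.Println("spool folders overlap; use two separate directories")
	}
}
```

With the two folders from the log this prints the warning. Two sibling directories (for example a hypothetical /var/spool/nagios next to /var/spool/nagflux) would not trigger it.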

d-zalewski commented 8 years ago

I've changed the folder location as suggested, and I can see nagflux picks up the data, but I'm getting the following errors back from InfluxDB.

I have golang-github-influxdb-influxdb-client-0.8.5-0.1.git9485e99.el6.x86_64 installed, and the InfluxDB server is running 0.13. Do I need a newer version of the client?

2016-09-08 11:53:08 Warn: Influx status: 400 Bad Request - {"error":"partial write:\nunable to parse 'metrics,host=SRV1,service=Disk\\ Usage\\ -\\ Other\\ Drives,command=check_nrpe,performanceLabel=D:\\\\,warn-fill=none,crit-fill=none,unit=G value=9.72899,warn=28.498,crit=29.398,min=0.0,max=29.998 1473331470000': invalid tag format\nunable to parse 'metrics,host=SRV2,service=Disk\\ Usage\\ -\\ Other\\ Drives,command=check_nrpe,performanceLabel=E:\\\\,warn-fill=none,crit-fill=none,unit=G max=99.997,value=80.906,warn=94.997,crit=97.997,min=0.0 1473331470000': invalid tag format\nunable to parse 'metrics,host=SRV2,service=Disk\\ Usage\\ -\\ Other\\ Drives,command=check_nrpe,performanceLabel=F:\\\\,crit-fill=none,warn-fill=none,unit=G value=3.198,warn=18.997,crit=19.597,min=0.0,max=19.997 1473331470000': invalid tag format\nunable to parse 'metrics,host=SRV3,service=Disk\\ Usage\\ -\\ Other\\ Drives,command=check_nrpe,performanceLabel=E:\\\\,warn-fill=none,crit-fill=none,unit=M warn=9726.096,crit=10033.236,min=0.0,max=10237.996,value=160.063 1473331470000': invalid tag format\nunable to parse 'metrics,host=SRV4,service=Disk\\ Usage\\ -\\ Other\\ Drives,command=check_nrpe,performanceLabel=D:\\\\,warn-fill=none,crit-fill=none,unit=G value=2.955,warn=56.998,crit=58.798,min=0.0,max=59.998 1473331470000': invalid tag format\nunable to parse 'met

Also in the nagflux error dump file I got:

One of the values is not clean..
metrics,host=SRV1,service=Disk\ Usage\ -\ Other\ Drives,command=check_nrpe,performanceLabel=E:\\,warn-fill=none,crit-fill=none,unit=M value=171.457,warn=9726.096,crit=10033.236,min=0.0,max=10237.996 1473256540000
metrics,host=SRV2,service=Disk\ Usage\ -\ Other\ Drives,command=check_nrpe,performanceLabel=D:\\,crit-fill=none,warn-fill=none,unit=G crit=58.798,min=0.0,max=59.998,value=14.575,warn=56.998 1473256540000
metrics,host=SRV3,service=Disk\ Usage\ -\ Other\ Drives,command=check_nrpe,performanceLabel=E:\\,warn-fill=none,crit-fill=none,unit=G warn=56.997,crit=58.797,min=0.0,max=59.997,value=52.464 1473256540000
metrics,host=SRV4,service=Disk\ Usage\ -\ Other\ Drives,command=check_nrpe,performanceLabel=D:\\,crit-fill=none,warn-fill=none,unit=M warn=58366.096,crit=60209.236,min=0.0,max=61437.996,value=511.996 1473256540000

Griesbacher commented 8 years ago

By the way, the InfluxDB Go client is not necessary for nagflux. This is an InfluxDB bug: the current InfluxDB cannot handle a backslash as the last character in a tag, e.g. 'performanceLabel=E:\'. See https://github.com/influxdata/influxdb/issues/6672.

But the rest of your data should be in the database.
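
Until the server accepts such tags, one possible workaround is to clean the tag value before it is written. The Go sketch below is only an illustration under that assumption; sanitizeTagValue is a hypothetical helper, not a nagflux function. It drops trailing backslashes and then escapes the characters that InfluxDB line protocol treats as special in tag values (comma, equals sign, space). Note that it is lossy: the label E:\ is stored as E:.

```go
package main

import (
	"fmt"
	"strings"
)

// tagEscaper escapes the characters that InfluxDB line protocol treats as
// special inside tag values: comma, equals sign, and space.
var tagEscaper = strings.NewReplacer(",", `\,`, "=", `\=`, " ", `\ `)

// sanitizeTagValue is a hypothetical helper, not part of nagflux. It removes
// all trailing backslashes and then escapes the remaining special characters.
func sanitizeTagValue(v string) string {
	v = strings.TrimRight(v, `\`)
	return tagEscaper.Replace(v)
}

func main() {
	fmt.Println(sanitizeTagValue(`E:\`))                       // prints E:
	fmt.Println(sanitizeTagValue("Disk Usage - Other Drives")) // prints Disk\ Usage\ -\ Other\ Drives
}
```

If the drive letter must stay distinguishable from other labels, replacing the trailing backslash with another character instead of dropping it would be an equally simple variant.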

d-zalewski commented 8 years ago

Apparently it was fixed and merged to master on 26th of May.

https://github.com/influxdata/influxdb/commit/6f25d97de4cc1328241807423f0467bc5ef03e55

It's included in the changelog for 1.0: https://github.com/influxdata/influxdb/blob/master/CHANGELOG.md

I just tried 1.0 GA from https://dl.influxdata.com/influxdb/releases/influxdb-1.0.0.x86_64.rpm and I'm still getting the same errors.

The backslash issue was fixed back in 0.9.3: https://github.com/influxdata/influxdb/issues/3704

metrics,host=srv1,service=Disk\ Usage\ -\ Other\ Drives,command=check_nrpe,performanceLabel=E:\\,warn-fill=none,crit-fill=none,unit=G value=65.712,warn=104.497,crit=107.797,min=0.0,max=109.997 1473331480000

In addition to that I'm getting:

2016-09-08 15:46:25 Warn: NagiosSpoolfileCollector: Could not write to buffer
2016-09-08 15:46:25 Warn: NagiosSpoolfileWorker: Could not write to buffer

2016-09-08 15:45:47 Warn: Post http://10.x.x.x:8086/write?precision=ms&db=nagflux: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2016-09-08 15:45:47 Warn: Post http://10.x.x.x:8086/write?precision=ms&db=nagflux: net/http: request canceled (Client.Timeout exceeded while awaiting headers)

Griesbacher commented 7 years ago

I referenced the wrong InfluxDB issue. In your case the problem is not trailing whitespace but the trailing backslash, and that bug is still open: https://github.com/influxdata/influxdb/issues/6008.

Regarding your second problem with the timeouts, I can't tell you much. It seems that your connection to InfluxDB is too slow, and that could have many causes.

Griesbacher commented 7 years ago

I'll close this one due to inactivity.