apache / trafficcontrol

Apache Traffic Control is an Open Source implementation of a Content Delivery Network
https://trafficcontrol.apache.org/
Apache License 2.0

MonitorConfigPoller panic: runtime error: invalid memory address or nil pointer dereference #2250

Closed smalenfant closed 5 years ago

smalenfant commented 6 years ago

Traffic Monitor crashed and exited. I don't have many details, except that procNetDev was not being returned in the astats JSON output.

From traffic_monitor.log:

ERROR: cache.go:325: 2018-04-16T17:08:34.339643848Z: precomputeAstats psp6cdedge03 handle precomputing outbytes 'procNetDev empty'
ERROR: stat.go:87: 2018-04-16T17:08:34.351132976Z: stat poll getting vitals for psp6cdedge03: Error parsing procnetdev: no fields found
ERROR: cache.go:325: 2018-04-16T17:08:42.098836427Z: precomputeAstats psp6cdedge03 handle precomputing outbytes 'procNetDev empty'
ERROR: stat.go:87: 2018-04-16T17:08:42.10609049Z: stat poll getting vitals for psp6cdedge03: Error parsing procnetdev: no fields found
ERROR: asm_amd64.s:2197: 2018-04-18T01:31:26.283370864Z: MonitorConfigPoller: getting monitor config map: Get https://cdn1cdcms0001.coxlab.net/api/1.2/cdns/cdn1/configs/monitoring.json: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
 <nil>
ERROR: asm_amd64.s:514: 2018-04-28T05:30:01.733534769Z: MonitorConfigPoller panic: runtime error: invalid memory address or nil pointer dereference

rob05c commented 6 years ago

We've seen this before, but very rarely, and only in the lab.

This change was added to dump the stack trace when the panic happens, so we can tell where it occurs: https://github.com/apache/incubator-trafficcontrol/pull/1894 https://github.com/apache/incubator-trafficcontrol/commit/8227159

But we haven't seen it since that was added. If possible, can you apply that changeset? If you can get it to panic with that changeset, it should tell us where the panic is so we can fix it.

mitchell852 commented 6 years ago

@smalenfant ^^

rob05c commented 6 years ago

Hopefully fixed by https://github.com/apache/trafficcontrol/pull/2377 but confirmation would be good.

rob05c commented 5 years ago

I'm closing this, assuming #2377 fixed it. Feel free to reopen if you see this issue again.