kubernetes-retired / heapster

[EOL] Compute Resource Usage Analysis and Monitoring of Container Clusters
Apache License 2.0
2.63k stars 1.25k forks source link

NPE on Graphite sink #2024

Closed integrii closed 6 years ago

integrii commented 6 years ago

Description

It looks like when heapster went to startup, it was unable to setup a graphite sink, which resulted in an NPE.

I0425 21:24:15.289715       1 heapster.go:78] /heapster -v=3 --source=kubernetes:https://kubernetes.default --sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086 --sink=graphite:tcp://relay.asg.company.com:2003?prefix=k8s.asg --sink=graphite:tcp://relay.asg.company.com:2003?prefix=k8s.asg.default
I0425 21:24:15.289743       1 heapster.go:79] Heapster version v1.5.2
I0425 21:24:15.289884       1 configs.go:61] Using Kubernetes client with master "https://kubernetes.default" and version v1
I0425 21:24:15.289916       1 configs.go:62] Using kubelet port 10255
I0425 21:24:15.320675       1 reflector.go:187] Starting reflector *v1.Node (1h0m0s) from k8s.io/heapster/metrics/util/util.go:30
I0425 21:24:15.320803       1 reflector.go:236] Listing and watching *v1.Node from k8s.io/heapster/metrics/util/util.go:30
I0425 21:24:20.328314       1 influxdb.go:312] created influxdb sink with options: host:monitoring-influxdb.kube-system.svc:8086 user:root db:k8s
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x15f7423]

goroutine 1 [running]:
k8s.io/heapster/metrics/sinks.(*SinkFactory).BuildAll(0xc420749cd8, 0xc4202b1d40, 0x3, 0x4, 0x0, 0x0, 0x7f393be10000, 0x0, 0x0, 0x0, ...)
    /go/src/k8s.io/heapster/metrics/sinks/factory.go:90 +0x563
main.createAndInitSinksOrDie(0xc4202b1d40, 0x3, 0x4, 0x0, 0x0, 0x4a817c800, 0x100, 0x0, 0xc420435480, 0x0, ...)
    /go/src/k8s.io/heapster/metrics/heapster.go:194 +0x8d
main.main()
    /go/src/k8s.io/heapster/metrics/heapster.go:89 +0x458

I humbly suggest a quick review of this code in the stack trace to ensure that NPEs do not happen and errors are gracefully handled.

Output of heapster --version:

1.5.2
integrii commented 6 years ago

I explicitly take issue with this line on factory.go line 44: func (this *SinkFactory) Build(uri flags.Uri) (core.DataSink, error) {

In go, its best to return concrete types. Accept interfaces, but return concrete types. Because a interface was returned here, it became hard to return "nothing" in the case of an error. Because of that, the developer chose to return a nil. Because a nil was returned, the code blew up somewhere.

Looking at the codebase now, it looks like this case is handled. I am assuming this was fixed in master but is not in the image for Heapster 1.5.2...