upmc-enterprises / elasticsearch-operator

manages elasticsearch clusters

Resources should be created with a label that links them to the controller #207

Open hagmonk opened 6 years ago

hagmonk commented 6 years ago

I tried using the operator in my cluster, and was stumped by the following crash:

time="2018-05-23T02:07:40Z" level=info msg="Found 0 existing clusters "
time="2018-05-23T02:07:40Z" level=info msg="Watching for elasticsearch events..."
panic: runtime error: slice bounds out of range

goroutine 31 [running]:
github.com/upmc-enterprises/elasticsearch-operator/vendor/github.com/upmc-enterprises/elasticsearch-operator/pkg/processor.(*Processor).processPodEvent(0xc420284220, 0xc4208477b0, 0x0, 0x0)
        /Users/hagmonk/go/src/github.com/upmc-enterprises/elasticsearch-operator/vendor/github.com/upmc-enterprises/elasticsearch-operator/pkg/processor/processor.go:248 +0x268

It seems to me (a non-Go programmer) that the problem starts in k8sutil.go#L289. The operator follows updates for any pod that carries the label "role=data", which is a very broad selector.

In particular, another user of my cluster had deployed Elasticsearch manually, and some of those pods had the labels "component=elasticsearch,role=data", which ran afoul of processor.go#L248.
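For context, Go raises "slice bounds out of range" when a slice expression uses an index past the end of the string or slice. What processor.go#L248 actually slices is not shown in this thread, so the following is only a sketch under an assumption: that the handler derives a cluster name from the pod name and expects an operator-style name. The helper clusterNameFromPod and the "es-data-" prefix are hypothetical and exist only for illustration; they are not the operator's real code.

```go
package main

import (
	"fmt"
	"strings"
)

// clusterNameFromPod is a HYPOTHETICAL helper, not taken from the operator.
// It assumes operator-created pods are named "es-data-<cluster>-<suffix>"
// and reports ok=false instead of panicking when a name does not fit.
func clusterNameFromPod(podName string) (cluster string, ok bool) {
	const prefix = "es-data-" // assumed naming convention
	if !strings.HasPrefix(podName, prefix) {
		return "", false // foreign pod: skip it rather than slice blindly
	}
	rest := strings.TrimPrefix(podName, prefix)
	idx := strings.LastIndex(rest, "-")
	if idx <= 0 {
		return "", false // no "<cluster>-<suffix>" part to split
	}
	return rest[:idx], true
}

func main() {
	for _, name := range []string{"es-data-logs-abc12", "elastic-0", "es-data-x"} {
		cluster, ok := clusterNameFromPod(name)
		fmt.Printf("%q -> cluster=%q ok=%v\n", name, cluster, ok)
	}
}
```

Checking the shape of the name before slicing turns an unexpected pod into a skipped event instead of a crash of the whole operator.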

The prometheus operator seems to use a dedicated set of labels that ties the resources it creates to a particular instance of the operator:

kubectl get pods prometheus-k8s-0 --show-labels
NAME               READY     STATUS    RESTARTS   AGE       LABELS
prometheus-k8s-0   2/2       Running   1          19d       app=prometheus,controller-revision-hash=prometheus-k8s-7bdb64d8b9,prometheus=k8s,statefulset.kubernetes.io/pod-name=prometheus-k8s-0

Should the elasticsearch-operator adopt the same pattern, so that the controller only attempts to manage resources whose provenance is the operator itself?
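A minimal sketch of what such a pattern could look like, assuming the operator stamps every pod it creates with ownership labels and watches only pods that carry all of them. The label key "elasticsearch-cluster" and the value "elasticsearch-operator" are made up for illustration; "app.kubernetes.io/managed-by" is one of the label keys recommended by Kubernetes, but nothing in this thread says the operator uses it. The matching below is done on plain maps so the example runs without a cluster or client library.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// ownedLabels returns the labels a pod would get at creation time.
// The keys and values other than role=data are ASSUMPTIONS for this sketch.
func ownedLabels(cluster string) map[string]string {
	return map[string]string{
		"app.kubernetes.io/managed-by": "elasticsearch-operator", // assumed value
		"elasticsearch-cluster":        cluster,                  // hypothetical key
		"role":                         "data",
	}
}

// selectorFor renders labels as an equality-based selector string
// ("k1=v1,k2=v2") with keys sorted so the output is deterministic.
func selectorFor(labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+"="+labels[k])
	}
	return strings.Join(parts, ",")
}

// matches reports whether a pod carries every label in the selector.
func matches(selector, podLabels map[string]string) bool {
	for k, want := range selector {
		if got, ok := podLabels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	// Labels of the manually deployed pod described in this issue.
	foreign := map[string]string{"component": "elasticsearch", "role": "data"}
	broad := map[string]string{"role": "data"}
	narrow := ownedLabels("logs")

	fmt.Println("watch selector:", selectorFor(narrow))
	fmt.Println("broad selector matches foreign pod: ", matches(broad, foreign))
	fmt.Println("narrow selector matches foreign pod:", matches(narrow, foreign))
	fmt.Println("narrow selector matches own pod:    ", matches(narrow, ownedLabels("logs")))
}
```

With the narrower selector, the manually deployed pod no longer reaches the event handler at all, while pods the operator created still match.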

stevesloka commented 6 years ago

Yes, excellent suggestion, @hagmonk.

gianrubio commented 6 years ago

This could be covered in v2 spec #184

FaKod commented 6 years ago

Same issue here. We run several pods that have the label role=data. Does that make the operator unusable for us, or is there a workaround?