GoogleCloudPlatform / metacontroller

Lightweight Kubernetes controllers as a service
https://metacontroller.app/
Apache License 2.0

[bug] Metacontroller crashes when one of the CompositeControllers misbehaves #191

Open grzesuav opened 4 years ago

grzesuav commented 4 years ago

I recently introduced a bug in one of the CompositeControllers handled by metacontroller. While processing a request from metacontroller, my controller threw an exception that was not caught. As a result, metacontroller started logging a lot of errors and became practically unusable (it could no longer serve any of the other, working controllers).

E1126 15:07:02.792048       1 runtime.go:66] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:72
/go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/usr/local/go/src/runtime/asm_amd64.s:573
/usr/local/go/src/runtime/panic.go:502
/usr/local/go/src/runtime/panic.go:63
/usr/local/go/src/runtime/signal_unix.go:388
/go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/apis/meta/v1/unstructured/unstructured.go:195
/go/src/metacontroller.app/controller/common/common.go:42
/go/src/metacontroller.app/controller/common/common.go:137
/go/src/metacontroller.app/controller/composite/controller.go:443
/go/src/metacontroller.app/controller/composite/controller.go:417
/go/src/metacontroller.app/controller/composite/controller.go:221
/go/src/metacontroller.app/controller/composite/controller.go:210
/go/src/metacontroller.app/controller/composite/controller.go:187
/go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133
/go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134
/go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
/go/src/metacontroller.app/controller/composite/controller.go:187
/usr/local/go/src/runtime/asm_amd64.s:2361
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
    panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x91cd5a]
goroutine 932 [running]:
metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
    /go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:58 +0x107
panic(0xfd53c0, 0x187bff0)
    /usr/local/go/src/runtime/panic.go:502 +0x229
metacontroller.app/vendor/k8s.io/apimachinery/pkg/apis/meta/v1/unstructured.(*Unstructured).GetAPIVersion(0x0, 0xc42430d7a0, 0x4122d8)
    /go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/apis/meta/v1/unstructured/unstructured.go:195 +0x3a
metacontroller.app/controller/common.ChildMap.Insert(0xc4208b4600, 0x1228820, 0xc42f966d80, 0x0)
    /go/src/metacontroller.app/controller/common/common.go:42 +0x32
metacontroller.app/controller/common.MakeChildMap(0x1228820, 0xc42f966d80, 0xc421cf62e0, 0x2, 0x4, 0x0)
    /go/src/metacontroller.app/controller/common/common.go:137 +0x5f
metacontroller.app/controller/composite.(*parentController).syncParentObject(0xc420477440, 0xc42f966d80, 0x0, 0xc42fbd39e0)
    /go/src/metacontroller.app/controller/composite/controller.go:443 +0x31a
metacontroller.app/controller/composite.(*parentController).sync(0xc420477440, 0xc42fbd39e0, 0x14, 0xf84f40, 0xc42d9f6410)
    /go/src/metacontroller.app/controller/composite/controller.go:417 +0x2a3
metacontroller.app/controller/composite.(*parentController).processNextWorkItem(0xc420477440, 0xc42446aa00)
    /go/src/metacontroller.app/controller/composite/controller.go:221 +0xec
metacontroller.app/controller/composite.(*parentController).worker(0xc420477440)
    /go/src/metacontroller.app/controller/composite/controller.go:210 +0x2b
metacontroller.app/controller/composite.(*parentController).(metacontroller.app/controller/composite.worker)-fm()
    /go/src/metacontroller.app/controller/composite/controller.go:187 +0x2a
metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc4204077b0)
    /go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x54
metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc42430dfb0, 0x3b9aca00, 0x0, 0x1, 0xc4204e6b40)
    /go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134 +0xbd
metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc4204077b0, 0x3b9aca00, 0xc4204e6b40)
    /go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
metacontroller.app/controller/composite.(*parentController).Start.func1.1(0xc4278a8310, 0xc420477440)
    /go/src/metacontroller.app/controller/composite/controller.go:187 +0x7d
created by metacontroller.app/controller/composite.(*parentController).Start.func1
    /go/src/metacontroller.app/controller/composite/controller.go:185 +0x60c

stuart-warren commented 4 years ago

What would the response have been? A 50x error with a text stack trace, or a JSON document?

grzesuav commented 4 years ago

Not sure, I wasn't able to observe it. In general, my Python server raises an exception while generating the response, so it should be easy to write a reproduction case. I will try to do this once I find some time.

AmitKumarDas commented 4 years ago

@grzesuav Can you also provide the metacontroller spec/YAML and, if possible, the webhook logic?

Kyrremann commented 4 years ago

I get the same error, but my app doesn't misbehave.

Error:

E0401 14:09:57.206784       1 runtime.go:66] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:72
/go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/usr/local/go/src/runtime/asm_amd64.s:573
/usr/local/go/src/runtime/panic.go:502
/usr/local/go/src/runtime/panic.go:63
/usr/local/go/src/runtime/signal_unix.go:388
/go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/apis/meta/v1/unstructured/unstructured.go:195
/go/src/metacontroller.app/controller/common/common.go:42
/go/src/metacontroller.app/controller/common/common.go:137
/go/src/metacontroller.app/controller/decorator/controller.go:470
/go/src/metacontroller.app/controller/decorator/controller.go:423
/go/src/metacontroller.app/controller/decorator/controller.go:243
/go/src/metacontroller.app/controller/decorator/controller.go:232
/go/src/metacontroller.app/controller/decorator/controller.go:207
/go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133
/go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134
/go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
/go/src/metacontroller.app/controller/decorator/controller.go:207
/usr/local/go/src/runtime/asm_amd64.s:2361
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
    panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x91cd5a]

goroutine 2307 [running]:
metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
    /go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:58 +0x107
panic(0xfd53c0, 0x187bff0)
    /usr/local/go/src/runtime/panic.go:502 +0x229
metacontroller.app/vendor/k8s.io/apimachinery/pkg/apis/meta/v1/unstructured.(*Unstructured).GetAPIVersion(0x0, 0xc42053f340, 0x4122d8)
    /go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/apis/meta/v1/unstructured/unstructured.go:195 +0x3a
metacontroller.app/controller/common.ChildMap.Insert(0xc420765f20, 0x1228820, 0xc4203f4150, 0x0)
    /go/src/metacontroller.app/controller/common/common.go:42 +0x32
metacontroller.app/controller/common.MakeChildMap(0x1228820, 0xc4203f4150, 0xc4201ab240, 0x1, 0x4, 0x0)
    /go/src/metacontroller.app/controller/common/common.go:137 +0x5f
metacontroller.app/controller/decorator.(*decoratorController).syncParentObject(0xc4208c2cb0, 0xc4203f4150, 0x0, 0xc42078600e)
    /go/src/metacontroller.app/controller/decorator/controller.go:470 +0x84b
metacontroller.app/controller/decorator.(*decoratorController).sync(0xc4208c2cb0, 0xc420786000, 0x21, 0xf84f40, 0xc4205a62b0)
    /go/src/metacontroller.app/controller/decorator/controller.go:423 +0x21e
metacontroller.app/controller/decorator.(*decoratorController).processNextWorkItem(0xc4208c2cb0, 0xc4200c2e00)
    /go/src/metacontroller.app/controller/decorator/controller.go:243 +0xec
metacontroller.app/controller/decorator.(*decoratorController).worker(0xc4208c2cb0)
    /go/src/metacontroller.app/controller/decorator/controller.go:232 +0x2b
metacontroller.app/controller/decorator.(*decoratorController).(metacontroller.app/controller/decorator.worker)-fm()
    /go/src/metacontroller.app/controller/decorator/controller.go:207 +0x2a
metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc4209247b0)
    /go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x54
metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc42053ffb0, 0x3b9aca00, 0x0, 0x1, 0xc4203ce3c0)
    /go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134 +0xbd
metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc4209247b0, 0x3b9aca00, 0xc4203ce3c0)
    /go/src/metacontroller.app/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
metacontroller.app/controller/decorator.(*decoratorController).Start.func1.1(0xc420a52210, 0xc4208c2cb0)
    /go/src/metacontroller.app/controller/decorator/controller.go:207 +0x7d
created by metacontroller.app/controller/decorator.(*decoratorController).Start.func1
    /go/src/metacontroller.app/controller/decorator/controller.go:205 +0x6ca

My YAML:

apiVersion: metacontroller.k8s.io/v1alpha1
kind: DecoratorController
metadata:
  name: gpr-syncer
spec:
  resources:
  - apiVersion: v1
    resource: namespaces
    annotationSelector:
      matchExpressions:
      - {key: gpr-secrets-synced, operator: DoesNotExist}
      - {key: owner, operator: Exists}
  hooks:
    sync:
      webhook:
        url: http://gpr-syncer.metacontroller/sync
        timeout: 10s
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpr-syncer
  namespace: metacontroller
  labels:
    app: gpr-syncer
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpr-syncer
  template:
    metadata:
      labels:
        app: gpr-syncer
    spec:
      containers:
      - name: gpr-syncer
        image: gpr-syncer-18
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: gpr-syncer
  namespace: metacontroller
spec:
  selector:
    app: gpr-syncer
  ports:
  - port: 80
    targetPort: 80

And my app:

import logging
import socket

from fastapi import FastAPI, Request

logger = logging.getLogger('gunicorn.error')
app = FastAPI()

def createSecret(namespace):
    secret = {}
    secret['apiVersion'] = 'v1'
    secret['kind'] = 'Secret'
    secret['metadata'] = {}
    secret['metadata']['annotations'] = {'gpr-synced': 'true'}
    secret['metadata']['name'] = 'gpr-credentials'
    secret['metadata']['namespace'] = namespace
    secret['data'] = {'.dockerconfigjson': 'token'}

@app.get("/")
async def index():
    return {"message": "Hello from {}".format(socket.gethostname())}

@app.post("/sync")
async def sync(request: Request):
    json = await request.json()
    namespace = json['object']['metadata']['name']
    if namespace == "kyrre-havik-eriksen":
      #return {'annotations': {'gpr-secret-synced': 'true'}, 'attachments': [createSecret(namespace)]}
      return {'attachments': [createSecret(namespace)]}
    else:
      return {}

Kyrremann commented 4 years ago

It turns out I had forgotten to return the new secret dict from the createSecret method. Returning it fixed the problem for me.
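
For readers following along, here is a minimal sketch of why the missing return matters. A Python function without a return statement yields None, which serializes to JSON null inside the attachments list. That is presumably what metacontroller decoded into the nil object seen in the trace above, where GetAPIVersion is called with 0x0 as its first argument. The function names below are hypothetical and the secret is trimmed down for illustration.

```python
import json


def create_secret_missing_return(namespace):
    # Mirrors the posted createSecret: builds the dict but never returns it,
    # so the caller receives None.
    secret = {'apiVersion': 'v1', 'kind': 'Secret'}
    secret['metadata'] = {'name': 'gpr-credentials', 'namespace': namespace}


def create_secret(namespace):
    # Same body, with the missing return added.
    secret = {'apiVersion': 'v1', 'kind': 'Secret'}
    secret['metadata'] = {'name': 'gpr-credentials', 'namespace': namespace}
    return secret


broken = json.dumps({'attachments': [create_secret_missing_return('demo')]})
fixed = json.dumps({'attachments': [create_secret('demo')]})

print(broken)  # {"attachments": [null]}
print(fixed)
```

The first line printed shows the null entry that the webhook was sending back; the second shows a well-formed attachment.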

grzesuav commented 4 years ago

Ah, I forgot to post my example; I will try to add it. Nevertheless, metacontroller should not panic.
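
Until the panic is handled on the metacontroller side, a webhook can guard against sending null entries itself. The sketch below is only a workaround under assumptions: validate_attachments is a hypothetical helper, not part of metacontroller or FastAPI, and it checks only the attachments key used in the DecoratorController example above.

```python
def validate_attachments(response):
    """Raise ValueError if any attachment is not a dict with apiVersion and kind."""
    for i, item in enumerate(response.get('attachments', [])):
        if not isinstance(item, dict):
            # Catches None produced by a helper that forgot to return its dict.
            raise ValueError(f'attachments[{i}] is {item!r}, expected an object')
        for field in ('apiVersion', 'kind'):
            if not item.get(field):
                raise ValueError(f'attachments[{i}] is missing {field}')
    return response
```

Wrapping the sync handler's return value, for example `return validate_attachments({'attachments': [createSecret(namespace)]})`, would make the webhook fail with an error on its own side instead of returning a null attachment.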