argoproj / argo-events

Event-driven Automation Framework for Kubernetes
https://argoproj.github.io/argo-events/
Apache License 2.0
2.37k stars 738 forks source link

null pointer dereference leads to seg violation in `controllers/sensor` package #254

Closed jpugliesi closed 5 years ago

jpugliesi commented 5 years ago

Describe the bug The controllers/sensor package's MarkNodePhase function does not validate that a returned node pointer variable is not null before accessing the pointer's fields

In the case outlined below in To Reproduce, this results in the following runtime error (segmentation violation):

2019-03-27T16:16:17Z | info  | msg: sensor state updated successfully |  phase:Active sensor-name:calendar-sensor
2019-03-27T16:16:17Z | info  | msg: successfully persisted sensor resource update and created K8s event |  sensor-name:calendar-sensor
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x48 pc=0xfa6231]

goroutine 8 [running]:
github.com/argoproj/argo-events/controllers/sensor.MarkNodePhase(0xc4200d6000, 0xc42057b960, 0x1e, 0x145bf2a, 0x7, 0x145d017, 0x8, 0x0, 0xc42032c3d0, 0xc420973ac8, ...)
        /Users/vpage/go/src/github.com/argoproj/argo-events/controllers/sensor/state.go:112 +0x71
github.com/argoproj/argo-events/sensors.(*sensorExecutionCtx).processTriggers(0xc42032c380)
        /Users/vpage/go/src/github.com/argoproj/argo-events/sensors/trigger.go:129 +0x77e
github.com/argoproj/argo-events/sensors.(*sensorExecutionCtx).processUpdateNotification(0xc42032c380, 0xc4202d2440)
        /Users/vpage/go/src/github.com/argoproj/argo-events/sensors/event-handler.go:111 +0x97d
github.com/argoproj/argo-events/sensors.(*sensorExecutionCtx).WatchEventsFromGateways.func1(0xc42032c380)
        /Users/vpage/go/src/github.com/argoproj/argo-events/sensors/event-handler.go:186 +0x5a
created by github.com/argoproj/argo-events/sensors.(*sensorExecutionCtx).WatchEventsFromGateways
        /Users/vpage/go/src/github.com/argoproj/argo-events/sensors/event-handler.go:184 +0x54

To Reproduce Steps to reproduce the behavior:

  1. Create calendar gateway and sensor according to the calendar documentation
  2. Modify the sensor spec to use image argoproj/sensor:v0.8.5-test, and a git trigger source referencing a private repository:
    apiVersion: argoproj.io/v1alpha1
    kind: Sensor
    metadata:
    name: calendar-sensor
    labels:
    sensors.argoproj.io/sensor-controller-instanceid: argo-events
    spec:
    template:
    spec:
      containers:
        - name: "sensor"
          image: "argoproj/sensor:v0.8.5-test"
          imagePullPolicy: Always
      serviceAccountName: argo-events-sa
    dependencies:
    - name: "calendar-gateway:interval"
    eventProtocol:
    type: "HTTP"
    http:
      port: "9300"
    triggers:
    - template:
        name: calendar-workflow-trigger
        group: argoproj.io
        version: v1alpha1
        kind: Workflow
        source:
          git:
            url: "https://github.com/example/private-repository.git"
            cloneDirectory: "/git/"
            namespace: argo-events
            creds:
              username:
                name: github-secret
                key: username
              password:
                name: github-secret
                key: password
            filePath: "example/workflow.yaml"

Expected behavior First off, I don't have enough context to understand why the sensor.Stats.Nodes map within GetNodeByName would not contain a nodeId.

Still, MarkNodePhase should check if the value returned by GetNodeByName is null before accessing values, and somehow gracefully handle the case where it is in fact null. I'm not sure what an appropriate solution would be, but perhaps the fix would have the following structure:

node := GetNodeByName(sensor, nodeName)

// Check that nod is not null:
if node == null {
        // gracefully handle case where node could not be found by name
}

if node.Phase != phase {
    log.Info().Str("type", string(node.Type)).Str("node-name", node.Name).Str("phase", string(node.Phase))
    node.Phase = phase
}

Additional context This error does not occur when running the same reproduction steps on v0.8.3. Regardless, it seems important to handle the case where a node can't be found by name (possibly by validating the node pointer is not null before accessing its fields.)

VaibhavPage commented 5 years ago

Thanks for the suggestion. I will add the nil check. Although as you pointed out, the sensor should have the correct node with a nodeId. I was testing v0.8.5-test with username and password and it works as expected.