platform9 / luigi

The plumber you'll hire to install all your Kubernetes network plumbing
Apache License 2.0

Added ability to customize hostplumber metrics port #211

Closed manasabsv26 closed 1 month ago

manasabsv26 commented 2 months ago

ISSUE(S):

PMK-6484

SUMMARY:

Added the ability to customize hostplumber metrics port by changing the value of METRICS_BIND_ADDRESS in hostplumber-manager-config configmap

TESTING:

With these changes, built new luigi and hostplumber images and tested them. Pods came up fine:

luigi-system           cert-manager-9dfd8cdc6-fkwnw                1/1     Running   0          5h7m
luigi-system           cert-manager-cainjector-cc668794-kkqtd      1/1     Running   0          5h7m
luigi-system           cert-manager-webhook-68447b9c99-47t7g       1/1     Running   0          5h7m
luigi-system           hostplumber-controller-manager-7ffjm        2/2     Running   0          69s
luigi-system           hostplumber-controller-manager-zxdrz        2/2     Running   0          68s
luigi-system           luigi-controller-manager-794f54794c-w4qbg   2/2     Running   0          2m16s

Events of luigi-controller-manager-794f54794c-w4qbg pod:

Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  4m26s  default-scheduler  Successfully assigned luigi-system/luigi-controller-manager-794f54794c-w4qbg to 10.149.106.212
  Normal  Pulled     4m26s  kubelet            Container image "gcr.io/kubebuilder/kube-rbac-proxy:v0.11.0" already present on machine
  Normal  Created    4m26s  kubelet            Created container kube-rbac-proxy
  Normal  Started    4m26s  kubelet            Started container kube-rbac-proxy
  Normal  Pulled     4m26s  kubelet            Container image "docker.io/manasab26/luigi-plugins:private-master-manasa-PMK-6484-pmk-3299898" already present on machine
  Normal  Created    4m25s  kubelet            Created container manager
  Normal  Started    4m25s  kubelet            Started container manager

Events of hostplumber-controller-manager-zxdrz pod:

Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  2m28s  default-scheduler  Successfully assigned luigi-system/hostplumber-controller-manager-7ffjm to 10.149.106.63
  Normal  Pulled     2m28s  kubelet            Container image "gcr.io/kubebuilder/kube-rbac-proxy:v0.8.0" already present on machine
  Normal  Created    2m28s  kubelet            Created container kube-rbac-proxy
  Normal  Started    2m28s  kubelet            Started container kube-rbac-proxy
  Normal  Pulled     2m28s  kubelet            Container image "docker.io/manasab26/hostplumber:private-master-manasa-PMK-6484" already present on machine
  Normal  Created    2m28s  kubelet            Created container manager
  Normal  Started    2m28s  kubelet            Started container manager

Environment details from the describe output of one of the hostplumber-controller-manager pods:

Environment:
      K8S_NODE_NAME:          (v1:spec.nodeName)
      K8S_NAMESPACE:         luigi-system (v1:metadata.namespace)
      METRICS_BIND_ADDRESS:  <set to the key 'METRICS_BIND_ADDRESS' of config map 'hostplumber-manager-config'>  Optional: true

Daemonset:

kc get ds hostplumber-controller-manager -n luigi-system -o yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    deprecated.daemonset.template.generation: "1"
  creationTimestamp: "2024-07-01T13:33:35Z"
  generation: 1
  labels:
    control-plane: controller-manager
  name: hostplumber-controller-manager
...
        - name: K8S_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: METRICS_BIND_ADDRESS
          valueFrom:
            configMapKeyRef:
              key: METRICS_BIND_ADDRESS
              name: hostplumber-manager-config
              optional: true

hostplumber-manager-config cm:

kc get cm hostplumber-manager-config  -o yaml -n luigi-system
apiVersion: v1
data:
  METRICS_BIND_ADDRESS: 127.0.0.1:8080
  controller_manager_config.yaml: |
    apiVersion: controller-runtime.sigs.k8s.io/v1alpha1
    kind: ControllerManagerConfig
    health:
      healthProbeBindAddress: :8081
    webhook:
      port: 9443
    leaderElection:
      leaderElect: true
      resourceName: 52f205ce.k8s.pf9.io
kind: ConfigMap
metadata:
  creationTimestamp: "2024-07-01T13:00:40Z"
  name: hostplumber-manager-config
  namespace: luigi-system
  resourceVersion: "48979"
  uid: 71cd7bcd-6065-4241-8d5c-ceef8488c3db

Changed METRICS_BIND_ADDRESS to 127.0.0.1:8085

apiVersion: v1
data:
  METRICS_BIND_ADDRESS: 127.0.0.1:8085
  controller_manager_config.yaml: |
    apiVersion: controller-runtime.sigs.k8s.io/v1alpha1
    kind: ControllerManagerConfig

Deleted the hostplumber-controller-manager pods after making this configMap change. Once the new pods came up, checked their logs.

Logs of hostplumber-controller-manager-zxdrz pod:

2024-07-01T13:34:16Z    INFO    controller-runtime.metrics  Metrics server is starting to listen    {"addr": "127.0.0.1:8085"}
2024-07-01T13:34:16Z    INFO    setup   starting manager
2024-07-01T13:34:16Z    INFO    Starting server {"path": "/metrics", "kind": "metrics", "addr": "127.0.0.1:8085"}
2024-07-01T13:34:16Z    INFO    Starting server {"kind": "health probe", "addr": "[::]:8081"}
2024-07-01T13:34:16Z    INFO    Starting EventSource    {"controller": "hostnetworktemplate", "controllerGroup": "plumber.k8s.pf9.io", "controllerKind": "HostNetworkTemplate", "source": "kind source: *v1.HostNetworkTemplate"}
2024-07-01T13:34:16Z    INFO    Starting Controller {"controller": "hostnetworktemplate", "controllerGroup": "plumber.k8s.pf9.io", "controllerKind": "HostNetworkTemplate"}
2024-07-01T13:34:16Z    INFO    Starting workers    {"controller": "hostnetworktemplate", "controllerGroup": "plumber.k8s.pf9.io", "controllerKind": "HostNetworkTemplate", "worker count": 1}

The logs show that the metrics server now listens on the new METRICS_BIND_ADDRESS provided in the configmap.

manasabsv26 commented 2 months ago

Testing details with new code changes to add validation:

  1. Edited the METRICS_BIND_ADDRESS value to "random" in configMap

    apiVersion: v1
    data:
      METRICS_BIND_ADDRESS: random
      controller_manager_config.yaml: |
        apiVersion: controller-runtime.sigs.k8s.io/v1alpha1

    hostplumber-controller-manager pods went into an error state with these logs:

    kc logs -n luigi-system           hostplumber-controller-manager-vsn9m 
    2024-07-08T08:00:15Z    ERROR   setup   METRICS_BIND_ADDRESS is invalid {"error": "address must be in the format IP:port"}
    main.main
    /workspace/main.go:104
    runtime.main
    /usr/local/go/src/runtime/proc.go:271
  2. Edited the port value to 77777 in configMap:

    apiVersion: v1
    data:
      METRICS_BIND_ADDRESS: 127.0.0.1:77777
      controller_manager_config.yaml: |
        apiVersion: controller-runtime.sigs.k8s.io/v1alpha1
        kind: ControllerManagerConfig

luigi-system           cert-manager-webhook-68447b9c99-v9nn6       1/1     Running     0            143m
luigi-system           hostplumber-controller-manager-fmv5f        1/2     Error       1 (2s ago)   3s
luigi-system           hostplumber-controller-manager-nrh67        1/2     Error       1 (2s ago)   3s

hostplumber-controller-manager pod logs:

kc logs -n luigi-system           hostplumber-controller-manager-fmv5f
2024-07-08T07:50:01Z    ERROR   setup   METRICS_BIND_ADDRESS is invalid {"error": "port must be between 1 and 65535"}
main.main
    /workspace/main.go:104
runtime.main
    /usr/local/go/src/runtime/proc.go:271

manasabsv26 commented 2 months ago

After adding IPv6 support to the validation (format: [IPv6 addr]:port), tested the following cases:

  1. Valid IPv6 address METRICS_BIND_ADDRESS: '[::1]:9000'

    apiVersion: v1
    data:
      METRICS_BIND_ADDRESS: '[::1]:9000'
      controller_manager_config.yaml: |
        apiVersion: controller-runtime.sigs.k8s.io/v1alpha1

    logs:

    kc logs -n luigi-system           hostplumber-controller-manager-jffkx  
    2024-07-08T12:02:47Z    INFO    controller-runtime.metrics  Metrics server is starting to listen    {"addr": "[::1]:9000"}
    2024-07-08T12:02:47Z    INFO    setup   starting manager
    2024-07-08T12:02:47Z    INFO    Starting server {"path": "/metrics", "kind": "metrics", "addr": "[::1]:9000"}
    2024-07-08T12:02:47Z    INFO    Starting server {"kind": "health probe", "addr": "[::]:8081"}
    2024-07-08T12:02:47Z    INFO    Starting EventSource    {"controller": "hostnetworktemplate", "controllerGroup": "plumber.k8s.pf9.io", "controllerKind": "HostNetworkTemplate", "source": "kind source: *v1.HostNetworkTemplate"}
    2024-07-08T12:02:47Z    INFO    Starting Controller {"controller": "hostnetworktemplate", "controllerGroup": "plumber.k8s.pf9.io", "controllerKind": "HostNetworkTemplate"}
    2024-07-08T12:02:47Z    INFO    Starting workers    {"controller": "hostnetworktemplate", "controllerGroup": "plumber.k8s.pf9.io", "controllerKind": "HostNetworkTemplate", "worker count": 1}
  2. Invalid address (IPv6 without brackets) METRICS_BIND_ADDRESS: fe80::1:9000

    apiVersion: v1
    data:
      METRICS_BIND_ADDRESS: fe80::1:9000
      controller_manager_config.yaml: |
        apiVersion: controller-runtime.sigs.k8s.io/v1alpha1

logs:

kc logs -n luigi-system           hostplumber-controller-manager-pr2xl
2024-07-08T11:41:32Z    ERROR   setup   METRICS_BIND_ADDRESS is invalid {"error": "address must be in the format IP:port"}
main.main
    /workspace/main.go:122
runtime.main
    /usr/local/go/src/runtime/proc.go:271
  3. Valid IPv6 address, but invalid port METRICS_BIND_ADDRESS: '[::1]:77777'

    apiVersion: v1
    data:
      METRICS_BIND_ADDRESS: '[::1]:77777'
      controller_manager_config.yaml: |
        apiVersion: controller-runtime.sigs.k8s.io/v1alpha1

    logs:

    kc logs -n luigi-system           hostplumber-controller-manager-cx2b9
    2024-07-08T12:09:28Z    ERROR   setup   METRICS_BIND_ADDRESS is invalid {"error": "port must be between 1 and 65535"}
    main.main
    /workspace/main.go:122
    runtime.main
    /usr/local/go/src/runtime/proc.go:271
  4. Invalid IPv4 address METRICS_BIND_ADDRESS: random:9000

    logs:

    kc logs -n luigi-system           hostplumber-controller-manager-n5nh5
    2024-07-08T12:11:34Z    ERROR   setup   METRICS_BIND_ADDRESS is invalid {"error": "invalid IP address"}
    main.main
    /workspace/main.go:122
    runtime.main
    /usr/local/go/src/runtime/proc.go:271
  5. Valid IPv4 address METRICS_BIND_ADDRESS: 127.0.0.1:9000

    logs:

    kc logs -n luigi-system           hostplumber-controller-manager-c48z9 
    2024-07-08T12:13:09Z    INFO    controller-runtime.metrics  Metrics server is starting to listen    {"addr": "127.0.0.1:9000"}
    2024-07-08T12:13:09Z    INFO    setup   starting manager
    2024-07-08T12:13:09Z    INFO    Starting server {"path": "/metrics", "kind": "metrics", "addr": "127.0.0.1:9000"}
    2024-07-08T12:13:09Z    INFO    Starting server {"kind": "health probe", "addr": "[::]:8081"}
    2024-07-08T12:13:09Z    INFO    Starting EventSource    {"controller": "hostnetworktemplate", "controllerGroup": "plumber.k8s.pf9.io", "controllerKind": "HostNetworkTemplate", "source": "kind source: *v1.HostNetworkTemplate"}

manasabsv26 commented 1 month ago

Removed validation function as discussed with @mithilarun. ctrl.NewManager should take care of validating the metrics address.