bloomberg / goldpinger

Debugging tool for Kubernetes which tests and displays connectivity between nodes in the cluster.
Apache License 2.0

Defining multiple http_targets or tcp_targets crashes the UI #144

Open cdpiazza opened 6 months ago

cdpiazza commented 6 months ago

Describe the bug When multiple URLs are set in the http_targets env var, or likewise in tcp_targets, the UI shows all nodes as unhealthy and /check displays "context-deadline-exceeded". Nevertheless, the metrics for the URLs being checked appear correct. Pods and containers start correctly and no errors are logged, but the UI does not work.
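One way to confirm the /check symptom independently of the UI is to time a request against the endpoint from outside the pod. The following is a minimal sketch, not part of goldpinger: `time_check` is a hypothetical helper, the 5-second client timeout is arbitrary, and the `fetch` parameter exists only so the helper can be exercised without a cluster.

```python
import socket
import time
import urllib.error
import urllib.request


def time_check(url, timeout_s=5.0, fetch=urllib.request.urlopen):
    """Issue one GET against url and return (outcome, elapsed_seconds).

    outcome is "ok", "timeout", or "error". The timeout here is a
    client-side limit chosen for illustration; it is not a goldpinger setting.
    """
    start = time.monotonic()
    try:
        with fetch(url, timeout=timeout_s) as resp:
            resp.read()
        outcome = "ok"
    except (TimeoutError, socket.timeout):
        # Raised directly when the read stalls past timeout_s.
        outcome = "timeout"
    except urllib.error.URLError as exc:
        # Connect-phase timeouts arrive wrapped in URLError.reason.
        timed_out = isinstance(exc.reason, (TimeoutError, socket.timeout))
        outcome = "timeout" if timed_out else "error"
    return outcome, time.monotonic() - start
```

Assuming the pod's port 8080 has been forwarded to the local machine, `time_check("http://localhost:8080/check")` would show whether the endpoint answers within the client timeout. Comparing the elapsed time with one target configured against three targets may help narrow down where the deadline is being exceeded.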

To Reproduce Steps to reproduce the behavior:

  1. Set the env var HTTP_TARGETS with more than 2 values.
  2. Open the Web UI
  3. See error
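Before deploying, the value intended for HTTP_TARGETS in step 1 can be sanity-checked to rule out a malformed entry. This sketch assumes the variable is a whitespace-separated list of http(s) URLs, as in the manifest in this report; `split_http_targets` is a hypothetical helper and does not reproduce goldpinger's own parsing.

```python
from urllib.parse import urlsplit


def split_http_targets(value):
    """Split a whitespace-separated target string and validate each entry.

    Assumes each target must be an absolute http:// or https:// URL.
    Raises ValueError on the first entry that is not.
    """
    targets = value.split()
    for target in targets:
        parts = urlsplit(target)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not an http(s) URL: {target!r}")
    return targets
```

Passing the three-URL value from the manifest in this report returns a list of three entries, which suggests the configured value itself is well-formed and the failure lies elsewhere.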

Expected behavior The UI shows the HTTP probes.

Screenshots ui-red, check-exceeded, metrics

Environment (please complete the following information):

Additional context DaemonSet definition:

apiVersion: v1
items:
- apiVersion: apps/v1
  kind: DaemonSet
  metadata:
    annotations:
    creationTimestamp: "2024-05-03T20:51:00Z"
    generation: 7
    labels:
      app: goldpinger
    name: goldpinger
    namespace: default
    resourceVersion: "3620"
    uid: c52262b0-d182-43c1-a35f-f83aa18510e7
  spec:
    revisionHistoryLimit: 10
    selector:
      matchLabels:
        app: goldpinger
    template:
      metadata:
        annotations:
          prometheus.io/port: "8080"
          prometheus.io/scrape: "true"
        creationTimestamp: null
        labels:
          app: goldpinger
      spec:
        containers:
        - env:
          - name: HOST
            value: 0.0.0.0
          - name: PORT
            value: "8080"
          - name: HOSTNAME
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: spec.nodeName
          - name: POD_IP
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: status.podIP
          - name: HTTP_TARGETS
            value: http://www.google.com http://www.bloomberg.com http://www.cloudflare.com
          image: bloomberg/goldpinger:latest
          imagePullPolicy: Always
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 20
            periodSeconds: 5
            successThreshold: 1
            timeoutSeconds: 1
          name: goldpinger
          ports:
          - containerPort: 8080
            name: http
            protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 20
            periodSeconds: 5
            successThreshold: 1
            timeoutSeconds: 1
          resources:
            limits:
              memory: 80Mi
            requests:
              cpu: 1m
              memory: 40Mi
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
        dnsPolicy: ClusterFirst
        restartPolicy: Always
        schedulerName: default-scheduler
        securityContext:
          fsGroup: 2000
          runAsNonRoot: true
          runAsUser: 1000
        serviceAccount: goldpinger-serviceaccount
        serviceAccountName: goldpinger-serviceaccount
        terminationGracePeriodSeconds: 30
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
    updateStrategy:
      rollingUpdate:
        maxSurge: 0
        maxUnavailable: 100%
      type: RollingUpdate