GoogleCloudPlatform / gcs-fuse-csi-driver

The Google Cloud Storage FUSE Container Storage Interface (CSI) Plugin.
Apache License 2.0

sidecar container prevents istio-validation init container from starting in GKE Autopilot cluster #322

Open zaphod72 opened 1 month ago

zaphod72 commented 1 month ago

GKE Autopilot Cluster. Rapid release channel - Cluster and Node at: 1.30.2-gke.1587003

The Deployment is a Knative service. The Pod never starts; the reported reason is "Container istio-proxy is waiting".

Similar issues: https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/20 https://github.com/GoogleCloudPlatform/gcs-fuse-csi-driver/issues/53

As per https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/cloud-storage-fuse-csi-driver#pod-annotations the Pod annotations include:

        gke-gcsfuse/volumes: "true"
        proxy.istio.io/config: '{ "holdApplicationUntilProxyStarts": true }'
        traffic.sidecar.istio.io/excludeOutboundIPRanges: 169.254.169.254/32

Full Pod spec:

apiVersion: v1
items:
- apiVersion: v1
  kind: Pod
  metadata:
    annotations:
      autoscaling.knative.dev/maxScale: "5"
      autoscaling.knative.dev/metric: concurrency
      autoscaling.knative.dev/minScale: "1"
      autoscaling.knative.dev/scale-to-zero-grace-period: 5m
      autoscaling.knative.dev/scale-to-zero-pod-retention-period: 10m
      autoscaling.knative.dev/target: "10"
      gke-gcsfuse/volumes: "true"
      istio.io/rev: asm-managed-rapid
      k8s.v1.cni.cncf.io/networks: default/istio-cni
      kubectl.kubernetes.io/default-container: vllm-server
      kubectl.kubernetes.io/default-logs-container: vllm-server
      prometheus.io/path: /stats/prometheus
      prometheus.io/port: "15020"
      prometheus.io/scrape: "true"
      proxy.istio.io/config: '{ "holdApplicationUntilProxyStarts": true }'
      serving.knative.dev/creator: <...>
      sidecar.istio.io/interceptionMode: REDIRECT
      sidecar.istio.io/status: '{"initContainers":["istio-validation"],"containers":["istio-proxy"],"volumes":["workload-socket","credential-socket","workload-certs","istio-envoy","istio-data","istio-podinfo","istio-token"],"imagePullSecrets":null,"revision":"asm-managed-rapid"}'
      traffic.sidecar.istio.io/excludeInboundPorts: "15020"
      traffic.sidecar.istio.io/excludeOutboundIPRanges: 169.254.169.254/32
      traffic.sidecar.istio.io/includeInboundPorts: '*'
      traffic.sidecar.istio.io/includeOutboundIPRanges: '*'
    creationTimestamp: "2024-08-06T15:31:27Z"
    generateName: vllm-service-fea6900001db78d227f60296bb6cc1ab7e1110e-deployment-5876d5b87d-
    labels:
      ai.gke.io/inference-server: bookendinference
      ai.gke.io/model: Llama-2-7b-chat-hf-fea69000
      app: vllm-server
      app.kubernetes.io/managed-by: crfa-service
      examples.ai.gke.io/source: user-guide
      pod-template-hash: 5876d5b87d
      security.istio.io/tlsMode: istio
      service.istio.io/canonical-name: vllm-service-fea69000-cb1e-4c7e-8fcb-0d8ec8ba2a98
      service.istio.io/canonical-revision: vllm-service-fea69000-cb1e-4c7e-8fcb-0d8ec8ba2a98-00002
      serving.knative.dev/configuration: vllm-service-fea69000-cb1e-4c7e-8fcb-0d8ec8ba2a98
      serving.knative.dev/configurationGeneration: "2"
      serving.knative.dev/configurationUID: 818c4650-c5af-411d-9f5d-9e306678ac17
      serving.knative.dev/revision: vllm-service-fea69000-cb1e-4c7e-8fcb-0d8ec8ba2a98-00002
      serving.knative.dev/revisionUID: 021412de-565b-4ebb-8a12-edfff98974f2
      serving.knative.dev/service: vllm-service-fea69000-cb1e-4c7e-8fcb-0d8ec8ba2a98
      serving.knative.dev/serviceUID: 13f46e5f-16c6-4e70-9253-e526d72c5675
    name: vllm-service-fea6900001db78d227f60296bb6cc1ab7e1110e-deplo9fd8x
    namespace: bookendinference
    ownerReferences:
    - apiVersion: apps/v1
      blockOwnerDeletion: true
      controller: true
      kind: ReplicaSet
      name: vllm-service-fea6900001db78d227f60296bb6cc1ab7e1110e-deployment-5876d5b87d
      uid: 62268b11-c115-4959-a7a3-3c6788dda46d
    resourceVersion: "4630766"
    uid: bdeb75b1-727b-4abc-8f33-340f4a168170
  spec:
    containers:
    - args:
      - proxy
      - sidecar
      - --domain
      - $(POD_NAMESPACE).svc.cluster.local
      - --proxyLogLevel=warning
      - --proxyComponentLogLevel=misc:error
      - --log_output_level=default:info
      - --stsPort=15463
      env:
      - name: JWT_POLICY
        value: third-party-jwt
      - name: PILOT_CERT_PROVIDER
        value: system
      - name: CA_ADDR
        value: meshca.googleapis.com:443
      - name: POD_NAME
        valueFrom:
          fieldRef:
            apiVersion: v1
            fieldPath: metadata.name
      - name: POD_NAMESPACE
        valueFrom:
          fieldRef:
            apiVersion: v1
            fieldPath: metadata.namespace
      - name: INSTANCE_IP
        valueFrom:
          fieldRef:
            apiVersion: v1
            fieldPath: status.podIP
      - name: SERVICE_ACCOUNT
        valueFrom:
          fieldRef:
            apiVersion: v1
            fieldPath: spec.serviceAccountName
      - name: HOST_IP
        valueFrom:
          fieldRef:
            apiVersion: v1
            fieldPath: status.hostIP
      - name: ISTIO_CPU_LIMIT
        valueFrom:
          resourceFieldRef:
            divisor: "0"
            resource: limits.cpu
      - name: PROXY_CONFIG
        value: |
          {"discoveryAddress":"meshconfig.googleapis.com:443","proxyMetadata":{"CA_PROVIDER":"GoogleCA","CA_ROOT_CA":"/etc/ssl/certs/ca-certificates.crt","CA_TRUSTANCHOR":"","FLEET_PROJECT_NUMBER":"380486526345","GCP_METADATA":"darren2-dev-d0d0|380486526345|asm-cluster-v129-6|us-central1","OUTPUT_CERTS":"/etc/istio/proxy","PROXY_CONFIG_XDS_AGENT":"true","XDS_AUTH_PROVIDER":"gcp","XDS_ROOT_CA":"/etc/ssl/certs/ca-certificates.crt"},"meshId":"proj-380486526345","holdApplicationUntilProxyStarts":true}
      - name: ISTIO_META_POD_PORTS
        value: |-
          [
              {"name":"user-port","containerPort":8080,"protocol":"TCP"}
              ,{"name":"http-queueadm","containerPort":8022,"protocol":"TCP"}
              ,{"name":"http-autometric","containerPort":9090,"protocol":"TCP"}
              ,{"name":"http-usermetric","containerPort":9091,"protocol":"TCP"}
              ,{"name":"queue-port","containerPort":8012,"protocol":"TCP"}
              ,{"name":"https-port","containerPort":8112,"protocol":"TCP"}
          ]
      - name: ISTIO_META_APP_CONTAINERS
        value: vllm-server,queue-proxy
      - name: GOMEMLIMIT
        valueFrom:
          resourceFieldRef:
            divisor: "0"
            resource: limits.memory
      - name: GOMAXPROCS
        valueFrom:
          resourceFieldRef:
            divisor: "0"
            resource: limits.cpu
      - name: ISTIO_META_NODE_NAME
        valueFrom:
          fieldRef:
            apiVersion: v1
            fieldPath: spec.nodeName
      - name: ISTIO_META_INTERCEPTION_MODE
        value: REDIRECT
      - name: ISTIO_META_WORKLOAD_NAME
        value: vllm-service-fea6900001db78d227f60296bb6cc1ab7e1110e-deployment
      - name: ISTIO_META_OWNER
        value: kubernetes://apis/apps/v1/namespaces/bookendinference/deployments/vllm-service-fea6900001db78d227f60296bb6cc1ab7e1110e-deployment
      - name: ISTIO_META_MESH_ID
        value: proj-380486526345
      - name: TRUST_DOMAIN
        value: darren2-dev-d0d0.svc.id.goog
      - name: CA_PROVIDER
        value: GoogleCA
      - name: CA_ROOT_CA
        value: /etc/ssl/certs/ca-certificates.crt
      - name: CA_TRUSTANCHOR
      - name: FLEET_PROJECT_NUMBER
        value: "380486526345"
      - name: GCP_METADATA
        value: darren2-dev-d0d0|380486526345|asm-cluster-v129-6|us-central1
      - name: OUTPUT_CERTS
        value: /etc/istio/proxy
      - name: PROXY_CONFIG_XDS_AGENT
        value: "true"
      - name: XDS_AUTH_PROVIDER
        value: gcp
      - name: XDS_ROOT_CA
        value: /etc/ssl/certs/ca-certificates.crt
      - name: ISTIO_META_CLOUDRUN_ADDR
        value: asm-asm-cluster-v129-6-asm-managed-rf6fpcxrqg7-uxfkfeo4ja-uc.a.run.app:443
      - name: ISTIO_META_CLUSTER_ID
        value: cn-darren2-dev-d0d0-us-central1-asm-cluster-v129-6
      - name: ISTIO_META_ENABLE_MCP_LRS
        value: "true"
      - name: ISTIO_KUBE_APP_PROBERS
        value: '{"/app-health/queue-proxy/readyz":{"httpGet":{"path":"/","port":8012,"scheme":"HTTP","httpHeaders":[{"name":"K-Network-Probe","value":"queue"}]},"timeoutSeconds":1}}'
      image: gcr.io/gke-release/asm/proxyv2:1.19.10-asm.6
      imagePullPolicy: IfNotPresent
      lifecycle:
        postStart:
          exec:
            command:
            - pilot-agent
            - wait
      name: istio-proxy
      ports:
      - containerPort: 15090
        name: http-envoy-prom
        protocol: TCP
      readinessProbe:
        failureThreshold: 30
        httpGet:
          path: /healthz/ready
          port: 15021
          scheme: HTTP
        initialDelaySeconds: 1
        periodSeconds: 2
        successThreshold: 1
        timeoutSeconds: 3
      resources:
        limits:
          cpu: 500m
          memory: 512Mi
        requests:
          cpu: 500m
          memory: 512Mi
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
          - ALL
        privileged: false
        readOnlyRootFilesystem: true
        runAsGroup: 1337
        runAsNonRoot: true
        runAsUser: 1337
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      volumeMounts:
      - mountPath: /var/run/secrets/workload-spiffe-uds
        name: workload-socket
      - mountPath: /var/run/secrets/credential-uds
        name: credential-socket
      - mountPath: /var/run/secrets/workload-spiffe-credentials
        name: workload-certs
      - mountPath: /var/lib/istio/data
        name: istio-data
      - mountPath: /etc/istio/proxy
        name: istio-envoy
      - mountPath: /var/run/secrets/tokens
        name: istio-token
      - mountPath: /etc/istio/pod
        name: istio-podinfo
      - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        name: kube-api-access-5zsrg
        readOnly: true
    - args:
      - --model=/model-weights/models/fea69000-cb1e-4c7e-8fcb-0d8ec8ba2a98/model
      - --gpu-memory-utilization=0.95
      - --swap-space=0
      - --dtype=half
      - --tensor-parallel-size=1
      - --port=8080
      command:
      - python3
      - -m
      - vllm.entrypoints.openai.api_server
      env:
      - name: PROJECT_ID
        value: <...>
      - name: LLM_MODEL_ID
        value: fea69000-cb1e-4c7e-8fcb-0d8ec8ba2a98
      - name: PORT
        value: "8080"
      - name: K_REVISION
        value: vllm-service-fea69000-cb1e-4c7e-8fcb-0d8ec8ba2a98-00002
      - name: K_CONFIGURATION
        value: vllm-service-fea69000-cb1e-4c7e-8fcb-0d8ec8ba2a98
      - name: K_SERVICE
        value: vllm-service-fea69000-cb1e-4c7e-8fcb-0d8ec8ba2a98
      image: us-docker.pkg.dev/command-center-alpha/deployment/vllm@sha256:186b710188aa218fbbd6972fddca21d984d224a3289e5106fd960dde142babce
      imagePullPolicy: IfNotPresent
      lifecycle:
        preStop:
          httpGet:
            path: /wait-for-drain
            port: 8022
            scheme: HTTP
      name: vllm-server
      ports:
      - containerPort: 8080
        name: user-port
        protocol: TCP
      resources:
        limits:
          cpu: "4"
          ephemeral-storage: 100Gi
          memory: 100Gi
          nvidia.com/gpu: "1"
        requests:
          cpu: "4"
          ephemeral-storage: 100Gi
          memory: 100Gi
          nvidia.com/gpu: "1"
      securityContext:
        capabilities:
          drop:
          - NET_RAW
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: FallbackToLogsOnError
      volumeMounts:
      - mountPath: /dev/shm
        name: dshm
      - mountPath: /model-weights
        name: model-weights
      - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        name: kube-api-access-5zsrg
        readOnly: true
    - env:
      - name: SERVING_NAMESPACE
        value: bookendinference
      - name: SERVING_SERVICE
        value: vllm-service-fea69000-cb1e-4c7e-8fcb-0d8ec8ba2a98
      - name: SERVING_CONFIGURATION
        value: vllm-service-fea69000-cb1e-4c7e-8fcb-0d8ec8ba2a98
      - name: SERVING_REVISION
        value: vllm-service-fea69000-cb1e-4c7e-8fcb-0d8ec8ba2a98-00002
      - name: QUEUE_SERVING_PORT
        value: "8012"
      - name: QUEUE_SERVING_TLS_PORT
        value: "8112"
      - name: CONTAINER_CONCURRENCY
        value: "10"
      - name: REVISION_TIMEOUT_SECONDS
        value: "120"
      - name: REVISION_RESPONSE_START_TIMEOUT_SECONDS
        value: "0"
      - name: REVISION_IDLE_TIMEOUT_SECONDS
        value: "0"
      - name: SERVING_POD
        valueFrom:
          fieldRef:
            apiVersion: v1
            fieldPath: metadata.name
      - name: SERVING_POD_IP
        valueFrom:
          fieldRef:
            apiVersion: v1
            fieldPath: status.podIP
      - name: SERVING_LOGGING_CONFIG
      - name: SERVING_LOGGING_LEVEL
      - name: SERVING_REQUEST_LOG_TEMPLATE
        value: '{"httpRequest": {"requestMethod": "{{.Request.Method}}", "requestUrl":
          "{{js .Request.RequestURI}}", "requestSize": "{{.Request.ContentLength}}",
          "status": {{.Response.Code}}, "responseSize": "{{.Response.Size}}", "userAgent":
          "{{js .Request.UserAgent}}", "remoteIp": "{{js .Request.RemoteAddr}}", "serverIp":
          "{{.Revision.PodIP}}", "referer": "{{js .Request.Referer}}", "latency":
          "{{.Response.Latency}}s", "protocol": "{{.Request.Proto}}"}, "logging.googleapis.com/trace":
          "{{if ge (len (index .Request.Header "X-B3-Traceid")) 1}}{{index (index
          .Request.Header "X-B3-Traceid") 0}}{{else}}{{""}}{{end}}", "cloudrun": true}'
      - name: SERVING_ENABLE_REQUEST_LOG
        value: "false"
      - name: SERVING_REQUEST_METRICS_BACKEND
        value: opencensus
      - name: TRACING_CONFIG_BACKEND
        value: none
      - name: TRACING_CONFIG_ZIPKIN_ENDPOINT
      - name: TRACING_CONFIG_DEBUG
        value: "false"
      - name: TRACING_CONFIG_SAMPLE_RATE
        value: "0.1"
      - name: USER_PORT
        value: "8080"
      - name: SYSTEM_NAMESPACE
        value: knative-serving
      - name: METRICS_DOMAIN
        value: knative.dev/internal/serving
      - name: SERVING_READINESS_PROBE
        value: '{"tcpSocket":{"port":8080,"host":"127.0.0.1"},"successThreshold":1}'
      - name: ENABLE_PROFILING
        value: "false"
      - name: SERVING_ENABLE_PROBE_REQUEST_LOG
        value: "false"
      - name: METRICS_COLLECTOR_ADDRESS
        value: metrics-collector.knative-serving:55678
      - name: CONCURRENCY_STATE_ENDPOINT
      - name: CONCURRENCY_STATE_TOKEN_PATH
        value: /var/run/secrets/tokens/state-token
      - name: HOST_IP
        valueFrom:
          fieldRef:
            apiVersion: v1
            fieldPath: status.hostIP
      - name: ENABLE_HTTP2_AUTO_DETECTION
        value: "false"
      image: gcr.io/kf-releases/knative/queue@sha256:7e42510ddd79ab8a565a5c894535524d31e5360ff3d85d28a97b6a8813f701b6
      imagePullPolicy: IfNotPresent
      name: queue-proxy
      ports:
      - containerPort: 8022
        name: http-queueadm
        protocol: TCP
      - containerPort: 9090
        name: http-autometric
        protocol: TCP
      - containerPort: 9091
        name: http-usermetric
        protocol: TCP
      - containerPort: 8012
        name: queue-port
        protocol: TCP
      - containerPort: 8112
        name: https-port
        protocol: TCP
      readinessProbe:
        failureThreshold: 3
        httpGet:
          httpHeaders:
          - name: K-Network-Probe
            value: queue
          path: /app-health/queue-proxy/readyz
          port: 15020
          scheme: HTTP
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 1
      resources:
        requests:
          cpu: 25m
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
          - all
        readOnlyRootFilesystem: true
        runAsNonRoot: true
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      volumeMounts:
      - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        name: kube-api-access-5zsrg
        readOnly: true
    dnsPolicy: ClusterFirst
    enableServiceLinks: false
    initContainers:
    - args:
      - --v=5
      env:
      - name: NATIVE_SIDECAR
        value: "TRUE"
      image: gke.gcr.io/gcs-fuse-csi-driver-sidecar-mounter:v1.4.2-gke.0@sha256:80c2a52aaa16ee7d9956a4e4afb7442893919300af84ae445ced32ac758c55ad
      imagePullPolicy: IfNotPresent
      name: gke-gcsfuse-sidecar
      resources:
        requests:
          cpu: 250m
          ephemeral-storage: 5Gi
          memory: 256Mi
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
          - ALL
        readOnlyRootFilesystem: true
        runAsGroup: 65534
        runAsNonRoot: true
        runAsUser: 65534
        seccompProfile:
          type: RuntimeDefault
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      volumeMounts:
      - mountPath: /gcsfuse-tmp
        name: gke-gcsfuse-tmp
      - mountPath: /gcsfuse-buffer
        name: gke-gcsfuse-buffer
      - mountPath: /gcsfuse-cache
        name: gke-gcsfuse-cache
      - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        name: kube-api-access-5zsrg
        readOnly: true
    - args:
      - istio-iptables
      - -p
      - "15001"
      - -z
      - "15006"
      - -u
      - "1337"
      - -m
      - REDIRECT
      - -i
      - '*'
      - -x
      - 169.254.169.254/32
      - -b
      - '*'
      - -d
      - 15090,15021,15020
      - --log_output_level=default:info
      - --run-validation
      - --skip-rule-apply
      env:
      - name: CA_PROVIDER
        value: GoogleCA
      - name: CA_ROOT_CA
        value: /etc/ssl/certs/ca-certificates.crt
      - name: CA_TRUSTANCHOR
      - name: FLEET_PROJECT_NUMBER
        value: "380486526345"
      - name: GCP_METADATA
        value: darren2-dev-d0d0|380486526345|asm-cluster-v129-6|us-central1
      - name: OUTPUT_CERTS
        value: /etc/istio/proxy
      - name: PROXY_CONFIG_XDS_AGENT
        value: "true"
      - name: XDS_AUTH_PROVIDER
        value: gcp
      - name: XDS_ROOT_CA
        value: /etc/ssl/certs/ca-certificates.crt
      image: gcr.io/gke-release/asm/proxyv2:1.19.10-asm.6
      imagePullPolicy: IfNotPresent
      name: istio-validation
      resources:
        limits:
          cpu: 500m
          memory: 512Mi
        requests:
          cpu: 500m
          memory: 512Mi
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
          - ALL
        privileged: false
        readOnlyRootFilesystem: true
        runAsGroup: 1337
        runAsNonRoot: true
        runAsUser: 1337
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      volumeMounts:
      - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        name: kube-api-access-5zsrg
        readOnly: true
    nodeName: gk3-asm-cluster-v129-6-nap-wfsyex42-08977827-fxsw
    nodeSelector:
      cloud.google.com/gke-accelerator: nvidia-l4
      cloud.google.com/gke-accelerator-count: "1"
    preemptionPolicy: PreemptLowerPriority
    priority: 0
    restartPolicy: Always
    schedulerName: gke.io/optimize-utilization-scheduler
    securityContext:
      seccompProfile:
        type: RuntimeDefault
    serviceAccount: bookendinference
    serviceAccountName: bookendinference
    terminationGracePeriodSeconds: 120
    tolerations:
    - effect: NoSchedule
      key: kubernetes.io/arch
      operator: Equal
      value: amd64
    - effect: NoSchedule
      key: cloud.google.com/gke-accelerator
      operator: Equal
      value: nvidia-l4
    - effect: NoSchedule
      key: cloud.google.com/machine-family
      operator: Exists
    - effect: NoExecute
      key: node.kubernetes.io/not-ready
      operator: Exists
      tolerationSeconds: 300
    - effect: NoExecute
      key: node.kubernetes.io/unreachable
      operator: Exists
      tolerationSeconds: 300
    - effect: NoSchedule
      key: nvidia.com/gpu
      operator: Exists
    volumes:
    - emptyDir: {}
      name: gke-gcsfuse-tmp
    - emptyDir: {}
      name: gke-gcsfuse-buffer
    - emptyDir: {}
      name: gke-gcsfuse-cache
    - emptyDir: {}
      name: workload-socket
    - emptyDir: {}
      name: credential-socket
    - emptyDir: {}
      name: workload-certs
    - emptyDir:
        medium: Memory
      name: istio-envoy
    - emptyDir: {}
      name: istio-data
    - downwardAPI:
        defaultMode: 420
        items:
        - fieldRef:
            apiVersion: v1
            fieldPath: metadata.labels
          path: labels
        - fieldRef:
            apiVersion: v1
            fieldPath: metadata.annotations
          path: annotations
      name: istio-podinfo
    - name: istio-token
      projected:
        defaultMode: 420
        sources:
        - serviceAccountToken:
            audience: darren2-dev-d0d0.svc.id.goog
            expirationSeconds: 43200
            path: istio-token
    - emptyDir:
        medium: Memory
      name: dshm
    - name: model-weights
      persistentVolumeClaim:
        claimName: gcs-fuse-csi-static-pvc
    - name: kube-api-access-5zsrg
      projected:
        defaultMode: 420
        sources:
        - serviceAccountToken:
            expirationSeconds: 3607
            path: token
        - configMap:
            items:
            - key: ca.crt
              path: ca.crt
            name: kube-root-ca.crt
        - downwardAPI:
            items:
            - fieldRef:
                apiVersion: v1
                fieldPath: metadata.namespace
              path: namespace
  status:
    conditions:
    - lastProbeTime: null
      lastTransitionTime: "2024-08-06T15:34:34Z"
      status: "True"
      type: PodReadyToStartContainers
    - lastProbeTime: null
      lastTransitionTime: "2024-08-06T15:34:31Z"
      message: 'containers with incomplete status: [gke-gcsfuse-sidecar istio-validation]'
      reason: ContainersNotInitialized
      status: "False"
      type: Initialized
    - lastProbeTime: null
      lastTransitionTime: "2024-08-06T15:34:31Z"
      message: 'containers with unready status: [istio-proxy vllm-server queue-proxy]'
      reason: ContainersNotReady
      status: "False"
      type: Ready
    - lastProbeTime: null
      lastTransitionTime: "2024-08-06T15:34:31Z"
      message: 'containers with unready status: [istio-proxy vllm-server queue-proxy]'
      reason: ContainersNotReady
      status: "False"
      type: ContainersReady
    - lastProbeTime: null
      lastTransitionTime: "2024-08-06T15:34:31Z"
      status: "True"
      type: PodScheduled
    containerStatuses:
    - image: gcr.io/gke-release/asm/proxyv2:1.19.10-asm.6
      imageID: ""
      lastState: {}
      name: istio-proxy
      ready: false
      restartCount: 0
      started: false
      state:
        waiting:
          reason: PodInitializing
    - image: gcr.io/kf-releases/knative/queue@sha256:7e42510ddd79ab8a565a5c894535524d31e5360ff3d85d28a97b6a8813f701b6
      imageID: ""
      lastState: {}
      name: queue-proxy
      ready: false
      restartCount: 0
      started: false
      state:
        waiting:
          reason: PodInitializing
    - image: us-docker.pkg.dev/command-center-alpha/deployment/vllm@sha256:186b710188aa218fbbd6972fddca21d984d224a3289e5106fd960dde142babce
      imageID: ""
      lastState: {}
      name: vllm-server
      ready: false
      restartCount: 0
      started: false
      state:
        waiting:
          reason: PodInitializing
    hostIP: 10.128.0.31
    initContainerStatuses:
    - containerID: containerd://cd3065e71cf9fc170a94b2776d27526ef0dfe83a5e8233fac1c323a8ed538204
      image: sha256:a5773a634b88b649e0268ca813f6418db2595b5538e6f5398e0c5639ca675751
      imageID: gke.gcr.io/gcs-fuse-csi-driver-sidecar-mounter@sha256:80c2a52aaa16ee7d9956a4e4afb7442893919300af84ae445ced32ac758c55ad
      lastState: {}
      name: gke-gcsfuse-sidecar
      ready: false
      restartCount: 0
      started: true
      state:
        running:
          startedAt: "2024-08-06T15:34:33Z"
    - image: gcr.io/gke-release/asm/proxyv2:1.19.10-asm.6
      imageID: ""
      lastState: {}
      name: istio-validation
      ready: false
      restartCount: 0
      started: false
      state:
        waiting:
          reason: PodInitializing
    phase: Pending
    podIP: 10.82.128.5
    podIPs:
    - ip: 10.82.128.5
    qosClass: Burstable
    startTime: "2024-08-06T15:34:31Z"
kind: List
metadata:
  resourceVersion: ""

GCSFuse container logs:

Running Google Cloud Storage FUSE CSI driver sidecar mounter version v1.4.2-gke.0
connecting to socket "/gcsfuse-tmp/.volumes/gcs-fuse-csi-pv/socket"
get the underlying socket
calling recvmsg...
parsing SCM...
parsing SCM_RIGHTS...
gcsfuse config file content: map[cache-dir:/gcsfuse-cache/.volumes/gcs-fuse-csi-pv file-cache:map[cache-file-for-range-read:true max-size-mb:-1] logging:map[file-path:/dev/fd/1 format:json severity:warning]]
start to mount bucket "darren2-dev-d0d0" for volume "gcs-fuse-csi-pv"
gcsfuse mounting with args [--temp-dir /gcsfuse-buffer/.volumes/gcs-fuse-csi-pv/temp-dir --config-file /gcsfuse-tmp/.volumes/gcs-fuse-csi-pv/config.yaml --implicit-dirs --app-name gke-gcs-fuse-csi --foreground --uid 0 --gid 0 darren2-dev-d0d0 /dev/fd/3]...
waiting for SIGTERM signal...
gcsfuse for bucket "darren2-dev-d0d0", volume "gcs-fuse-csi-pv" started with process id 19

songjiaxun commented 1 month ago

The gke-gcsfuse-sidecar is a native sidecar container, which should be an init container with restartPolicy: Always. But for some reason, the restartPolicy is missing from the spec. There may be other webhooks configured on this cluster that removed the restartPolicy.
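
A quick way to spot this condition in a Pod manifest is sketched below. It is only a heuristic: it assumes the Pod has already been loaded into a Python dict (for example from `kubectl get pod -o json`), and it treats the `NATIVE_SIDECAR=TRUE` environment variable seen in the spec above as the marker for a container that was meant to be a native sidecar. The function name is hypothetical and not part of the driver.

```python
def missing_sidecar_restart_policy(pod):
    """Return names of init containers that look like native sidecars
    but do not carry restartPolicy: Always."""
    flagged = []
    for container in pod.get("spec", {}).get("initContainers", []):
        env = {e["name"]: e.get("value") for e in container.get("env", [])}
        # Assumption: NATIVE_SIDECAR=TRUE marks an intended native sidecar,
        # as in the Pod spec posted in this issue.
        intended_sidecar = (env.get("NATIVE_SIDECAR") or "").upper() == "TRUE"
        if intended_sidecar and container.get("restartPolicy") != "Always":
            flagged.append(container["name"])
    return flagged


# Trimmed-down version of the init containers from the Pod spec above.
pod = {
    "spec": {
        "initContainers": [
            {
                "name": "gke-gcsfuse-sidecar",
                "env": [{"name": "NATIVE_SIDECAR", "value": "TRUE"}],
            },
            {
                "name": "istio-validation",
                "env": [{"name": "CA_PROVIDER", "value": "GoogleCA"}],
            },
        ]
    }
}

print(missing_sidecar_restart_policy(pod))  # ['gke-gcsfuse-sidecar']
```

An empty list would mean every init container that looks like a native sidecar still has its restart policy.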

Can you share the cluster ID with me? You can get it by running gcloud container clusters describe <cluster-name> --location <cluster-location> | grep id:. Thanks!

zaphod72 commented 1 month ago

Thanks @songjiaxun Cluster id: 049c60badca8467abfa1901253886a0e9c543c4b71d549439fb968273a2751e4

songjiaxun commented 1 month ago

Checked the Pod creation audit log using the following query:

"vllm-service-fea6900001db78d227f60296bb6cc1ab7e1110e-deplo9fd8x"
"pods.create"
logName="projects/darren2-dev-d0d0/logs/cloudaudit.googleapis.com%2Factivity"

I see the sidecar gke-gcsfuse-sidecar was modified by the knative webhook -- the webhook did remove the restartPolicy: Always:

patch.webhook.admission.k8s.io/round_1_index_5: "{"configuration":"istio-inject.webhook.crfa.internal.knative.dev","webhook":"istio-inject.webhook.crfa.internal.knative.dev","patch":[{"op":"remove","path":"/spec/initContainers/0/restartPolicy"}],"patchType":"JSONPatch"}"

We are seeing similar issues from other users. The cause of this issue is that the native sidecar feature is not recognized by some webhooks, so the incompatible webhook will remove the restartPolicy: Always from the init sidecar container, making it block the regular container initialization.
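
To make the effect of that patch concrete, here is a minimal sketch that applies the JSON Patch `remove` operation from the audit log to a trimmed-down Pod dict. The helper `apply_remove` is hypothetical and handles only the `remove` operation on simple paths; it is not how the API server applies patches.

```python
import copy


def apply_remove(doc, path):
    """Apply a JSON Patch 'remove' op to a copy of doc and return the copy."""
    # Split the JSON Pointer into tokens and undo its escape sequences.
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in path.split("/")[1:]]
    result = copy.deepcopy(doc)
    parent = result
    for token in tokens[:-1]:
        parent = parent[int(token)] if isinstance(parent, list) else parent[token]
    last = tokens[-1]
    if isinstance(parent, list):
        del parent[int(last)]
    else:
        del parent[last]
    return result


# What the GCS FUSE webhook injected (other fields omitted).
before = {
    "spec": {
        "initContainers": [
            {"name": "gke-gcsfuse-sidecar", "restartPolicy": "Always"},
            {"name": "istio-validation"},
        ]
    }
}

# The patch recorded in the audit log above.
patch = [{"op": "remove", "path": "/spec/initContainers/0/restartPolicy"}]

after = before
for op in patch:
    after = apply_remove(after, op["path"])

print(after["spec"]["initContainers"][0])  # {'name': 'gke-gcsfuse-sidecar'}
```

Without restartPolicy: Always, Kubernetes treats gke-gcsfuse-sidecar as an ordinary init container that must exit before the next one starts. Because the sidecar keeps running until it receives SIGTERM, istio-validation stays in PodInitializing.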

Workaround 1

A quick workaround is to add a new node pool running 1.28 nodes to the cluster. You can use the smallest node size and add just one node. Then redeploy your workload. The webhook will inject the gke-gcsfuse-sidecar container as a regular container, so you don't need to change your workload spec. Unfortunately, you will be billed for this new node.

gcloud container --project "<your-project>" node-pools create "pool-dummy" --cluster "<your-cluster-name>" --location "<your-cluster-location>" --node-version "1.28" --machine-type "e2-micro" --image-type "COS_CONTAINERD" --disk-type "pd-standard" --disk-size "10" --num-nodes "1"

Workaround 2

You can manually inject the gke-gcsfuse-sidecar container into your workload as a regular container and add the three auxiliary volumes it needs. Also remove the annotation gke-gcsfuse/volumes: "true", then redeploy your workload.

apiVersion: v1
kind: Pod
metadata:
  name: test
  annotations:
    # gke-gcsfuse/volumes: "true" <- remove this annotation
spec:
  containers:
  # add the gke-gcsfuse-sidecar BEFORE your workload container
  - args:
    - --v=5
    image: gke.gcr.io/gcs-fuse-csi-driver-sidecar-mounter:v1.4.2-gke.0@sha256:80c2a52aaa16ee7d9956a4e4afb7442893919300af84ae445ced32ac758c55ad
    imagePullPolicy: IfNotPresent
    name: gke-gcsfuse-sidecar
    resources:
      requests:
        cpu: 250m
        ephemeral-storage: 5Gi
        memory: 256Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      readOnlyRootFilesystem: true
      runAsGroup: 65534
      runAsNonRoot: true
      runAsUser: 65534
      seccompProfile:
        type: RuntimeDefault
    volumeMounts:
    - mountPath: /gcsfuse-tmp
      name: gke-gcsfuse-tmp
    - mountPath: /gcsfuse-buffer
      name: gke-gcsfuse-buffer
    - mountPath: /gcsfuse-cache
      name: gke-gcsfuse-cache
  - name: your-workload
  ...
  volumes:
  # add the following three volumes
  - emptyDir: {}
    name: gke-gcsfuse-tmp
  - emptyDir: {}
    name: gke-gcsfuse-buffer
  - emptyDir: {}
    name: gke-gcsfuse-cache
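
The manifest above can be sanity-checked before deploying. The sketch below assumes the Pod has been parsed into a Python dict and checks only the points called out in this workaround: the annotation is gone, the sidecar is a regular container listed before the workload, and the three volumes exist. The function name and the `your-workload` container name are placeholders.

```python
SIDECAR = "gke-gcsfuse-sidecar"
REQUIRED_VOLUMES = ("gke-gcsfuse-tmp", "gke-gcsfuse-buffer", "gke-gcsfuse-cache")


def check_manual_injection(pod, workload):
    """Return a list of problems found in a manually injected Pod spec."""
    problems = []
    annotations = pod.get("metadata", {}).get("annotations") or {}
    if annotations.get("gke-gcsfuse/volumes") == "true":
        problems.append("annotation gke-gcsfuse/volumes is still set")

    spec = pod.get("spec", {})
    names = [c["name"] for c in spec.get("containers", [])]
    if SIDECAR not in names:
        problems.append("sidecar is not listed as a regular container")
    elif workload in names and names.index(SIDECAR) > names.index(workload):
        problems.append("sidecar is listed after the workload container")

    volumes = {v["name"] for v in spec.get("volumes", [])}
    for required in REQUIRED_VOLUMES:
        if required not in volumes:
            problems.append(f"missing volume {required}")
    return problems


# Skeleton of the workaround manifest; container details are omitted.
pod = {
    "metadata": {"name": "test", "annotations": None},
    "spec": {
        "containers": [{"name": SIDECAR}, {"name": "your-workload"}],
        "volumes": [{"name": v, "emptyDir": {}} for v in REQUIRED_VOLUMES],
    },
}

print(check_manual_injection(pod, "your-workload"))  # []
```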

Long-term fix

We are actively working on fixing this issue.

zaphod72 commented 1 month ago

Thank you - workaround 2 (injecting the gke-gcsfuse-sidecar container as a regular container) is working :)

tatsuya-yokoyama commented 3 weeks ago

Hi, I encountered the same issue, and I was able to solve it using workaround 2 mentioned here, thanks.

I have a quick question: why does workaround 1 use v1.28? As I understand it, the native sidecar container feature was introduced in v1.28 (ref), so I would have expected workaround 1 to need an older version. FYI: my service worked without issues on v1.28, but after upgrading to v1.29 I ran into this issue. In my case, the Datadog webhook removed restartPolicy: Always.