m-lab / etl

M-Lab ingestion pipeline
Apache License 2.0

Prod deployment failed with new k8s parser rule #931

Open gfr10598 opened 4 years ago

gfr10598 commented 4 years ago

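Looks like etl-travis-deploy@mlab-oti.iam.gserviceaccount.com needs to have the cloud-kubernetes-deployer (custom?) role. With the new k8s parser rule, `kubectl apply` is rejected with Forbidden errors for the deployment, both storage classes, and the service. Log from the failed prod deploy: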

+set -e
+set -u
+USAGE='/home/travis/gopath/src/github.com/m-lab/etl/travis/kubectl.sh <project> <cluster> <command>'
+PROJECT=mlab-oti
+CLUSTER=data-processing
+shift 2
+source /home/travis/google-cloud-sdk/path.bash.inc
+++command readlink /home/travis/google-cloud-sdk/path.bash.inc
+++readlink /home/travis/google-cloud-sdk/path.bash.inc
++script_link=
++script_link=/home/travis/google-cloud-sdk/path.bash.inc
++apparent_sdk_dir=/home/travis/google-cloud-sdk
++'[' /home/travis/google-cloud-sdk == /home/travis/google-cloud-sdk/path.bash.inc ']'
+++command cd -P /home/travis/google-cloud-sdk
+++cd -P /home/travis/google-cloud-sdk
+++command pwd -P
+++pwd -P
++sdk_dir=/home/travis/google-cloud-sdk
++bin_path=/home/travis/google-cloud-sdk/bin
++[[ :/home/travis/.rvm/gems/ruby-2.4.9/bin:/home/travis/.rvm/gems/ruby-2.4.9@global/bin:/home/travis/.rvm/rubies/ruby-2.4.9/bin:/home/travis/.rvm/bin:/home/travis/google-cloud-sdk/bin:/home/travis/gopath/bin:/home/travis/.gimme/versions/go1.13.8.linux.amd64/bin:/home/travis/bin:/home/travis/bin:/home/travis/.local/bin:/usr/local/lib/jvm/openjdk11/bin:/opt/pyenv/shims:/home/travis/.phpenv/shims:/home/travis/perl5/perlbrew/bin:/home/travis/.nvm/versions/node/v10.16.0/bin:/home/travis/.kiex/elixirs/elixir-1.7.4/bin:/home/travis/.kiex/bin:/home/travis/gopath/bin:/home/travis/.gimme/versions/go1.11.1.linux.amd64/bin:/usr/local/maven-3.6.3/bin:/usr/local/cmake-3.12.4/bin:/usr/local/clang-7.0.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin:/home/travis/.phpenv/bin:/opt/pyenv/bin:/home/travis/.yarn/bin:/home/travis/gopath/bin: != *\:\/\h\o\m\e\/\t\r\a\v\i\s\/\g\o\o\g\l\e\-\c\l\o\u\d\-\s\d\k\/\b\i\n\:* ]]
++dirname /home/travis/gopath/src/github.com/m-lab/etl/travis/kubectl.sh
+source /home/travis/gopath/src/github.com/m-lab/etl/travis/gcloudlib.sh
+KEYNAME=SERVICE_ACCOUNT_mlab_oti
+activate_service_account SERVICE_ACCOUNT_mlab_oti
+local keyname=SERVICE_ACCOUNT_mlab_oti
++mktemp
+local keyfile=/tmp/tmp.DYzuVihtiC
+set +x
+gcloud auth activate-service-account --key-file /tmp/tmp.DYzuVihtiC
Activated service account credentials for: [etl-travis-deploy@mlab-oti.iam.gserviceaccount.com]
+rm -f /tmp/tmp.DYzuVihtiC
+gcloud config set core/project mlab-oti
Updated property [core/project].
+gcloud config set core/disable_prompts true
Updated property [core/disable_prompts].
+gcloud config set core/verbosity info
Updated property [core/verbosity].
++gcloud container clusters list '--format=table[no-heading](location)' --filter 'name='\''data-processing'\'''
INFO: Display format: "
    table(
        name,
        zone:label=LOCATION,
        master_version():label=MASTER_VERSION,
        endpoint:label=MASTER_IP,
        nodePools[0].config.machineType,
        currentNodeVersion:label=NODE_VERSION,
        firstof(currentNodeCount,initialNodeCount):label=NUM_NODES,
        status
    )
 table[no-heading](location)"
WARNING: --filter : operator evaluation is changing for consistency across Google APIs.  name=data-processing currently does not match but will match in the near future.  Run `gcloud topic filters` for details.
+LOCATION=us-central1
+[[ -z us-central1 ]]
+zone='.*-[a-z]'
+region='.*[a-z][1-9]'
+[[ us-central1 =~ .*-[a-z] ]]
+gcloud container clusters get-credentials data-processing --zone us-central1
Fetching cluster endpoint and auth data.
kubeconfig entry generated for data-processing.
INFO: Display format: "default"
+export PROJECT
+export CLUSTER
+./apply-cluster.sh
+set -e
+set -u
+USAGE='PROJECT=<projectid> CLUSTER=<cluster> TRAVIS_TAG=<tag> TRAVIS_COMMIT=<commit> ./apply-cluster.sh'
+PROJECT=mlab-oti
+CLUSTER=data-processing
+TRAVIS_TAG=prod-v2.5.0
+BIGQUERY_DATASET=tmp_ndt
+TRAVIS_COMMIT=3bdb0e9923a3598e13196b71f56da760e277b9cf
+CFG=/tmp/data-processing-mlab-oti.yml
+touch /tmp/data-processing-mlab-oti.yml
+pwd
/home/travis/gopath/src/github.com/m-lab/etl
+kexpand expand --ignore-missing-keys k8s/data-processing/deployments/parser.yml k8s/data-processing/persistentvolumes/storage-class.yml k8s/data-processing/services/parser.yml --value GCLOUD_PROJECT=mlab-oti --value RELEASE_TAG=prod-v2.5.0 --value GIT_COMMIT=3bdb0e9923a3598e13196b71f56da760e277b9cf --value BIGQUERY_DATASET=tmp_ndt
+cat /tmp/data-processing-mlab-oti.yml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: etl-parser
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      # Used to match pre-existing pods that may be affected during updates.
      run: etl-parser
  strategy:
    rollingUpdate:
      maxSurge: 3
      maxUnavailable: 2
    type: RollingUpdate
  # Pod template.
  template:
    metadata:
      labels:
        # Note: run=etl-parser should match a service config with a public IP
        # and port so that it is publicly accessible for prometheus scraping
        run: etl-parser
      annotations:
        # Tell prometheus service discovery to collect metrics from the containers.
        prometheus.io/scrape: 'true'
    spec:
      # When container receives SIGTERM, it begins a new checkpoint. This can
      # take longer than the default grace period of 30s.
      terminationGracePeriodSeconds: 120
      # Place the pod into the Guaranteed QoS by setting equal resource
      # requests and limits for *all* containers in the pod.
      # For more background, see:
      # https://github.com/kubernetes/community/blob/master/contributors/design-proposals/resource-qos.md
      containers:
      - image: gcr.io/mlab-oti/github.com/m-lab/etl:3bdb0e9923a3598e13196b71f56da760e277b9cf
        name: etl-parser
        args: ["--prometheusx.listen-address=:9090",
               "--output=gcs",
               "--service_port=:8080",  # If we move to jsonnet, this could be bound to service-port defined below
               "--max_active=100",
               ]
        env:
        - name: RELEASE_TAG
          value: prod-v2.5.0
        - name: GIT_COMMIT
          value: "3bdb0e9923a3598e13196b71f56da760e277b9cf"
        - name: GCLOUD_PROJECT
          value: "mlab-oti"
        - name: BIGQUERY_DATASET
          value: "tmp_ndt"
        - name: GARDENER_HOST
          value: "etl-gardener-service.default.svc.cluster.local"
        - name: BATCH_SERVICE
          value: 'true'   # Allow instances to discover they are BATCH instances.
        - name: MAX_WORKERS
          value: '10' # Singleton workers, in addition to the active workers.
        - name: NDT_OMIT_DELTAS
          value: 'true'
        ports:
        - name: prometheus-port
          containerPort: 9090
        - name: service-port
          containerPort: 8080
        livenessProbe:
          httpGet:
            path: /alive
            port: service-port
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 4
          successThreshold: 1
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /ready
            port: service-port
        resources:
          requests:
            memory: "15Gi"
            cpu: "7"
          limits:
            memory: "20Gi"
            cpu: "7"
      nodeSelector:
        parser-node: "true"
---
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: slow
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
---
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
---
apiVersion: v1
kind: Service
metadata:
  name: etl-parser
  namespace: default
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
spec:
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    run: etl-parser
  sessionAffinity: None
  type: LoadBalancer
+kubectl apply -f /tmp/data-processing-mlab-oti.yml
Error from server (Forbidden): error when retrieving current configuration of:
Resource: "extensions/v1beta1, Resource=deployments", GroupVersionKind: "extensions/v1beta1, Kind=Deployment"
Name: "etl-parser", Namespace: "default"
Object: &{map["apiVersion":"extensions/v1beta1" "kind":"Deployment" "metadata":map["annotations":map["kubectl.kubernetes.io/last-applied-configuration":""] "name":"etl-parser" "namespace":"default"] "spec":map["replicas":'\x02' "selector":map["matchLabels":map["run":"etl-parser"]] "strategy":map["rollingUpdate":map["maxSurge":'\x03' "maxUnavailable":'\x02'] "type":"RollingUpdate"] "template":map["metadata":map["annotations":map["prometheus.io/scrape":"true"] "labels":map["run":"etl-parser"]] "spec":map["containers":[map["args":["--prometheusx.listen-address=:9090" "--output=gcs" "--service_port=:8080" "--max_active=100"] "env":[map["name":"RELEASE_TAG" "value":"prod-v2.5.0"] map["name":"GIT_COMMIT" "value":"3bdb0e9923a3598e13196b71f56da760e277b9cf"] map["name":"GCLOUD_PROJECT" "value":"mlab-oti"] map["name":"BIGQUERY_DATASET" "value":"tmp_ndt"] map["name":"GARDENER_HOST" "value":"etl-gardener-service.default.svc.cluster.local"] map["name":"BATCH_SERVICE" "value":"true"] map["name":"MAX_WORKERS" "value":"10"] map["name":"NDT_OMIT_DELTAS" "value":"true"]] "image":"gcr.io/mlab-oti/github.com/m-lab/etl:3bdb0e9923a3598e13196b71f56da760e277b9cf" "livenessProbe":map["failureThreshold":'\x03' "httpGet":map["path":"/alive" "port":"service-port"] "initialDelaySeconds":'\x1e' "periodSeconds":'\n' "successThreshold":'\x01' "timeoutSeconds":'\x04'] "name":"etl-parser" "ports":[map["containerPort":'\u2382' "name":"prometheus-port"] map["containerPort":'\u1f90' "name":"service-port"]] "readinessProbe":map["httpGet":map["path":"/ready" "port":"service-port"]] "resources":map["limits":map["cpu":"7" "memory":"20Gi"] "requests":map["cpu":"7" "memory":"15Gi"]]]] "nodeSelector":map["parser-node":"true"] "terminationGracePeriodSeconds":'x']]]]}
from server for: "/tmp/data-processing-mlab-oti.yml": deployments.extensions "etl-parser" is forbidden: User "etl-travis-deploy@mlab-oti.iam.gserviceaccount.com" cannot get resource "deployments" in API group "extensions" in the namespace "default": requires one of ["container.deployments.get"] permission(s).
Error from server (Forbidden): error when retrieving current configuration of:
Resource: "storage.k8s.io/v1beta1, Resource=storageclasses", GroupVersionKind: "storage.k8s.io/v1beta1, Kind=StorageClass"
Name: "slow", Namespace: ""
Object: &{map["apiVersion":"storage.k8s.io/v1beta1" "kind":"StorageClass" "metadata":map["annotations":map["kubectl.kubernetes.io/last-applied-configuration":""] "name":"slow"] "parameters":map["type":"pd-standard"] "provisioner":"kubernetes.io/gce-pd"]}
from server for: "/tmp/data-processing-mlab-oti.yml": storageclasses.storage.k8s.io "slow" is forbidden: User "etl-travis-deploy@mlab-oti.iam.gserviceaccount.com" cannot get resource "storageclasses" in API group "storage.k8s.io" at the cluster scope: requires one of ["container.storageClasses.get"] permission(s).
Error from server (Forbidden): error when retrieving current configuration of:
Resource: "storage.k8s.io/v1beta1, Resource=storageclasses", GroupVersionKind: "storage.k8s.io/v1beta1, Kind=StorageClass"
Name: "fast", Namespace: ""
Object: &{map["apiVersion":"storage.k8s.io/v1beta1" "kind":"StorageClass" "metadata":map["annotations":map["kubectl.kubernetes.io/last-applied-configuration":""] "name":"fast"] "parameters":map["type":"pd-ssd"] "provisioner":"kubernetes.io/gce-pd"]}
from server for: "/tmp/data-processing-mlab-oti.yml": storageclasses.storage.k8s.io "fast" is forbidden: User "etl-travis-deploy@mlab-oti.iam.gserviceaccount.com" cannot get resource "storageclasses" in API group "storage.k8s.io" at the cluster scope: requires one of ["container.storageClasses.get"] permission(s).
Error from server (Forbidden): error when retrieving current configuration of:
Resource: "/v1, Resource=services", GroupVersionKind: "/v1, Kind=Service"
Name: "etl-parser", Namespace: "default"
Object: &{map["apiVersion":"v1" "kind":"Service" "metadata":map["annotations":map["cloud.google.com/load-balancer-type":"Internal" "kubectl.kubernetes.io/last-applied-configuration":""] "name":"etl-parser" "namespace":"default"] "spec":map["ports":[map["port":'\u1f90' "protocol":"TCP" "targetPort":'\u1f90']] "selector":map["run":"etl-parser"] "sessionAffinity":"None" "type":"LoadBalancer"]]}
from server for: "/tmp/data-processing-mlab-oti.yml": services "etl-parser" is forbidden: User "etl-travis-deploy@mlab-oti.iam.gserviceaccount.com" cannot get resource "services" in API group "" in the namespace "default": requires one of ["container.services.get"] permission(s).
Already up to date!
Not currently on any branch.
Untracked files:
  (use "git add <file>..." to include in what will be committed)
    active.cov
    annotation.cov
    appengine_queue_pusher.cov
    bq.cov
    cmd/etl_worker/etl_worker
    cmd/update-schema/update-schema
    etl.cov
    functions/.gcloudignore
    functions/embargo/.gcloudignore
    functions/embargo/node_modules/
    functions/embargo/package-lock.json
    functions/node_modules/
    functions/package-lock.json
    merge.cov
    metrics.cov
    parser.cov
    parser/testdata/PT/
    parser/testdata/sidestream/
    parser/testdata/web100/
    schema.cov
    task.cov
    travis-testing.key
    web100.cov
    web100/testdata/web100/
nothing added to commit but untracked files present (use "git add" to track)
Dropped refs/stash@{0} (d3bd7524fb01e93391ef7af5e22b21ccb009ba1e)
Script failed with status 1
failed to deploy
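
The Forbidden lines above each end with the permission the service account is missing. As a rough aid for reading logs like this one, the sketch below pulls out the distinct permission names. `missing_permissions` is a hypothetical helper, not part of the etl repo, and it assumes the `requires one of [...]` wording stays as it appears in this log.

```shell
#!/bin/sh
# Hypothetical helper (not part of m-lab/etl): read a deploy log on stdin and
# print each distinct permission named in a 'requires one of [...]' message.
missing_permissions() {
  # 1. Keep only the 'requires one of ["..."]' fragment of each line.
  # 2. Split out every quoted name inside the brackets.
  # 3. Drop the quotes and de-duplicate.
  grep -o 'requires one of \["[^]]*"]' \
    | grep -o '"[^"]*"' \
    | tr -d '"' \
    | sort -u
}
```

For example, `missing_permissions < deploy.log` (where `deploy.log` is a saved copy of the output above) should list the three permissions named in this log: `container.deployments.get`, `container.services.get`, and `container.storageClasses.get`.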

gfr10598 commented 4 years ago

Manually added the role in mlab-oti using the console.
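
For reference, the same kind of change can be made from the command line instead of the console. This is only a sketch: `grant_deploy_role` is a hypothetical wrapper, and the role ID passed to it is an assumption, since the thread does not give the exact ID of the cloud-kubernetes-deployer role. It prints the `gcloud` command by default and only runs it when `DRY_RUN=0`.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of m-lab/etl): bind a role to a service
# account at the project level with 'gcloud projects add-iam-policy-binding'.
# Args: <project> <service-account-email> <role-id>
# Note: plain POSIX sh, so these variables are not function-local.
grant_deploy_role() {
  project="$1"
  account="$2"
  role="$3"   # Assumption: the caller knows the real role ID.
  set -- gcloud projects add-iam-policy-binding "$project" \
    "--member=serviceAccount:${account}" "--role=${role}"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Default: only show what would be run.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

A dry run might look like `grant_deploy_role mlab-oti etl-travis-deploy@mlab-oti.iam.gserviceaccount.com roles/container.developer`, where `roles/container.developer` is just a stand-in for the actual role. After the binding is in place, `kubectl auth can-i get deployments --namespace default`, run with the service account's credentials, is one way to confirm access before retrying the deploy.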