Panfactum / stack

[question]: linkerd-proxy not exiting success even though argo workflow exits 0 #148

Closed wesbragagt closed 1 month ago

wesbragagt commented 1 month ago

What is your question?

I have an Argo workflow cron job whose main container exits with code 0, but the linkerd-proxy sidecar does not complete, which causes the whole workflow to fail.

  1. Can I opt out of Linkerd for workflows?
  2. I tried the pod annotation for graceful shutdown described in https://github.com/linkerd/linkerd2/issues/8912, but couldn't get it to work.
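
For context on item 2, one approach commonly used for job-style pods is to have the workload ask the proxy to exit once its own work is done, by sending an empty POST to the proxy's admin server. The snippet below is only a minimal sketch of that idea, not something confirmed to work in this setup: it assumes the admin server is reachable on 127.0.0.1:4191 from the main container (the pod output lists LINKERD2_PROXY_ADMIN_LISTEN_ADDR as [::]:4191), and depending on the Linkerd version the shutdown endpoint may need to be enabled explicitly. The helper names `build_shutdown_request` and `shutdown_proxy` are hypothetical.

```python
import urllib.request

# Assumption: the proxy admin server is reachable on localhost:4191
# from inside the pod's network namespace.
ADMIN_SHUTDOWN_URL = "http://127.0.0.1:4191/shutdown"


def build_shutdown_request(url: str = ADMIN_SHUTDOWN_URL) -> urllib.request.Request:
    """Build the empty POST request that asks linkerd-proxy to exit."""
    return urllib.request.Request(url, data=b"", method="POST")


def shutdown_proxy(timeout: float = 5.0) -> bool:
    """Hypothetical helper: return True if the proxy accepted the request."""
    try:
        with urllib.request.urlopen(build_shutdown_request(), timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # Covers connection refused, timeouts, and HTTP error responses
        # (urllib.error.URLError is a subclass of OSError).
        return False
```

A script such as /app/netsuiteData.py could call `shutdown_proxy()` as its last step, ignoring a False result so that a missing or disabled endpoint does not change the job's own exit code.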

What primary components of the stack does this relate to?

terraform

wesbragagt commented 1 month ago

kubectl describe output for the pod:

Name:                 tct-suite-api-8d6p4
Namespace:            implentio
Priority:             0
Priority Class Name:  default
Service Account:      tct-suite-api-b6a7d322e01b51ec
Node:                 ip-10-0-190-175.us-west-2.compute.internal/10.0.190.175
Start Time:           Tue, 01 Oct 2024 16:25:19 -0500
Labels:               linkerd.io/control-plane-ns=linkerd
                      linkerd.io/workload-ns=implentio
                      workflows.argoproj.io/completed=true
                      workflows.argoproj.io/workflow=tct-suite-api-8d6p4
Annotations:          kubectl.kubernetes.io/default-container: main
                      linkerd.io/created-by: linkerd/proxy-injector edge-24.5.1
                      linkerd.io/inject: enabled
                      linkerd.io/proxy-version: edge-24.5.1
                      linkerd.io/trust-root-sha256: 7ffb0b03cd0909363d64ff5fd8f5b19f37994ffdfad7e6552b73a113bf97711b
                      workflows.argoproj.io/node-id: tct-suite-api-8d6p4
                      workflows.argoproj.io/node-name: tct-suite-api-8d6p4
Status:               Succeeded
IP:                   10.0.176.148
IPs:
  IP:           10.0.176.148
Controlled By:  Workflow/tct-suite-api-8d6p4
Init Containers:
  linkerd-init:
    Container ID:    containerd://c32b93db5b02e543587d0698ff8f76198df909d4499b54585d2b848a8719606e
    Image:           590183845935.dkr.ecr.us-west-2.amazonaws.com/github/linkerd/proxy-init:v2.4.0
    Image ID:        590183845935.dkr.ecr.us-west-2.amazonaws.com/github/linkerd/proxy-init@sha256:5bd804267a4e0b585c5e6e1e1cbf5d91887ed73be84e35fe784df2331b6e9c61
    Port:            <none>
    Host Port:       <none>
    SeccompProfile:  RuntimeDefault
    Args:
      --incoming-proxy-port
      4143
      --outgoing-proxy-port
      4140
      --proxy-uid
      2102
      --inbound-ports-to-ignore
      4190,4191,4567,4568
      --outbound-ports-to-ignore
      4567,4568
      --log-format
      json
      --log-level
      warn
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 01 Oct 2024 16:25:20 -0500
      Finished:     Tue, 01 Oct 2024 16:25:20 -0500
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  10Mi
    Requests:
      cpu:     10m
      memory:  10Mi
    Environment:
      AWS_STS_REGIONAL_ENDPOINTS:   regional
      AWS_DEFAULT_REGION:           us-west-2
      AWS_REGION:                   us-west-2
      AWS_ROLE_ARN:                 arn:aws:iam::590183845935:role/tct-suite-api-b6a7d322e01b51ec-20240920213059807400000003
      AWS_WEB_IDENTITY_TOKEN_FILE:  /var/run/secrets/eks.amazonaws.com/serviceaccount/token
    Mounts:
      /run from linkerd-proxy-init-xtables-lock (rw)
      /var/run/secrets/eks.amazonaws.com/serviceaccount from aws-iam-token (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hs877 (ro)
  linkerd-proxy:
    Container ID:    containerd://edae765e8bc800ada057742b28bce3a13ad11318bb650f534f453635dbaf2468
    Image:           590183845935.dkr.ecr.us-west-2.amazonaws.com/github/linkerd/proxy:edge-24.5.1
    Image ID:        590183845935.dkr.ecr.us-west-2.amazonaws.com/github/linkerd/proxy@sha256:6ecc3ede913be8014a3f93c34bf6a2e6fbd1f4009f3d39d134b925d609529402
    Ports:           4143/TCP, 4191/TCP
    Host Ports:      0/TCP, 0/TCP
    SeccompProfile:  RuntimeDefault
    State:           Terminated
      Reason:        Error
      Exit Code:     137
      Started:       Tue, 01 Oct 2024 16:25:20 -0500
      Finished:      Tue, 01 Oct 2024 16:28:48 -0500
    Ready:           False
    Restart Count:   0
    Limits:
      memory:  200Mi
    Requests:
      memory:   10Mi
    Liveness:   http-get http://:4191/live delay=10s timeout=1s period=10s #success=1 #failure=3
    Readiness:  http-get http://:4191/ready delay=2s timeout=1s period=10s #success=1 #failure=3
    Startup:    http-get http://:4191/ready delay=0s timeout=1s period=1s #success=1 #failure=120
    Environment:
      _pod_name:                                                 tct-suite-api-8d6p4 (v1:metadata.name)
      _pod_ns:                                                   implentio (v1:metadata.namespace)
      _pod_nodeName:                                              (v1:spec.nodeName)
      LINKERD2_PROXY_LOG:                                        warn,linkerd=warn,linkerd2_proxy=warn
      LINKERD2_PROXY_LOG_FORMAT:                                 json
      LINKERD2_PROXY_DESTINATION_SVC_ADDR:                       linkerd-dst-headless.linkerd.svc.cluster.local.:8086
      LINKERD2_PROXY_DESTINATION_PROFILE_NETWORKS:               10.0.0.0/8,100.64.0.0/10,172.16.0.0/12,192.168.0.0/16,fd00::/8
      LINKERD2_PROXY_POLICY_SVC_ADDR:                            linkerd-policy.linkerd.svc.cluster.local.:8090
      LINKERD2_PROXY_POLICY_WORKLOAD:                            {"ns":"$(_pod_ns)", "pod":"$(_pod_name)"}

      LINKERD2_PROXY_INBOUND_DEFAULT_POLICY:                     all-unauthenticated
      LINKERD2_PROXY_POLICY_CLUSTER_NETWORKS:                    10.0.0.0/8,100.64.0.0/10,172.16.0.0/12,192.168.0.0/16,fd00::/8
      LINKERD2_PROXY_CONTROL_STREAM_INITIAL_TIMEOUT:             3s
      LINKERD2_PROXY_CONTROL_STREAM_IDLE_TIMEOUT:                5m
      LINKERD2_PROXY_CONTROL_STREAM_LIFETIME:                    1h
      LINKERD2_PROXY_INBOUND_CONNECT_TIMEOUT:                    100ms
      LINKERD2_PROXY_OUTBOUND_CONNECT_TIMEOUT:                   1000ms
      LINKERD2_PROXY_OUTBOUND_DISCOVERY_IDLE_TIMEOUT:            5s
      LINKERD2_PROXY_INBOUND_DISCOVERY_IDLE_TIMEOUT:             90s
      LINKERD2_PROXY_CONTROL_LISTEN_ADDR:                        [::]:4190
      LINKERD2_PROXY_ADMIN_LISTEN_ADDR:                          [::]:4191
      LINKERD2_PROXY_OUTBOUND_LISTEN_ADDR:                       127.0.0.1:4140
      LINKERD2_PROXY_OUTBOUND_LISTEN_ADDRS:                      127.0.0.1:4140,[::1]:4140
      LINKERD2_PROXY_INBOUND_LISTEN_ADDR:                        [::]:4143
      LINKERD2_PROXY_INBOUND_IPS:                                 (v1:status.podIPs)
      LINKERD2_PROXY_INBOUND_PORTS:                              
      LINKERD2_PROXY_DESTINATION_PROFILE_SUFFIXES:               svc.cluster.local.
      LINKERD2_PROXY_INBOUND_ACCEPT_KEEPALIVE:                   10000ms
      LINKERD2_PROXY_OUTBOUND_CONNECT_KEEPALIVE:                 10000ms
      LINKERD2_PROXY_INBOUND_SERVER_HTTP2_KEEP_ALIVE_INTERVAL:   10s
      LINKERD2_PROXY_INBOUND_SERVER_HTTP2_KEEP_ALIVE_TIMEOUT:    3s
      LINKERD2_PROXY_OUTBOUND_SERVER_HTTP2_KEEP_ALIVE_INTERVAL:  10s
      LINKERD2_PROXY_OUTBOUND_SERVER_HTTP2_KEEP_ALIVE_TIMEOUT:   3s
      LINKERD2_PROXY_INBOUND_PORTS_DISABLE_PROTOCOL_DETECTION:   25,587,3306,4444,5432,6379,9300,11211
      LINKERD2_PROXY_DESTINATION_CONTEXT:                        {"ns":"$(_pod_ns)", "nodeName":"$(_pod_nodeName)", "pod":"$(_pod_name)"}

      _pod_sa:                                                    (v1:spec.serviceAccountName)
      _l5d_ns:                                                   linkerd
      _l5d_trustdomain:                                          cluster.local
      LINKERD2_PROXY_IDENTITY_DIR:                               /var/run/linkerd/identity/end-entity
      LINKERD2_PROXY_IDENTITY_TRUST_ANCHORS:                     -----BEGIN CERTIFICATE-----
                                                                 MIIC2TCCAn+gAwIBAgIUJ/mP5KMLYTB3q4fS7ZODFDhtC/4wCgYIKoZIzj0EAwIw
                                                                 ZTESMBAGA1UEChMJcGFuZmFjdHVtMRQwEgYDVQQLEwtlbmdpbmVlcmluZzE5MDcG
                                                                 A1UEAxMwaHR0cDovL3ZhdWx0LWFjdGl2ZS52YXVsdC5zdmMuY2x1c3Rlci5sb2Nh
                                                                 bDo4MjAwMB4XDTI0MDYyMDIwMzYwMFoXDTM0MDYxODIwMzYzMFowZTESMBAGA1UE
                                                                 ChMJcGFuZmFjdHVtMRQwEgYDVQQLEwtlbmdpbmVlcmluZzE5MDcGA1UEAxMwaHR0
                                                                 cDovL3ZhdWx0LWFjdGl2ZS52YXVsdC5zdmMuY2x1c3Rlci5sb2NhbDo4MjAwMFkw
                                                                 EwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE/aK5bxPHsw1RpK2PYtNSXp2E4eSIetEE
                                                                 Ad4hgXiGeWnwW2UF1FsHA3mbO1n8oCXh5JeA28BrB0XqYHCepwA+cqOCAQswggEH
                                                                 MA4GA1UdDwEB/wQEAwIBBjAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBR2dGag
                                                                 Sinb//5+toDC27lqur4C2jAfBgNVHSMEGDAWgBR2dGagSinb//5+toDC27lqur4C
                                                                 2jBWBggrBgEFBQcBAQRKMEgwRgYIKwYBBQUHMAKGOmh0dHA6Ly92YXVsdC1hY3Rp
                                                                 dmUudmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWw6ODIwMC92MS9wa2kvY2EwTAYDVR0f
                                                                 BEUwQzBBoD+gPYY7aHR0cDovL3ZhdWx0LWFjdGl2ZS52YXVsdC5zdmMuY2x1c3Rl
                                                                 ci5sb2NhbDo4MjAwL3YxL3BraS9jcmwwCgYIKoZIzj0EAwIDSAAwRQIgapJxAslc
                                                                 qgsVWb4k1dDzDzaiT5XPTAOqO9iFawc1kCcCIQDETlt99S2A/UaP6H7SSBulPMRX
                                                                 UEXOrR9AWkahiJeUQw==
                                                                 -----END CERTIFICATE-----

      LINKERD2_PROXY_IDENTITY_TOKEN_FILE:                        /var/run/secrets/tokens/linkerd-identity-token
      LINKERD2_PROXY_IDENTITY_SVC_ADDR:                          linkerd-identity-headless.linkerd.svc.cluster.local.:8080
      LINKERD2_PROXY_IDENTITY_LOCAL_NAME:                        $(_pod_sa).$(_pod_ns).serviceaccount.identity.linkerd.cluster.local
      LINKERD2_PROXY_IDENTITY_SVC_NAME:                          linkerd-identity.linkerd.serviceaccount.identity.linkerd.cluster.local
      LINKERD2_PROXY_DESTINATION_SVC_NAME:                       linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local
      LINKERD2_PROXY_POLICY_SVC_NAME:                            linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local
      AWS_STS_REGIONAL_ENDPOINTS:                                regional
      AWS_DEFAULT_REGION:                                        us-west-2
      AWS_REGION:                                                us-west-2
      AWS_ROLE_ARN:                                              arn:aws:iam::590183845935:role/tct-suite-api-b6a7d322e01b51ec-20240920213059807400000003
      AWS_WEB_IDENTITY_TOKEN_FILE:                               /var/run/secrets/eks.amazonaws.com/serviceaccount/token
    Mounts:
      /var/run/linkerd/identity/end-entity from linkerd-identity-end-entity (rw)
      /var/run/secrets/eks.amazonaws.com/serviceaccount from aws-iam-token (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hs877 (ro)
      /var/run/secrets/tokens from linkerd-identity-token (rw)
  init:
    Container ID:  containerd://6f6373a057b2bf2b7fbf9e0e0c91ae6422fd190d4730248bdf00b22ca5781c47
    Image:         590183845935.dkr.ecr.us-west-2.amazonaws.com/quay/argoproj/argoexec:v3.5.5
    Image ID:      590183845935.dkr.ecr.us-west-2.amazonaws.com/quay/argoproj/argoexec@sha256:32a568bd1ecb2691a61aa4a646d90b08fe5c4606a2d5cbf264565b1ced98f12b
    Port:          <none>
    Host Port:     <none>
    Command:
      argoexec
      init
      --loglevel
      info
      --log-format
      json
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 01 Oct 2024 16:25:23 -0500
      Finished:     Tue, 01 Oct 2024 16:25:24 -0500
    Ready:          True
    Restart Count:  0
    Limits:
      memory:  70Mi
    Requests:
      cpu:     10m
      memory:  50Mi
    Environment:
      ARGO_POD_NAME:                      tct-suite-api-8d6p4 (v1:metadata.name)
      ARGO_POD_UID:                        (v1:metadata.uid)
      GODEBUG:                            x509ignoreCN=0
      ARGO_WORKFLOW_NAME:                 tct-suite-api-8d6p4
      ARGO_WORKFLOW_UID:                  d05ab2bc-845f-4d95-9833-19c600cada35
      ARGO_CONTAINER_NAME:                init
      ARGO_TEMPLATE:                      {"name":"entry","inputs":{"parameters":[{"name":"image_version","default":"b818506f1be43f4b0879476b15ab01430ed82081","value":"b818506f1be43f4b0879476b15ab01430ed82081","description":""},{"name":"brand_slug","default":"tct","value":"tct","enum":["tct"],"description":""},{"name":"s3_bucket","default":"snowflake-stage-590183845935","value":"snowflake-stage-590183845935","description":""},{"name":"start_date","default":"","value":"","description":"Format: YYYY-MM-DD"},{"name":"end_date","default":"","value":"","description":"Format: YYYY-MM-DD"}]},"outputs":{},"affinity":{"nodeAffinity":{}},"metadata":{},"container":{"name":"","image":"730335560480.dkr.ecr.us-west-2.amazonaws.com/suite-api:b818506f1be43f4b0879476b15ab01430ed82081","command":["python","/app/netsuiteData.py","tct","snowflake-stage-590183845935","-s ","-e "],"env":[{"name":"POD_IP","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"status.podIP"}}},{"name":"POD_NAME","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"metadata.name"}}},{"name":"POD_NAMESPACE","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"metadata.namespace"}}},{"name":"NAMESPACE","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"metadata.namespace"}}},{"name":"POD_SERVICE_ACCOUNT","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"spec.serviceAccountName"}}},{"name":"NODE_NAME","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"spec.nodeName"}}},{"name":"NODE_IP","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"status.hostIP"}}},{"name":"CONTAINER_CPU_REQUEST","valueFrom":{"resourceFieldRef":{"resource":"requests.cpu","divisor":"0"}}},{"name":"CONTAINER_MEMORY_REQUEST","valueFrom":{"resourceFieldRef":{"resource":"requests.memory","divisor":"0"}}},{"name":"CONTAINER_MEMORY_LIMIT","valueFrom":{"resourceFieldRef":{"resource":"limits.memory","divisor":"0"}}},{"name":"CONTAINER_EPHEMERAL_STORAGE_REQUEST","valueFrom":{"resourceFieldRef":{"resource":"requests.ephe
meral-storage","divisor":"0"}}},{"name":"CONTAINER_EPHEMERAL_STORAGE_LIMIT","valueFrom":{"resourceFieldRef":{"resource":"limits.ephemeral-storage","divisor":"0"}}},{"name":"AWS_ACCOUNT_ID","value":"590183845935"},{"name":"AWS_REGION","value":"us-west-2"},{"name":"BASE_URL","value":"https://6466422.restlets.api.netsuite.com/app/site/hosting/restlet.nl?script=1423\u0026deploy=1"},{"name":"DB_HOST","value":"pg-88cc-pooler-rw.implentio"},{"name":"DB_NAME","value":"app"},{"name":"DB_PORT","value":"5432"},{"name":"IMPLENTIO_API_BASE_URL","value":"https://api.prod.implentio.net"},{"name":"REALM","value":"6466422"},{"name":"SNOWFLAKE_ACCOUNT","value":"izb31483.prod3.us-west-2.aws"},{"name":"SNOWFLAKE_DATABASE","value":"IMPLENTIO_RAW"},{"name":"SNOWFLAKE_SCHEMA","value":"PUBLIC"},{"name":"SNOWFLAKE_WAREHOUSE","value":"WH_APPLICATION_XSM"},{"name":"CONSUMER_KEY","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"CONSUMER_KEY","optional":false}}},{"name":"CONSUMER_SECRET","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"CONSUMER_SECRET","optional":false}}},{"name":"DB_PASSWORD","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"DB_PASSWORD","optional":false}}},{"name":"DB_USER","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"DB_USER","optional":false}}},{"name":"IMPLENTIO_API_KEY","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"IMPLENTIO_API_KEY","optional":false}}},{"name":"OAUTH_TOKEN","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"OAUTH_TOKEN","optional":false}}},{"name":"OAUTH_TOKEN_SECRET","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"OAUTH_TOKEN_SECRET","optional":false}}},{"name":"SNOWFLAKE_PASSWORD","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"SNOWFLAKE_PASSWORD","optional":false}}},{"name":"SNOWFLAKE_USER","valueFrom":{"secretKeyRef":{"name":"tct-su
ite-api-b6a7d322e01b51ec","key":"SNOWFLAKE_USER","optional":false}}}],"resources":{"limits":{"memory":"10Gi"},"requests":{"cpu":"250m","memory":"300Mi"}},"volumeMounts":[{"name":"podinfo","mountPath":"/etc/podinfo"}],"securityContext":{"capabilities":{"drop":["ALL"]},"privileged":false,"runAsUser":1000,"runAsGroup":1000,"runAsNonRoot":true,"readOnlyRootFilesystem":true,"allowPrivilegeEscalation":false}},"archiveLocation":{"archiveLogs":true,"s3":{"endpoint":"s3.amazonaws.com","bucket":"argo-5f172fe57e0cb410","region":"us-west-2","key":"tct-suite-api-8d6p4/tct-suite-api-8d6p4"}},"tolerations":[{"key":"spot","operator":"Equal","value":"true","effect":"NoSchedule"},{"key":"burstable","operator":"Equal","value":"true","effect":"NoSchedule"},{"key":"arm64","operator":"Equal","value":"true","effect":"NoSchedule"}],"schedulerName":"panfactum","serviceAccountName":"tct-suite-api-b6a7d322e01b51ec"}
      ARGO_NODE_ID:                       tct-suite-api-8d6p4
      ARGO_INCLUDE_SCRIPT_OUTPUT:         false
      ARGO_DEADLINE:                      0001-01-01T00:00:00Z
      ARGO_PROGRESS_FILE:                 /var/run/argo/progress
      ARGO_PROGRESS_PATCH_TICK_DURATION:  1m0s
      ARGO_PROGRESS_FILE_TICK_DURATION:   3s
      AWS_STS_REGIONAL_ENDPOINTS:         regional
      AWS_DEFAULT_REGION:                 us-west-2
      AWS_REGION:                         us-west-2
      AWS_ROLE_ARN:                       arn:aws:iam::590183845935:role/tct-suite-api-b6a7d322e01b51ec-20240920213059807400000003
      AWS_WEB_IDENTITY_TOKEN_FILE:        /var/run/secrets/eks.amazonaws.com/serviceaccount/token
    Mounts:
      /var/run/argo from var-run-argo (rw)
      /var/run/secrets/eks.amazonaws.com/serviceaccount from aws-iam-token (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hs877 (ro)
Containers:
  wait:
    Container ID:  containerd://534d7e8a765e55c31806a6728677da48c2977afa94d60a68d4df54c00a517d42
    Image:         590183845935.dkr.ecr.us-west-2.amazonaws.com/quay/argoproj/argoexec:v3.5.5
    Image ID:      590183845935.dkr.ecr.us-west-2.amazonaws.com/quay/argoproj/argoexec@sha256:32a568bd1ecb2691a61aa4a646d90b08fe5c4606a2d5cbf264565b1ced98f12b
    Port:          <none>
    Host Port:     <none>
    Command:
      argoexec
      wait
      --loglevel
      info
      --log-format
      json
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 01 Oct 2024 16:25:25 -0500
      Finished:     Tue, 01 Oct 2024 16:28:17 -0500
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  70Mi
    Requests:
      cpu:     10m
      memory:  50Mi
    Environment:
      ARGO_POD_NAME:                      tct-suite-api-8d6p4 (v1:metadata.name)
      ARGO_POD_UID:                        (v1:metadata.uid)
      GODEBUG:                            x509ignoreCN=0
      ARGO_WORKFLOW_NAME:                 tct-suite-api-8d6p4
      ARGO_WORKFLOW_UID:                  d05ab2bc-845f-4d95-9833-19c600cada35
      ARGO_CONTAINER_NAME:                wait
      ARGO_TEMPLATE:                      {"name":"entry","inputs":{"parameters":[{"name":"image_version","default":"b818506f1be43f4b0879476b15ab01430ed82081","value":"b818506f1be43f4b0879476b15ab01430ed82081","description":""},{"name":"brand_slug","default":"tct","value":"tct","enum":["tct"],"description":""},{"name":"s3_bucket","default":"snowflake-stage-590183845935","value":"snowflake-stage-590183845935","description":""},{"name":"start_date","default":"","value":"","description":"Format: YYYY-MM-DD"},{"name":"end_date","default":"","value":"","description":"Format: YYYY-MM-DD"}]},"outputs":{},"affinity":{"nodeAffinity":{}},"metadata":{},"container":{"name":"","image":"730335560480.dkr.ecr.us-west-2.amazonaws.com/suite-api:b818506f1be43f4b0879476b15ab01430ed82081","command":["python","/app/netsuiteData.py","tct","snowflake-stage-590183845935","-s ","-e "],"env":[{"name":"POD_IP","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"status.podIP"}}},{"name":"POD_NAME","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"metadata.name"}}},{"name":"POD_NAMESPACE","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"metadata.namespace"}}},{"name":"NAMESPACE","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"metadata.namespace"}}},{"name":"POD_SERVICE_ACCOUNT","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"spec.serviceAccountName"}}},{"name":"NODE_NAME","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"spec.nodeName"}}},{"name":"NODE_IP","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"status.hostIP"}}},{"name":"CONTAINER_CPU_REQUEST","valueFrom":{"resourceFieldRef":{"resource":"requests.cpu","divisor":"0"}}},{"name":"CONTAINER_MEMORY_REQUEST","valueFrom":{"resourceFieldRef":{"resource":"requests.memory","divisor":"0"}}},{"name":"CONTAINER_MEMORY_LIMIT","valueFrom":{"resourceFieldRef":{"resource":"limits.memory","divisor":"0"}}},{"name":"CONTAINER_EPHEMERAL_STORAGE_REQUEST","valueFrom":{"resourceFieldRef":{"resource":"requests.ephe
meral-storage","divisor":"0"}}},{"name":"CONTAINER_EPHEMERAL_STORAGE_LIMIT","valueFrom":{"resourceFieldRef":{"resource":"limits.ephemeral-storage","divisor":"0"}}},{"name":"AWS_ACCOUNT_ID","value":"590183845935"},{"name":"AWS_REGION","value":"us-west-2"},{"name":"BASE_URL","value":"https://6466422.restlets.api.netsuite.com/app/site/hosting/restlet.nl?script=1423\u0026deploy=1"},{"name":"DB_HOST","value":"pg-88cc-pooler-rw.implentio"},{"name":"DB_NAME","value":"app"},{"name":"DB_PORT","value":"5432"},{"name":"IMPLENTIO_API_BASE_URL","value":"https://api.prod.implentio.net"},{"name":"REALM","value":"6466422"},{"name":"SNOWFLAKE_ACCOUNT","value":"izb31483.prod3.us-west-2.aws"},{"name":"SNOWFLAKE_DATABASE","value":"IMPLENTIO_RAW"},{"name":"SNOWFLAKE_SCHEMA","value":"PUBLIC"},{"name":"SNOWFLAKE_WAREHOUSE","value":"WH_APPLICATION_XSM"},{"name":"CONSUMER_KEY","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"CONSUMER_KEY","optional":false}}},{"name":"CONSUMER_SECRET","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"CONSUMER_SECRET","optional":false}}},{"name":"DB_PASSWORD","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"DB_PASSWORD","optional":false}}},{"name":"DB_USER","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"DB_USER","optional":false}}},{"name":"IMPLENTIO_API_KEY","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"IMPLENTIO_API_KEY","optional":false}}},{"name":"OAUTH_TOKEN","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"OAUTH_TOKEN","optional":false}}},{"name":"OAUTH_TOKEN_SECRET","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"OAUTH_TOKEN_SECRET","optional":false}}},{"name":"SNOWFLAKE_PASSWORD","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"SNOWFLAKE_PASSWORD","optional":false}}},{"name":"SNOWFLAKE_USER","valueFrom":{"secretKeyRef":{"name":"tct-su
ite-api-b6a7d322e01b51ec","key":"SNOWFLAKE_USER","optional":false}}}],"resources":{"limits":{"memory":"10Gi"},"requests":{"cpu":"250m","memory":"300Mi"}},"volumeMounts":[{"name":"podinfo","mountPath":"/etc/podinfo"}],"securityContext":{"capabilities":{"drop":["ALL"]},"privileged":false,"runAsUser":1000,"runAsGroup":1000,"runAsNonRoot":true,"readOnlyRootFilesystem":true,"allowPrivilegeEscalation":false}},"archiveLocation":{"archiveLogs":true,"s3":{"endpoint":"s3.amazonaws.com","bucket":"argo-5f172fe57e0cb410","region":"us-west-2","key":"tct-suite-api-8d6p4/tct-suite-api-8d6p4"}},"tolerations":[{"key":"spot","operator":"Equal","value":"true","effect":"NoSchedule"},{"key":"burstable","operator":"Equal","value":"true","effect":"NoSchedule"},{"key":"arm64","operator":"Equal","value":"true","effect":"NoSchedule"}],"schedulerName":"panfactum","serviceAccountName":"tct-suite-api-b6a7d322e01b51ec"}
      ARGO_NODE_ID:                       tct-suite-api-8d6p4
      ARGO_INCLUDE_SCRIPT_OUTPUT:         false
      ARGO_DEADLINE:                      0001-01-01T00:00:00Z
      ARGO_PROGRESS_FILE:                 /var/run/argo/progress
      ARGO_PROGRESS_PATCH_TICK_DURATION:  1m0s
      ARGO_PROGRESS_FILE_TICK_DURATION:   3s
      AWS_STS_REGIONAL_ENDPOINTS:         regional
      AWS_DEFAULT_REGION:                 us-west-2
      AWS_REGION:                         us-west-2
      AWS_ROLE_ARN:                       arn:aws:iam::590183845935:role/tct-suite-api-b6a7d322e01b51ec-20240920213059807400000003
      AWS_WEB_IDENTITY_TOKEN_FILE:        /var/run/secrets/eks.amazonaws.com/serviceaccount/token
    Mounts:
      /mainctrfs/etc/podinfo from podinfo (rw)
      /tmp from tmp-dir-argo (rw,path="0")
      /var/run/argo from var-run-argo (rw)
      /var/run/secrets/eks.amazonaws.com/serviceaccount from aws-iam-token (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hs877 (ro)
  main:
    Container ID:  containerd://b6696c492a483cd5178e47e7f472bd0bef567567b0ccbd465a3591795326b8f6
    Image:         730335560480.dkr.ecr.us-west-2.amazonaws.com/suite-api:b818506f1be43f4b0879476b15ab01430ed82081
    Image ID:      730335560480.dkr.ecr.us-west-2.amazonaws.com/suite-api@sha256:bc3d9c2957b930ac62080a466c02c3665e6d9e5951d8cb16b6b10556a15e8121
    Port:          <none>
    Host Port:     <none>
    Command:
      /var/run/argo/argoexec
      emissary
      --loglevel
      info
      --log-format
      json
      --
      python
      /app/netsuiteData.py
      tct
      snowflake-stage-590183845935
      -s 
      -e 
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 01 Oct 2024 16:25:25 -0500
      Finished:     Tue, 01 Oct 2024 16:28:16 -0500
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  10Gi
    Requests:
      cpu:     250m
      memory:  300Mi
    Environment:
      POD_IP:                                (v1:status.podIP)
      POD_NAME:                             tct-suite-api-8d6p4 (v1:metadata.name)
      POD_NAMESPACE:                        implentio (v1:metadata.namespace)
      NAMESPACE:                            implentio (v1:metadata.namespace)
      POD_SERVICE_ACCOUNT:                   (v1:spec.serviceAccountName)
      NODE_NAME:                             (v1:spec.nodeName)
      NODE_IP:                               (v1:status.hostIP)
      CONTAINER_CPU_REQUEST:                1 (requests.cpu)
      CONTAINER_MEMORY_REQUEST:             314572800 (requests.memory)
      CONTAINER_MEMORY_LIMIT:               10737418240 (limits.memory)
      CONTAINER_EPHEMERAL_STORAGE_REQUEST:  0 (requests.ephemeral-storage)
      CONTAINER_EPHEMERAL_STORAGE_LIMIT:    0 (limits.ephemeral-storage)
      AWS_ACCOUNT_ID:                       590183845935
      AWS_REGION:                           us-west-2
      BASE_URL:                             https://6466422.restlets.api.netsuite.com/app/site/hosting/restlet.nl?script=1423&deploy=1
      DB_HOST:                              pg-88cc-pooler-rw.implentio
      DB_NAME:                              app
      DB_PORT:                              5432
      IMPLENTIO_API_BASE_URL:               https://api.prod.implentio.net
      REALM:                                6466422
      SNOWFLAKE_ACCOUNT:                    izb31483.prod3.us-west-2.aws
      SNOWFLAKE_DATABASE:                   IMPLENTIO_RAW
      SNOWFLAKE_SCHEMA:                     PUBLIC
      SNOWFLAKE_WAREHOUSE:                  WH_APPLICATION_XSM
      CONSUMER_KEY:                         <set to the key 'CONSUMER_KEY' in secret 'tct-suite-api-b6a7d322e01b51ec'>        Optional: false
      CONSUMER_SECRET:                      <set to the key 'CONSUMER_SECRET' in secret 'tct-suite-api-b6a7d322e01b51ec'>     Optional: false
      DB_PASSWORD:                          <set to the key 'DB_PASSWORD' in secret 'tct-suite-api-b6a7d322e01b51ec'>         Optional: false
      DB_USER:                              <set to the key 'DB_USER' in secret 'tct-suite-api-b6a7d322e01b51ec'>             Optional: false
      IMPLENTIO_API_KEY:                    <set to the key 'IMPLENTIO_API_KEY' in secret 'tct-suite-api-b6a7d322e01b51ec'>   Optional: false
      OAUTH_TOKEN:                          <set to the key 'OAUTH_TOKEN' in secret 'tct-suite-api-b6a7d322e01b51ec'>         Optional: false
      OAUTH_TOKEN_SECRET:                   <set to the key 'OAUTH_TOKEN_SECRET' in secret 'tct-suite-api-b6a7d322e01b51ec'>  Optional: false
      SNOWFLAKE_PASSWORD:                   <set to the key 'SNOWFLAKE_PASSWORD' in secret 'tct-suite-api-b6a7d322e01b51ec'>  Optional: false
      SNOWFLAKE_USER:                       <set to the key 'SNOWFLAKE_USER' in secret 'tct-suite-api-b6a7d322e01b51ec'>      Optional: false
      ARGO_CONTAINER_NAME:                  main
      ARGO_TEMPLATE:                        {"name":"entry","inputs":{"parameters":[{"name":"image_version","default":"b818506f1be43f4b0879476b15ab01430ed82081","value":"b818506f1be43f4b0879476b15ab01430ed82081","description":""},{"name":"brand_slug","default":"tct","value":"tct","enum":["tct"],"description":""},{"name":"s3_bucket","default":"snowflake-stage-590183845935","value":"snowflake-stage-590183845935","description":""},{"name":"start_date","default":"","value":"","description":"Format: YYYY-MM-DD"},{"name":"end_date","default":"","value":"","description":"Format: YYYY-MM-DD"}]},"outputs":{},"affinity":{"nodeAffinity":{}},"metadata":{},"container":{"name":"","image":"730335560480.dkr.ecr.us-west-2.amazonaws.com/suite-api:b818506f1be43f4b0879476b15ab01430ed82081","command":["python","/app/netsuiteData.py","tct","snowflake-stage-590183845935","-s ","-e "],"env":[{"name":"POD_IP","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"status.podIP"}}},{"name":"POD_NAME","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"metadata.name"}}},{"name":"POD_NAMESPACE","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"metadata.namespace"}}},{"name":"NAMESPACE","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"metadata.namespace"}}},{"name":"POD_SERVICE_ACCOUNT","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"spec.serviceAccountName"}}},{"name":"NODE_NAME","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"spec.nodeName"}}},{"name":"NODE_IP","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"status.hostIP"}}},{"name":"CONTAINER_CPU_REQUEST","valueFrom":{"resourceFieldRef":{"resource":"requests.cpu","divisor":"0"}}},{"name":"CONTAINER_MEMORY_REQUEST","valueFrom":{"resourceFieldRef":{"resource":"requests.memory","divisor":"0"}}},{"name":"CONTAINER_MEMORY_LIMIT","valueFrom":{"resourceFieldRef":{"resource":"limits.memory","divisor":"0"}}},{"name":"CONTAINER_EPHEMERAL_STORAGE_REQUEST","valueFrom":{"resourceFieldRef":{"resource":"requests.ep
hemeral-storage","divisor":"0"}}},{"name":"CONTAINER_EPHEMERAL_STORAGE_LIMIT","valueFrom":{"resourceFieldRef":{"resource":"limits.ephemeral-storage","divisor":"0"}}},{"name":"AWS_ACCOUNT_ID","value":"590183845935"},{"name":"AWS_REGION","value":"us-west-2"},{"name":"BASE_URL","value":"https://6466422.restlets.api.netsuite.com/app/site/hosting/restlet.nl?script=1423\u0026deploy=1"},{"name":"DB_HOST","value":"pg-88cc-pooler-rw.implentio"},{"name":"DB_NAME","value":"app"},{"name":"DB_PORT","value":"5432"},{"name":"IMPLENTIO_API_BASE_URL","value":"https://api.prod.implentio.net"},{"name":"REALM","value":"6466422"},{"name":"SNOWFLAKE_ACCOUNT","value":"izb31483.prod3.us-west-2.aws"},{"name":"SNOWFLAKE_DATABASE","value":"IMPLENTIO_RAW"},{"name":"SNOWFLAKE_SCHEMA","value":"PUBLIC"},{"name":"SNOWFLAKE_WAREHOUSE","value":"WH_APPLICATION_XSM"},{"name":"CONSUMER_KEY","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"CONSUMER_KEY","optional":false}}},{"name":"CONSUMER_SECRET","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"CONSUMER_SECRET","optional":false}}},{"name":"DB_PASSWORD","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"DB_PASSWORD","optional":false}}},{"name":"DB_USER","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"DB_USER","optional":false}}},{"name":"IMPLENTIO_API_KEY","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"IMPLENTIO_API_KEY","optional":false}}},{"name":"OAUTH_TOKEN","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"OAUTH_TOKEN","optional":false}}},{"name":"OAUTH_TOKEN_SECRET","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"OAUTH_TOKEN_SECRET","optional":false}}},{"name":"SNOWFLAKE_PASSWORD","valueFrom":{"secretKeyRef":{"name":"tct-suite-api-b6a7d322e01b51ec","key":"SNOWFLAKE_PASSWORD","optional":false}}},{"name":"SNOWFLAKE_USER","valueFrom":{"secretKeyRef":{"name":"tct-
suite-api-b6a7d322e01b51ec","key":"SNOWFLAKE_USER","optional":false}}}],"resources":{"limits":{"memory":"10Gi"},"requests":{"cpu":"250m","memory":"300Mi"}},"volumeMounts":[{"name":"podinfo","mountPath":"/etc/podinfo"}],"securityContext":{"capabilities":{"drop":["ALL"]},"privileged":false,"runAsUser":1000,"runAsGroup":1000,"runAsNonRoot":true,"readOnlyRootFilesystem":true,"allowPrivilegeEscalation":false}},"archiveLocation":{"archiveLogs":true,"s3":{"endpoint":"s3.amazonaws.com","bucket":"argo-5f172fe57e0cb410","region":"us-west-2","key":"tct-suite-api-8d6p4/tct-suite-api-8d6p4"}},"tolerations":[{"key":"spot","operator":"Equal","value":"true","effect":"NoSchedule"},{"key":"burstable","operator":"Equal","value":"true","effect":"NoSchedule"},{"key":"arm64","operator":"Equal","value":"true","effect":"NoSchedule"}],"schedulerName":"panfactum","serviceAccountName":"tct-suite-api-b6a7d322e01b51ec"}
      ARGO_NODE_ID:                         tct-suite-api-8d6p4
      ARGO_INCLUDE_SCRIPT_OUTPUT:           false
      ARGO_DEADLINE:                        0001-01-01T00:00:00Z
      ARGO_PROGRESS_FILE:                   /var/run/argo/progress
      ARGO_PROGRESS_PATCH_TICK_DURATION:    1m0s
      ARGO_PROGRESS_FILE_TICK_DURATION:     3s
      AWS_STS_REGIONAL_ENDPOINTS:           regional
      AWS_ROLE_ARN:                         arn:aws:iam::590183845935:role/tct-suite-api-b6a7d322e01b51ec-20240920213059807400000003
      AWS_WEB_IDENTITY_TOKEN_FILE:          /var/run/secrets/eks.amazonaws.com/serviceaccount/token
    Mounts:
      /etc/podinfo from podinfo (rw)
      /var/run/argo from var-run-argo (rw)
      /var/run/secrets/eks.amazonaws.com/serviceaccount from aws-iam-token (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hs877 (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   False 
  Initialized                 True 
  Ready                       False 
  ContainersReady             False 
  PodScheduled                True 
Volumes:
  aws-iam-token:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  86400
  var-run-argo:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  tmp-dir-argo:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  podinfo:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.labels -> labels
      metadata.annotations -> annotations
  kube-api-access-hs877:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
  linkerd-proxy-init-xtables-lock:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  linkerd-identity-end-entity:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  <unset>
  linkerd-identity-token:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  86400
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 arm64=true:NoSchedule
                             burstable=true:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
                             spot=true:NoSchedule
Events:
  Type     Reason     Age                 From       Message
  ----     ------     ----                ----       -------
  Normal   Scheduled  4m52s               panfactum  Successfully assigned implentio/tct-suite-api-8d6p4 to ip-10-0-190-175.us-west-2.compute.internal
  Normal   Pulled     4m52s               kubelet    Container image "590183845935.dkr.ecr.us-west-2.amazonaws.com/github/linkerd/proxy-init:v2.4.0" already present on machine
  Normal   Created    4m52s               kubelet    Created container linkerd-init
  Normal   Started    4m52s               kubelet    Started container linkerd-init
  Normal   Pulled     4m52s               kubelet    Container image "590183845935.dkr.ecr.us-west-2.amazonaws.com/github/linkerd/proxy:edge-24.5.1" already present on machine
  Normal   Created    4m52s               kubelet    Created container linkerd-proxy
  Normal   Started    4m52s               kubelet    Started container linkerd-proxy
  Normal   Pulling    4m50s               kubelet    Pulling image "590183845935.dkr.ecr.us-west-2.amazonaws.com/quay/argoproj/argoexec:v3.5.5"
  Normal   Pulled     4m49s               kubelet    Successfully pulled image "590183845935.dkr.ecr.us-west-2.amazonaws.com/quay/argoproj/argoexec:v3.5.5" in 986ms (986ms including waiting)
  Normal   Created    4m49s               kubelet    Created container init
  Normal   Started    4m49s               kubelet    Started container init
  Normal   Pulling    4m48s               kubelet    Pulling image "590183845935.dkr.ecr.us-west-2.amazonaws.com/quay/argoproj/argoexec:v3.5.5"
  Normal   Pulled     4m47s               kubelet    Successfully pulled image "590183845935.dkr.ecr.us-west-2.amazonaws.com/quay/argoproj/argoexec:v3.5.5" in 560ms (560ms including waiting)
  Normal   Created    4m47s               kubelet    Created container wait
  Normal   Started    4m47s               kubelet    Started container wait
  Normal   Pulled     4m47s               kubelet    Container image "730335560480.dkr.ecr.us-west-2.amazonaws.com/suite-api:b818506f1be43f4b0879476b15ab01430ed82081" already present on machine
  Normal   Created    4m47s               kubelet    Created container main
  Normal   Started    4m47s               kubelet    Started container main
  Normal   Killing    114s                kubelet    Stopping container linkerd-proxy
  Warning  Unhealthy  93s (x3 over 113s)  kubelet    Liveness probe failed: Get "http://10.0.176.148:4191/live": dial tcp 10.0.176.148:4191: connect: connection refused
  Warning  Unhealthy  93s (x3 over 113s)  kubelet    Readiness probe failed: Get "http://10.0.176.148:4191/ready": dial tcp 10.0.176.148:4191: connect: connection refused
wesbragagt commented 1 month ago

Short example of logs from the workflow:

│ main 2024-10-01 21:31:37,946 - [INFO] client.py:69:run_query() - Request took 0:00:09.436582         │
│ main 2024-10-01 21:31:37,950 - [INFO] client.py:83:run_query() - Fewer results than page_size, stopp │
│ main 2024-10-01 21:31:37,950 - [INFO] client.py:94:run_query() - Fetched 1204 records. Total so far: │
│ main 2024-10-01 21:31:38,560 - [INFO] netsuiteData.py:307:main() - Uploaded 66204 records to s3://sn │
│ main 2024-10-01 21:31:38,560 - [INFO] netsuiteData.py:309:main() - All queries completed successfull │
│ main 2024-10-01 21:31:38,626 - [INFO] netsuiteData.py:320:main() - Published message to SNS topic IN │
│ main 2024-10-01 21:31:38,640 - [INFO] netsuiteData.py:328:<module>() - Total time taken: 0:02:44.688 │
│ main {"argo":true,"error":null,"level":"info","msg":"sub-process exited","time":"2024-10-01T21:31:39 │
wesbragagt commented 1 month ago

@fullykubed I added the annotation below to the workflow metadata, and the cronjob then passed.

annotations:
    config.linkerd.io/shutdown-grace-period: 30s

Manifest:

name: entry
inputs:
  parameters:
    - name: image_version
      default: '{{workflow.parameters.image_version}}'
      description: ''
    - name: brand_slug
      default: '{{workflow.parameters.brand_slug}}'
      enum:
        - tct
      description: ''
    - name: s3_bucket
      default: '{{workflow.parameters.s3_bucket}}'
      description: ''
    - name: start_date
      default: '{{workflow.parameters.start_date}}'
      description: 'Format: YYYY-MM-DD'
    - name: end_date
      default: '{{workflow.parameters.end_date}}'
      description: 'Format: YYYY-MM-DD'
outputs: {}
affinity:
  nodeAffinity: {}
metadata:
  annotations:
    config.linkerd.io/shutdown-grace-period: 30s
container:
  name: ''
  image: >-
    730335560480.dkr.ecr.us-west-2.amazonaws.com/suite-api:{{inputs.parameters.image_version}}
  command:
    - python
    - /app/netsuiteData.py
    - '{{inputs.parameters.brand_slug}}'
    - '{{inputs.parameters.s3_bucket}}'
    - '-s {{inputs.parameters.start_date}}'
    - '-e {{inputs.parameters.end_date}}'
  env:
    - name: POD_IP
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: status.podIP
    - name: POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: POD_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    - name: NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    - name: POD_SERVICE_ACCOUNT
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.serviceAccountName
    - name: NODE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName
    - name: NODE_IP
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: status.hostIP
    - name: CONTAINER_CPU_REQUEST
      valueFrom:
        resourceFieldRef:
          resource: requests.cpu
          divisor: '0'
    - name: CONTAINER_MEMORY_REQUEST
      valueFrom:
        resourceFieldRef:
          resource: requests.memory
          divisor: '0'
    - name: CONTAINER_MEMORY_LIMIT
      valueFrom:
        resourceFieldRef:
          resource: limits.memory
          divisor: '0'
    - name: CONTAINER_EPHEMERAL_STORAGE_REQUEST
      valueFrom:
        resourceFieldRef:
          resource: requests.ephemeral-storage
          divisor: '0'
    - name: CONTAINER_EPHEMERAL_STORAGE_LIMIT
      valueFrom:
        resourceFieldRef:
          resource: limits.ephemeral-storage
          divisor: '0'
    - name: AWS_ACCOUNT_ID
      value: '590183845935'
    - name: AWS_REGION
      value: us-west-2
    - name: BASE_URL
      value: >-
        https://6466422.restlets.api.netsuite.com/app/site/hosting/restlet.nl?script=1423&deploy=1
    - name: DB_HOST
      value: pg-88cc-pooler-rw.implentio
    - name: DB_NAME
      value: app
    - name: DB_PORT
      value: '5432'
    - name: IMPLENTIO_API_BASE_URL
      value: https://api.prod.implentio.net
    - name: REALM
      value: '6466422'
    - name: SNOWFLAKE_ACCOUNT
      value: izb31483.prod3.us-west-2.aws
    - name: SNOWFLAKE_DATABASE
      value: IMPLENTIO_RAW
    - name: SNOWFLAKE_SCHEMA
      value: PUBLIC
    - name: SNOWFLAKE_WAREHOUSE
      value: WH_APPLICATION_XSM
    - name: CONSUMER_KEY
      valueFrom:
        secretKeyRef:
          name: tct-suite-api-b6a7d322e01b51ec
          key: CONSUMER_KEY
          optional: false
    - name: CONSUMER_SECRET
      valueFrom:
        secretKeyRef:
          name: tct-suite-api-b6a7d322e01b51ec
          key: CONSUMER_SECRET
          optional: false
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: tct-suite-api-b6a7d322e01b51ec
          key: DB_PASSWORD
          optional: false
    - name: DB_USER
      valueFrom:
        secretKeyRef:
          name: tct-suite-api-b6a7d322e01b51ec
          key: DB_USER
          optional: false
    - name: IMPLENTIO_API_KEY
      valueFrom:
        secretKeyRef:
          name: tct-suite-api-b6a7d322e01b51ec
          key: IMPLENTIO_API_KEY
          optional: false
    - name: OAUTH_TOKEN
      valueFrom:
        secretKeyRef:
          name: tct-suite-api-b6a7d322e01b51ec
          key: OAUTH_TOKEN
          optional: false
    - name: OAUTH_TOKEN_SECRET
      valueFrom:
        secretKeyRef:
          name: tct-suite-api-b6a7d322e01b51ec
          key: OAUTH_TOKEN_SECRET
          optional: false
    - name: SNOWFLAKE_PASSWORD
      valueFrom:
        secretKeyRef:
          name: tct-suite-api-b6a7d322e01b51ec
          key: SNOWFLAKE_PASSWORD
          optional: false
    - name: SNOWFLAKE_USER
      valueFrom:
        secretKeyRef:
          name: tct-suite-api-b6a7d322e01b51ec
          key: SNOWFLAKE_USER
          optional: false
  resources:
    limits:
      memory: 10Gi
    requests:
      cpu: 250m
      memory: 300Mi
  volumeMounts:
    - name: podinfo
      mountPath: /etc/podinfo
  securityContext:
    capabilities:
      drop:
        - ALL
    privileged: false
    runAsUser: 1000
    runAsGroup: 1000
    runAsNonRoot: true
    readOnlyRootFilesystem: true
    allowPrivilegeEscalation: false
tolerations:
  - key: spot
    operator: Equal
    value: 'true'
    effect: NoSchedule
  - key: burstable
    operator: Equal
    value: 'true'
    effect: NoSchedule
  - key: arm64
    operator: Equal
    value: 'true'
    effect: NoSchedule
schedulerName: panfactum
serviceAccountName: tct-suite-api-b6a7d322e01b51ec
fullykubed commented 1 month ago

@wesbragagt For more information on managing Linkerd in Panfactum, you should reference these docs.

However, Linkerd is a red herring in this case.

The real problem here is that your main container has a memory limit that is >10x the memory request. The memory limit should never be >1.3x the request, and ideally they should be the same.

Because of this, the pod is allowed to exhaust the node's memory, which breaks the underlying node and causes erratic behavior. This manifests as pods crashing randomly, pods failing to clean up when they terminate, and containers exiting with code 137. Fix this and the problems will go away.
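
As a minimal sketch of that recommendation, the `resources` block from the manifest above could set the memory limit equal to the request. The 512Mi value is an assumed placeholder, not a measured figure; it should be sized from the job's observed peak usage.

```yaml
resources:
  limits:
    memory: 512Mi    # equal to the request (assumed value, not measured)
  requests:
    cpu: 250m        # unchanged from the manifest above
    memory: 512Mi    # the scheduler reserves this much on the node
```

Pods are scheduled by their requests, while limits only cap usage at runtime, so keeping the two equal means this container cannot use more memory than was reserved for it on the node.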

wesbragagt commented 1 month ago

@fullykubed Thank you so much. That was indeed the issue.

wesbragagt commented 1 month ago

@fullykubed I'm still noticing the same issue after modifying the memory request:

name: entry
inputs:
  parameters:
    - name: image_version
      default: '{{workflow.parameters.image_version}}'
      description: ''
    - name: brand_slug
      default: '{{workflow.parameters.brand_slug}}'
      enum:
        - tct
      description: ''
    - name: s3_bucket
      default: '{{workflow.parameters.s3_bucket}}'
      description: ''
    - name: start_date
      default: '{{workflow.parameters.start_date}}'
      description: 'Format: YYYY-MM-DD'
    - name: end_date
      default: '{{workflow.parameters.end_date}}'
      description: 'Format: YYYY-MM-DD'
outputs: {}
affinity:
  nodeAffinity: {}
metadata: {}
container:
  name: ''
  image: >-
    730335560480.dkr.ecr.us-west-2.amazonaws.com/suite-api:{{inputs.parameters.image_version}}
  command:
    - python
    - /app/netsuiteData.py
    - '{{inputs.parameters.brand_slug}}'
    - '{{inputs.parameters.s3_bucket}}'
    - '-s {{inputs.parameters.start_date}}'
    - '-e {{inputs.parameters.end_date}}'
  env:
    - name: POD_IP
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: status.podIP
    - name: POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: POD_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    - name: NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    - name: POD_SERVICE_ACCOUNT
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.serviceAccountName
    - name: NODE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName
    - name: NODE_IP
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: status.hostIP
    - name: CONTAINER_CPU_REQUEST
      valueFrom:
        resourceFieldRef:
          resource: requests.cpu
          divisor: '0'
    - name: CONTAINER_MEMORY_REQUEST
      valueFrom:
        resourceFieldRef:
          resource: requests.memory
          divisor: '0'
    - name: CONTAINER_MEMORY_LIMIT
      valueFrom:
        resourceFieldRef:
          resource: limits.memory
          divisor: '0'
    - name: CONTAINER_EPHEMERAL_STORAGE_REQUEST
      valueFrom:
        resourceFieldRef:
          resource: requests.ephemeral-storage
          divisor: '0'
    - name: CONTAINER_EPHEMERAL_STORAGE_LIMIT
      valueFrom:
        resourceFieldRef:
          resource: limits.ephemeral-storage
          divisor: '0'
    - name: AWS_ACCOUNT_ID
      value: '590183845935'
    - name: AWS_REGION
      value: us-west-2
    - name: BASE_URL
      value: >-
        https://6466422.restlets.api.netsuite.com/app/site/hosting/restlet.nl?script=1423&deploy=1
    - name: DB_HOST
      value: pg-88cc-pooler-rw.implentio
    - name: DB_NAME
      value: app
    - name: DB_PORT
      value: '5432'
    - name: IMPLENTIO_API_BASE_URL
      value: https://api.prod.implentio.net
    - name: REALM
      value: '6466422'
    - name: SNOWFLAKE_ACCOUNT
      value: izb31483.prod3.us-west-2.aws
    - name: SNOWFLAKE_DATABASE
      value: IMPLENTIO_RAW
    - name: SNOWFLAKE_SCHEMA
      value: PUBLIC
    - name: SNOWFLAKE_WAREHOUSE
      value: WH_APPLICATION_XSM
    - name: CONSUMER_KEY
      valueFrom:
        secretKeyRef:
          name: tct-suite-api-b6a7d322e01b51ec
          key: CONSUMER_KEY
          optional: false
    - name: CONSUMER_SECRET
      valueFrom:
        secretKeyRef:
          name: tct-suite-api-b6a7d322e01b51ec
          key: CONSUMER_SECRET
          optional: false
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: tct-suite-api-b6a7d322e01b51ec
          key: DB_PASSWORD
          optional: false
    - name: DB_USER
      valueFrom:
        secretKeyRef:
          name: tct-suite-api-b6a7d322e01b51ec
          key: DB_USER
          optional: false
    - name: IMPLENTIO_API_KEY
      valueFrom:
        secretKeyRef:
          name: tct-suite-api-b6a7d322e01b51ec
          key: IMPLENTIO_API_KEY
          optional: false
    - name: OAUTH_TOKEN
      valueFrom:
        secretKeyRef:
          name: tct-suite-api-b6a7d322e01b51ec
          key: OAUTH_TOKEN
          optional: false
    - name: OAUTH_TOKEN_SECRET
      valueFrom:
        secretKeyRef:
          name: tct-suite-api-b6a7d322e01b51ec
          key: OAUTH_TOKEN_SECRET
          optional: false
    - name: SNOWFLAKE_PASSWORD
      valueFrom:
        secretKeyRef:
          name: tct-suite-api-b6a7d322e01b51ec
          key: SNOWFLAKE_PASSWORD
          optional: false
    - name: SNOWFLAKE_USER
      valueFrom:
        secretKeyRef:
          name: tct-suite-api-b6a7d322e01b51ec
          key: SNOWFLAKE_USER
          optional: false
  resources:
    limits:
      memory: 666Mi
    requests:
      cpu: 250m
      memory: 512Mi
  volumeMounts:
    - name: podinfo
      mountPath: /etc/podinfo
  securityContext:
    capabilities:
      drop:
        - ALL
    privileged: false
    runAsUser: 1000
    runAsGroup: 1000
    runAsNonRoot: true
    readOnlyRootFilesystem: true
    allowPrivilegeEscalation: false
tolerations:
  - key: spot
    operator: Equal
    value: 'true'
    effect: NoSchedule
  - key: burstable
    operator: Equal
    value: 'true'
    effect: NoSchedule
  - key: arm64
    operator: Equal
    value: 'true'
    effect: NoSchedule
schedulerName: panfactum
serviceAccountName: tct-suite-api-b6a7d322e01b51ec
fullykubed commented 1 month ago
  1. Please keep memory limits = requests unless you know what the implications are.
  2. When debugging, we need to see the full YAML manifests of the related resources in order to provide any useful insight.
wesbragagt commented 1 month ago

Turns out I was passing the annotation to workflow_annotations instead of extra_pod_annotations. If you want to pass this annotation to fix https://github.com/linkerd/linkerd2/issues/8033 in your Argo workflows, pass it as:

extra_pod_annotations = {
  "config.linkerd.io/shutdown-grace-period": "30s"
}
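
A rough sketch of why the target matters, expressed as a raw Argo Workflow rather than Panfactum module inputs. It assumes that `workflow_annotations` end up on the Workflow object and `extra_pod_annotations` end up on the pods it creates; that mapping is an assumption and should be checked against the module source. The workflow name and image are hypothetical.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: example-            # hypothetical name
  # Annotations placed here stay on the Workflow object itself.
  # Linkerd reads its proxy settings from pod annotations, so a
  # shutdown-grace-period set only at this level would not reach the proxy.
spec:
  entrypoint: entry
  podMetadata:
    annotations:
      # Applied to every pod the workflow creates.
      config.linkerd.io/shutdown-grace-period: 30s
  templates:
    - name: entry
      container:
        image: example/image:tag    # hypothetical image
```

A template-level `metadata.annotations` block, as in the first manifest posted above, is the per-template equivalent of `spec.podMetadata`.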