envoyproxy / gateway

Manages Envoy Proxy as a Standalone or Kubernetes-based Application Gateway
https://gateway.envoyproxy.io
Apache License 2.0

gateway pod suddenly died. #3511

Open gecube opened 1 month ago

gecube commented 1 month ago

Good day!

I was working through the quickstart and did not do anything unusual, but the gateway pod suddenly crashed with the following output:

2024-05-31T16:47:15.757Z    INFO    admin   admin/server.go:37  starting admin server   {"address": "127.0.0.1:19000", "enablePprof": false}
2024-05-31T16:47:15.757Z    INFO    metrics metrics/register.go:165 initialized metrics pull endpoint   {"address": "0.0.0.0:19001", "endpoint": "/metrics"}
2024-05-31T16:47:15.758Z    INFO    metrics metrics/register.go:54  starting metrics server {"address": "0.0.0.0:19001"}
2024-05-31T16:47:15.758Z    INFO    provider    runner/runner.go:41 Using provider  {"runner": "provider", "type": "Kubernetes"}
2024-05-31T16:47:15.759Z    INFO    provider    kubernetes/controller.go:104    created gatewayapi controller   {"runner": "provider"}
I0531 16:47:16.827924       1 request.go:697] Waited for 1.037175908s due to client-side throttling, not priority and fairness, request: GET:https://10.84.0.1:443/apis/flowcontrol.apiserver.k8s.io/v1beta2?timeout=32s
2024-05-31T16:47:17.831Z    INFO    provider    kubernetes/controller.go:1154   ServiceImport CRD not found, skipping ServiceImport watch   {"runner": "provider"}
2024-05-31T16:47:17.844Z    INFO    provider    kubernetes/controller.go:1429   Watching gatewayAPI related objects {"runner": "provider"}
2024-05-31T16:47:17.844Z    INFO    gateway-api runner/runner.go:49 started {"runner": "gateway-api"}
2024-05-31T16:47:17.844Z    INFO    xds-translator  runner/runner.go:47 started {"runner": "xds-translator"}
2024-05-31T16:47:17.845Z    INFO    xds-server  runner/runner.go:193    loaded TLS certificate and key  {"runner": "xds-server"}
2024-05-31T16:47:17.845Z    INFO    xds-server  runner/runner.go:93 started {"runner": "xds-server"}
2024-05-31T16:47:17.945Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "&{%!s(<-chan struct {}=0xc0008be420) %!s(*v1.GatewayClass=&{{ } {      0 {{0 0 <nil>}} <nil> <nil> map[] map[] [] [] []} { <nil> <nil>} {[] []}}) %!s(*handler.enqueueRequestsFromMapFunc[sigs.k8s.io/controller-runtime/pkg/client.Object]=&{0x20c6d40})}"}
2024-05-31T16:47:17.945Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1.GatewayClass"}
2024-05-31T16:47:17.945Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1alpha1.EnvoyProxy"}
2024-05-31T16:47:17.945Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1.Gateway"}
2024-05-31T16:47:17.945Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1.HTTPRoute"}
2024-05-31T16:47:17.945Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1.GRPCRoute"}
I0531 16:47:17.945468       1 leaderelection.go:250] attempting to acquire leader lease envoy-gateway-system/5b9825d2.gateway.envoyproxy.io...
2024-05-31T16:47:17.945Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1alpha2.TLSRoute"}
2024-05-31T16:47:17.945Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1alpha2.UDPRoute"}
2024-05-31T16:47:17.945Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1alpha2.TCPRoute"}
2024-05-31T16:47:17.945Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1.Service"}
2024-05-31T16:47:17.945Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1.EndpointSlice"}
2024-05-31T16:47:17.945Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1.Node"}
2024-05-31T16:47:17.945Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1.Secret"}
2024-05-31T16:47:17.945Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1.ConfigMap"}
2024-05-31T16:47:17.945Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1beta1.ReferenceGrant"}
2024-05-31T16:47:17.945Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1.Deployment"}
2024-05-31T16:47:17.946Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1alpha1.ClientTrafficPolicy"}
2024-05-31T16:47:17.946Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1alpha1.BackendTrafficPolicy"}
2024-05-31T16:47:17.946Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1alpha1.SecurityPolicy"}
2024-05-31T16:47:17.946Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1alpha3.BackendTLSPolicy"}
2024-05-31T16:47:17.946Z    INFO    provider    controller/controller.go:173    Starting EventSource    {"runner": "provider", "controller": "gatewayapi", "source": "kind source: *v1alpha1.EnvoyExtensionPolicy"}
2024-05-31T16:47:17.946Z    INFO    provider    controller/controller.go:181    Starting Controller {"runner": "provider", "controller": "gatewayapi"}
2024-05-31T16:47:18.057Z    INFO    provider    kubernetes/predicates.go:39 gatewayclass has matching controller name, processing   {"runner": "provider", "name": "eg"}
2024-05-31T16:47:18.463Z    INFO    provider    controller/controller.go:215    Starting workers    {"runner": "provider", "controller": "gatewayapi", "worker count": 1}
2024-05-31T16:47:18.464Z    INFO    provider    kubernetes/controller.go:158    reconciling gateways    {"runner": "provider"}
2024-05-31T16:47:18.464Z    INFO    provider    kubernetes/controller.go:772    processing Gateway  {"runner": "provider", "namespace": "envoy-gateway-system", "name": "eg"}
2024-05-31T16:47:18.464Z    INFO    provider    kubernetes/controller.go:598    processing Secret   {"runner": "provider", "namespace": "envoy-gateway-system", "name": "eg-https"}
2024-05-31T16:47:18.464Z    INFO    provider    kubernetes/routes.go:268    processing HTTPRoute    {"runner": "provider", "namespace": "envoy-gateway-system", "name": "backend"}
2024-05-31T16:47:18.565Z    INFO    provider    kubernetes/controller.go:598    processing Secret   {"runner": "provider", "namespace": "envoy-gateway-system", "name": "my-app-client-secret"}
2024-05-31T16:47:18.565Z    INFO    provider    kubernetes/controller.go:545    processing OIDC HMAC Secret {"runner": "provider", "namespace": "envoy-gateway-system", "name": "envoy-oidc-hmac"}
2024-05-31T16:47:18.565Z    INFO    provider    kubernetes/controller.go:358    processing Backend  {"runner": "provider", "kind": "Service", "namespace": "envoy-gateway-system", "name": "backend"}
2024-05-31T16:47:18.565Z    INFO    provider    kubernetes/controller.go:372    added Service to resource tree  {"runner": "provider", "namespace": "envoy-gateway-system", "name": "backend"}
2024-05-31T16:47:18.565Z    INFO    provider    kubernetes/controller.go:406    added EndpointSlice to resource tree    {"runner": "provider", "namespace": "envoy-gateway-system", "name": "backend-g77bj"}
2024-05-31T16:47:18.666Z    INFO    provider    kubernetes/controller.go:298    reconciled gateways successfully    {"runner": "provider"}
2024-05-31T16:47:18.666Z    INFO    gateway-api runner/runner.go:56 received an update  {"runner": "gateway-api"}
2024-05-31T16:47:34.925Z    INFO    v3/simple.go:571    open delta watch ID:1 for type.googleapis.com/envoy.config.cluster.v3.Cluster Resources:map[] from nodeID: "envoy-envoy-gateway-system-eg-5391c79d-79fdb5d7f-wzlxh"
2024-05-31T16:47:34.925Z    INFO    v3/simple.go:571    open delta watch ID:2 for type.googleapis.com/envoy.config.listener.v3.Listener Resources:map[] from nodeID: "envoy-envoy-gateway-system-eg-5391c79d-79fdb5d7f-wzlxh"
I0531 16:47:35.220972       1 leaderelection.go:260] successfully acquired lease envoy-gateway-system/5b9825d2.gateway.envoyproxy.io
2024-05-31T16:47:35.221Z    INFO    provider    kubernetes/status_updater.go:129    started status update handler   {"runner": "provider"}
2024-05-31T16:47:35.221Z    INFO    infrastructure  runner/runner.go:54 started {"runner": "infrastructure"}
2024-05-31T16:47:35.221Z    INFO    provider    kubernetes/controller.go:158    reconciling gateways    {"runner": "provider"}
2024-05-31T16:47:35.221Z    INFO    provider    kubernetes/controller.go:772    processing Gateway  {"runner": "provider", "namespace": "envoy-gateway-system", "name": "eg"}
2024-05-31T16:47:35.221Z    INFO    provider    kubernetes/controller.go:598    processing Secret   {"runner": "provider", "namespace": "envoy-gateway-system", "name": "eg-https"}
2024-05-31T16:47:35.221Z    INFO    provider    kubernetes/routes.go:268    processing HTTPRoute    {"runner": "provider", "namespace": "envoy-gateway-system", "name": "backend"}
2024-05-31T16:47:35.222Z    INFO    provider    kubernetes/controller.go:598    processing Secret   {"runner": "provider", "namespace": "envoy-gateway-system", "name": "my-app-client-secret"}
2024-05-31T16:47:35.222Z    INFO    provider    kubernetes/controller.go:545    processing OIDC HMAC Secret {"runner": "provider", "namespace": "envoy-gateway-system", "name": "envoy-oidc-hmac"}
2024-05-31T16:47:35.222Z    INFO    provider    kubernetes/controller.go:358    processing Backend  {"runner": "provider", "kind": "Service", "namespace": "envoy-gateway-system", "name": "backend"}
2024-05-31T16:47:35.222Z    INFO    provider    kubernetes/controller.go:372    added Service to resource tree  {"runner": "provider", "namespace": "envoy-gateway-system", "name": "backend"}
2024-05-31T16:47:35.222Z    INFO    provider    kubernetes/controller.go:406    added EndpointSlice to resource tree    {"runner": "provider", "namespace": "envoy-gateway-system", "name": "backend-g77bj"}
2024-05-31T16:47:35.222Z    INFO    provider    kubernetes/controller.go:298    reconciled gateways successfully    {"runner": "provider"}
2024-05-31T16:47:35.222Z    INFO    provider    kubernetes/status_updater.go:140    received a status update    {"runner": "provider", "namespace": "", "name": "eg"}
2024-05-31T16:47:35.222Z    INFO    provider.eg kubernetes/status_updater.go:104    status unchanged, bypassing update  {"runner": "provider"}
2024-05-31T16:47:37.450Z    INFO    v3/simple.go:571    open delta watch ID:3 for type.googleapis.com/envoy.config.cluster.v3.Cluster Resources:map[] from nodeID: "envoy-envoy-gateway-system-eg-5391c79d-79fdb5d7f-qv5dr"
2024-05-31T16:47:37.451Z    INFO    v3/simple.go:571    open delta watch ID:4 for type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment Resources:map[httproute/envoy-gateway-system/backend/rule/0:{}] from nodeID: "envoy-envoy-gateway-system-eg-5391c79d-79fdb5d7f-qv5dr"
2024-05-31T16:47:37.452Z    INFO    v3/simple.go:571    open delta watch ID:5 for type.googleapis.com/envoy.config.listener.v3.Listener Resources:map[] from nodeID: "envoy-envoy-gateway-system-eg-5391c79d-79fdb5d7f-qv5dr"
2024-05-31T16:47:37.452Z    INFO    v3/simple.go:571    open delta watch ID:6 for type.googleapis.com/envoy.config.route.v3.RouteConfiguration Resources:map[envoy-gateway-system/eg/http:{} envoy-gateway-system/eg/https:{}] from nodeID: "envoy-envoy-gateway-system-eg-5391c79d-79fdb5d7f-qv5dr"
2024-05-31T16:47:37.453Z    INFO    v3/simple.go:571    open delta watch ID:7 for type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.Secret Resources:map[envoy-gateway-system/eg-https:{}] from nodeID: "envoy-envoy-gateway-system-eg-5391c79d-79fdb5d7f-qv5dr"
2024-05-31T16:47:48.670Z    INFO    gateway-api runner/runner.go:104    proxy:
  listeners:
  - address: null
    name: envoy-gateway-system/eg/http
    ports:
    - containerPort: 10080
      name: http-80
      protocol: HTTP
      servicePort: 80
  - address: null
    name: envoy-gateway-system/eg/https
    ports:
    - containerPort: 10443
      name: https-443
      protocol: HTTPS
      servicePort: 443
  metadata:
    labels:
      gateway.envoyproxy.io/owning-gateway-name: eg
      gateway.envoyproxy.io/owning-gateway-namespace: envoy-gateway-system
  name: envoy-gateway-system/eg
    {"runner": "gateway-api", "infra-ir": "envoy-gateway-system/eg"}
2024-05-31T16:47:48.670Z    INFO    infrastructure  runner/runner.go:78 received an update  {"runner": "infrastructure"}
2024-05-31T16:47:48.672Z    INFO    gateway-api runner/runner.go:115    accessLog:
  text:
  - path: /dev/stdout
http:
- address: 0.0.0.0
  hostnames:
fatal error: concurrent map writes
fatal error: concurrent map writes
  - '*'
  isHTTP2: false
  name: envoy-gateway-system/eg/http
  path:
    escapedSlashesAction: UnescapeAndRedirect
    mergeSlashes: true
  port: 10080
  routes:
  - destination:
      name: httproute/envoy-gateway-system/backend/rule/0
      settings:
      - addressType: IP
        endpoints:
        - host: 10.80.3.41
          port: 3000
        protocol: HTTP
        weight: 1
    hostname: www.example.com
    isHTTP2: false
    name: httproute/envoy-gateway-system/backend/rule/0/match/0/www_example_com
    pathMatch:
      distinct: false
      name: ""
      prefix: /
    security: {}
  - destination:
      name: httproute/envoy-gateway-system/backend/rule/0
      settings:
      - addressType: IP
        endpoints:
        - host: 10.80.3.41
          port: 3000
        protocol: HTTP
        weight: 1
    hostname: k8s-envoygat-envoyenv-ae38800e16-008adbc61a160fe9.elb.eu-west-2.amazonaws.com
    isHTTP2: false
    name: httproute/envoy-gateway-system/backend/rule/0/match/0/k8s-envoygat-envoyenv-ae38800e16-008adbc61a160fe9_elb_eu-west-2_amazonaws_com
    pathMatch:
      distinct: false
      name: ""
      prefix: /
    security: {}
- address: 0.0.0.0
  hostnames:
  - k8s-envoygat-envoyenv-ae38800e16-008adbc61a160fe9.elb.eu-west-2.amazonaws.com
  isHTTP2: false
  name: envoy-gateway-system/eg/https
  path:
    escapedSlashesAction: UnescapeAndRedirect
    mergeSlashes: true
  port: 10443
  routes:
  - destination:
      name: httproute/envoy-gateway-system/backend/rule/0
      settings:
      - addressType: IP
        endpoints:
        - host: 10.80.3.41
          port: 3000
        protocol: HTTP
        weight: 1
    hostname: k8s-envoygat-envoyenv-ae38800e16-008adbc61a160fe9.elb.eu-west-2.amazonaws.com
    isHTTP2: false
    name: httproute/envoy-gateway-system/backend/rule/0/match/0/k8s-envoygat-envoyenv-ae38800e16-008adbc61a160fe9_elb_eu-west-2_amazonaws_com
    pathMatch:
      distinct: false
      name: ""
      prefix: /
    security: {}
  tls:
    certificates:
    - name: envoy-gateway-system/eg-https
      privateKey: W3JlZGFjdGVkXQ==
      serverCertificate: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURvVENDQW9tZ0F3SUJBZ0lRYU5ISnJaMXZ5YkpFNjlDVkNCZ1o2REFOQmdrcWhraUc5dzBCQVFzRkFEQkQKTVJrd0Z3WURWUVFLRXhCV1lYTjVZU0JRZFhCcmFXNGdUSFJrTVNZd0pBWURWUVFEREIwcUxtVnNZaTVsZFMxMwpaWE4wTFRJdVlXMWhlbTl1WVhkekxtTnZiVEFlRncweU5EQTFNekV4TmpNek1qVmFGdzB6TkRBMU1qa3hOak16Ck1qVmFNRU14R1RBWEJnTlZCQW9URUZaaGMzbGhJRkIxY0d0cGJpQk1kR1F4SmpBa0JnTlZCQU1NSFNvdVpXeGkKTG1WMUxYZGxjM1F0TWk1aGJXRjZiMjVoZDNNdVkyOXRNSUlCSWpBTkJna3Foa2lHOXcwQkFRRUZBQU9DQVE4QQpNSUlCQ2dLQ0FRRUE1ejhhbFBtOEk4OHJpY3lNMmMybVdhaWJ6RnFhUmYwaVFZTE1QQ2w5cVh6dDNJSWo4TDMyCnUvL2NZcTFXU1EyMzBGdUpuQW4rdGhWT09zSmtQQXA3Q0pzKzVNM0t0TFV3Q3Z4OG5VU0h4SGxEVEwvbFQzZTIKYlJmRW54K1pPaGpsYjl3SnV3NEFhbkhZUHh1VW11UFFyZFlBdWNtc05LMTRKeks5eXpPMnlNZnhPVzhtbHdvZgphcHFrVjY0OHorbWRobUdwS2hYemxKelNjc0dqUkNRNEhWWGVUZitoMFlVRExCV29IY0MzQ1M0SDFyT3RkUmRXCndjKzMvcVVrZVlVT2YrQXJWN3RMKzFDQWVEck9RTEg3VTd0NkQ5SXlGLzFJZ3hzSmZCemYwbjVPaXlGV0UvRHoKclc2ejJ5aTBXUlUxWTJkdERLc3M1NFdyRzkwcjFZYnZod0lEQVFBQm80R1FNSUdOTUE0R0ExVWREd0VCL3dRRQpBd0lGb0RBVEJnTlZIU1VFRERBS0JnZ3JCZ0VGQlFjREFUQU1CZ05WSFJNQkFmOEVBakFBTUZnR0ExVWRFUVJSCk1FK0NUV3M0Y3kxbGJuWnZlV2RoZEMxbGJuWnZlV1Z1ZGkxaFpUTTRPREF3WlRFMkxUQXdPR0ZrWW1NMk1XRXgKTmpCbVpUa3VaV3hpTG1WMUxYZGxjM1F0TWk1aGJXRjZiMjVoZDNNdVkyOXRNQTBHQ1NxR1NJYjNEUUVCQ3dVQQpBNElCQVFDY3kwTkdwb2RBOHZneGJYY2VOd1N5RFBQalpxdE9yQVpUWG1RMzEwYStaS0huNENuNGJ2YmVybytJCkxKS2thL3VMYlV0Nk1MN0VoeEdhVDBrd05MMHdtWG9nT1crMTJkY0hkN3dWckxaTFVBSG0zNEJoSDlKZU9IZTcKemhzeExBb2wzZkljSVFBTHFPT3ZLdjZXNi9lUzRvWDJsbGdtU1dzcGMwMUc3RVFEMGg1TXZGY2V0N1ZiWHZLNApDbXFMMVVUaURUN0VHK2FWR2Jqa0RkajBwcmVIc1k3NVR1NW1KeVlMNzZickhqaDlZWEgwV29mY0VzR2JaQmV5Ck85bEUrVEx6TWpWQXhvdWtLWjhMZU1sbXlsdGtFS2hxZWxsTmxYUHVMWi9rOFVTclEyYUx3c2xLOE5LWFVKWVAKMjlaaEdvKzJGaTJuUjJFa2djSWlvMjArSXRoOQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
    {"runner": "gateway-api", "xds-ir": "envoy-gateway-system/eg"}
2024-05-31T16:47:48.672Z    INFO    provider    kubernetes/status_updater.go:140    received a status update    {"runner": "provider", "namespace": "envoy-gateway-system", "name": "oidc-example"}

goroutine 369 [running]:
github.com/envoyproxy/gateway/internal/metrics.(*Gauge).With(0xc0007876d0, {0xc0008d9d28?, 0x0?, 0x0?})
    /home/runner/work/gateway/gateway/internal/metrics/otel_metric_gauge.go:50 +0x31b
github.com/envoyproxy/gateway/internal/message.HandleSubscription[...]({{0x28ed35d, 0x2cb4938?}, {0x28df5bc?, 0x1008a6900?}}, 0xc00067c0c0?, 0xc001923f98)
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:73 +0xf7c
github.com/envoyproxy/gateway/internal/xds/translator/runner.(*Runner).subscribeAndTranslate(0xc00040fb00, {0x2cb4938?, 0xc000798730?})
    /home/runner/work/gateway/gateway/internal/xds/translator/runner/runner.go:53 +0x85
created by github.com/envoyproxy/gateway/internal/xds/translator/runner.(*Runner).Start in goroutine 1
    /home/runner/work/gateway/gateway/internal/xds/translator/runner/runner.go:46 +0x255

goroutine 1 [chan receive]:
github.com/envoyproxy/gateway/internal/cmd.setupRunners(0xc000701860)
    /home/runner/work/gateway/gateway/internal/cmd/server.go:203 +0x6ac
github.com/envoyproxy/gateway/internal/cmd.server()
    /home/runner/work/gateway/gateway/internal/cmd/server.go:62 +0x54
github.com/envoyproxy/gateway/internal/cmd.getServerCommand.func1(0xc000794d00?, {0x28dd6f3?, 0x4?, 0x28dd5ff?})
    /home/runner/work/gateway/gateway/internal/cmd/server.go:36 +0xf
github.com/spf13/cobra.(*Command).execute(0xc000301b08, {0xc0002c4ee0, 0x1, 0x1})
    /home/runner/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:983 +0xaca
github.com/spf13/cobra.(*Command).ExecuteC(0xc000301808)
    /home/runner/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1115 +0x3ff
github.com/spf13/cobra.(*Command).Execute(0x0?)
    /home/runner/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1039 +0x13
main.main()
    /home/runner/work/gateway/gateway/cmd/envoy-gateway/main.go:16 +0x18

goroutine 23 [select]:
github.com/telepresenceio/watchable.(*Map[...]).coalesce(0x2cea080, {0x2cb49e0, 0x4589a80}, 0xc0002aa950, 0xc0006d80c0, 0xc0006d8120, 0xc00075a240)
    /home/runner/go/pkg/mod/github.com/telepresenceio/watchable@v0.0.0-20220726211108-9bb86f92afa7/map.go:382 +0x76c
created by github.com/telepresenceio/watchable.(*Map[...]).SubscribeSubset in goroutine 82
    /home/runner/go/pkg/mod/github.com/telepresenceio/watchable@v0.0.0-20220726211108-9bb86f92afa7/map.go:281 +0x14d

goroutine 67 [IO wait]:
internal/poll.runtime_pollWait(0x78336ad81eb0, 0x72)
    /opt/hostedtoolcache/go/1.22.3/x64/src/runtime/netpoll.go:345 +0x85
internal/poll.(*pollDesc).wait(0x3?, 0x10?, 0x0)
    /opt/hostedtoolcache/go/1.22.3/x64/src/internal/poll/fd_poll_runtime.go:84 +0x27
internal/poll.(*pollDesc).waitRead(...)
    /opt/hostedtoolcache/go/1.22.3/x64/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0xc00040e280)
    /opt/hostedtoolcache/go/1.22.3/x64/src/internal/poll/fd_unix.go:611 +0x2ac
net.(*netFD).accept(0xc00040e280)
    /opt/hostedtoolcache/go/1.22.3/x64/src/net/fd_unix.go:172 +0x29
net.(*TCPListener).accept(0xc0002ca9e0)
    /opt/hostedtoolcache/go/1.22.3/x64/src/net/tcpsock_posix.go:159 +0x1e
net.(*TCPListener).Accept(0xc0002ca9e0)
    /opt/hostedtoolcache/go/1.22.3/x64/src/net/tcpsock.go:327 +0x30
net/http.(*Server).Serve(0xc000834000, {0x2caa5d0, 0xc0002ca9e0})
    /opt/hostedtoolcache/go/1.22.3/x64/src/net/http/server.go:3255 +0x33e
net/http.(*Server).ListenAndServe(0xc000834000)
    /opt/hostedtoolcache/go/1.22.3/x64/src/net/http/server.go:3184 +0x71
github.com/envoyproxy/gateway/internal/admin.start.func1()
    /home/runner/work/gateway/gateway/internal/admin/server.go:59 +0x25
created by github.com/envoyproxy/gateway/internal/admin.start in goroutine 1
    /home/runner/work/gateway/gateway/internal/admin/server.go:58 +0x3ad

goroutine 69 [IO wait]:
internal/poll.runtime_pollWait(0x78336ad81db8, 0x72)
    /opt/hostedtoolcache/go/1.22.3/x64/src/runtime/netpoll.go:345 +0x85
internal/poll.(*pollDesc).wait(0x7?, 0x10?, 0x0)
    /opt/hostedtoolcache/go/1.22.3/x64/src/internal/poll/fd_poll_runtime.go:84 +0x27
internal/poll.(*pollDesc).waitRead(...)
    /opt/hostedtoolcache/go/1.22.3/x64/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0xc00040e300)
    /opt/hostedtoolcache/go/1.22.3/x64/src/internal/poll/fd_unix.go:611 +0x2ac
net.(*netFD).accept(0xc00040e300)
    /opt/hostedtoolcache/go/1.22.3/x64/src/net/fd_unix.go:172 +0x29
net.(*TCPListener).accept(0xc0002caa20)
    /opt/hostedtoolcache/go/1.22.3/x64/src/net/tcpsock_posix.go:159 +0x1e
net.(*TCPListener).Accept(0xc0002caa20)
    /opt/hostedtoolcache/go/1.22.3/x64/src/net/tcpsock.go:327 +0x30
net/http.(*Server).Serve(0xc0008340f0, {0x2caa5d0, 0xc0002caa20})
    /opt/hostedtoolcache/go/1.22.3/x64/src/net/http/server.go:3255 +0x33e
net/http.(*Server).ListenAndServe(0xc0008340f0)
    /opt/hostedtoolcache/go/1.22.3/x64/src/net/http/server.go:3184 +0x71
github.com/envoyproxy/gateway/internal/metrics.start.func1()
    /home/runner/work/gateway/gateway/internal/metrics/register.go:70 +0x17
created by github.com/envoyproxy/gateway/internal/metrics.start in goroutine 1
    /home/runner/work/gateway/gateway/internal/metrics/register.go:69 +0x1ba

goroutine 71 [syscall]:
os/signal.signal_recv()
    /opt/hostedtoolcache/go/1.22.3/x64/src/runtime/sigqueue.go:152 +0x29
os/signal.loop()
    /opt/hostedtoolcache/go/1.22.3/x64/src/os/signal/signal_unix.go:23 +0x13
created by os/signal.Notify.func1.1 in goroutine 1
    /opt/hostedtoolcache/go/1.22.3/x64/src/os/signal/signal.go:151 +0x1f

goroutine 60 [chan receive]:
sigs.k8s.io/controller-runtime/pkg/manager/signals.SetupSignalHandler.func1()
    /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.3/pkg/manager/signals/signal.go:38 +0x27
created by sigs.k8s.io/controller-runtime/pkg/manager/signals.SetupSignalHandler in goroutine 1
    /home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.3/pkg/manager/signals/signal.go:37 +0xc5

goroutine 61 [chan receive]:
github.com/envoyproxy/gateway/internal/message.HandleSubscription[...]({{0x28e2c5b, 0x8?}, {0x28ed2ed?, 0xe?}}, 0xc0006d8240?, 0xc0018c5fb0)
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:72 +0xbcd
github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus.func1()
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:30 +0xa8
created by github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus in goroutine 1
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:29 +0x7a

goroutine 62 [running]:
    goroutine running on other thread; stack unavailable
created by github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus in goroutine 1
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:54 +0xd1

goroutine 63 [chan receive]:
github.com/envoyproxy/gateway/internal/message.HandleSubscription[...]({{0x28e2c5b, 0x8?}, {0x28f1705?, 0x10?}}, 0xc0006d8480?, 0xc00076dfc0)
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:72 +0xbcd
github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus.func3()
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:87 +0x86
created by github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus in goroutine 1
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:86 +0x12c

goroutine 64 [chan receive]:
github.com/envoyproxy/gateway/internal/message.HandleSubscription[...]({{0x28e2c5b, 0x8?}, {0x28ef2a2?, 0xf?}}, 0xc0006d85a0?, 0xc00077bfc0)
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:72 +0xbcd
github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus.func4()
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:117 +0x86
created by github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus in goroutine 1
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:116 +0x185

goroutine 81 [chan receive]:
github.com/envoyproxy/gateway/internal/message.HandleSubscription[...]({{0x28e2c5b, 0x8?}, {0x28ef293?, 0xf?}}, 0xc0006d86c0?, 0xc00077dfc0)
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:72 +0xbcd
github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus.func5()
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:149 +0x86
created by github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus in goroutine 1
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:148 +0x1dc

goroutine 82 [chan receive]:
github.com/envoyproxy/gateway/internal/message.HandleSubscription[...]({{0x28e2c5b, 0x8?}, {0x28ef284?, 0xf?}}, 0xc0006d8120?, 0xc00076ffc0)
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:72 +0xbcd
github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus.func6()
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:181 +0x86
created by github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus in goroutine 1
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:180 +0x233

goroutine 83 [chan receive]:
github.com/envoyproxy/gateway/internal/message.HandleSubscription[...]({{0x28e2c5b, 0x8?}, {0x2901975?, 0x17?}}, 0xc0006d8900?, 0xc000779fc0)
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:72 +0xbcd
github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus.func7()
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:213 +0x86
created by github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus in goroutine 1
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:212 +0x28c

goroutine 84 [chan receive]:
github.com/envoyproxy/gateway/internal/message.HandleSubscription[...]({{0x28e2c5b, 0x8?}, {0x2908797?, 0x1a?}}, 0xc0006d8a20?, 0xc000915fc0)
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:72 +0xbcd
github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus.func8()
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:245 +0x86
created by github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus in goroutine 1
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:244 +0x2e5

goroutine 85 [chan receive]:
github.com/envoyproxy/gateway/internal/message.HandleSubscription[...]({{0x28e2c5b, 0x8?}, {0x290aeea?, 0x1b?}}, 0xc0006d87e0?, 0xc000777fc0)
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:72 +0xbcd
github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus.func9()
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:277 +0x86
created by github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus in goroutine 1
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:276 +0x33c

goroutine 86 [chan receive]:
github.com/envoyproxy/gateway/internal/message.HandleSubscription[...]({{0x28e2c5b, 0x8?}, {0x28fd24a?, 0x15?}}, 0xc0006d8b40?, 0xc000da9fc0)
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:72 +0xbcd
github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus.func10()
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:309 +0x86
created by github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus in goroutine 1
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:308 +0x393

goroutine 87 [chan receive]:
github.com/envoyproxy/gateway/internal/message.HandleSubscription[...]({{0x28e2c5b, 0x8?}, {0x290195e?, 0x17?}}, 0xc0006d8c60?, 0xc000911fc0)
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:72 +0xbcd
github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus.func11()
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:341 +0x86
created by github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus in goroutine 1
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:340 +0x3ec

goroutine 88 [chan receive]:
github.com/envoyproxy/gateway/internal/message.HandleSubscription[...]({{0x28e2c5b, 0x8?}, {0x290aecf?, 0x1b?}}, 0xc0006d8d80?, 0xc000913fc0)
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:72 +0xbcd
github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus.func12()
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:371 +0x86
created by github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus in goroutine 1
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:370 +0x445

goroutine 24 [chan receive]:
github.com/envoyproxy/gateway/internal/message.HandleSubscription[...].func1()
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:55 +0x9a
created by github.com/envoyproxy/gateway/internal/message.HandleSubscription[...] in goroutine 82
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:54 +0x125

goroutine 25 [select]:
github.com/telepresenceio/watchable.(*Map[...]).coalesce(0x2ceb340, {0x2cb49e0, 0x4589a80}, 0xc0002aa970, 0xc0006d81e0, 0xc0006d8240, 0xc00075a2d0)
    /home/runner/go/pkg/mod/github.com/telepresenceio/watchable@v0.0.0-20220726211108-9bb86f92afa7/map.go:382 +0x76c
created by github.com/telepresenceio/watchable.(*Map[...]).SubscribeSubset in goroutine 61
    /home/runner/go/pkg/mod/github.com/telepresenceio/watchable@v0.0.0-20220726211108-9bb86f92afa7/map.go:281 +0x14d

goroutine 26 [chan receive]:
github.com/envoyproxy/gateway/internal/message.HandleSubscription[...].func1()
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:55 +0x9a
created by github.com/envoyproxy/gateway/internal/message.HandleSubscription[...] in goroutine 61
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:54 +0x125

goroutine 27 [select]:
github.com/telepresenceio/watchable.(*Map[...]).coalesce(0x2ceaf80, {0x2cb49e0, 0x4589a80}, 0xc0002aa990, 0xc0006d8300, 0xc0006d8360, 0xc00075a360)
    /home/runner/go/pkg/mod/github.com/telepresenceio/watchable@v0.0.0-20220726211108-9bb86f92afa7/map.go:382 +0x76c
created by github.com/telepresenceio/watchable.(*Map[...]).SubscribeSubset in goroutine 62
    /home/runner/go/pkg/mod/github.com/telepresenceio/watchable@v0.0.0-20220726211108-9bb86f92afa7/map.go:281 +0x14d

goroutine 28 [chan receive]:
github.com/envoyproxy/gateway/internal/message.HandleSubscription[...].func1()
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:55 +0x9a
created by github.com/envoyproxy/gateway/internal/message.HandleSubscription[...] in goroutine 62
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:54 +0x125

goroutine 29 [select]:
github.com/telepresenceio/watchable.(*Map[...]).coalesce(0x2ceabc0, {0x2cb49e0, 0x4589a80}, 0xc0002aa9b0, 0xc0006d8420, 0xc0006d8480, 0xc00075a3f0)
    /home/runner/go/pkg/mod/github.com/telepresenceio/watchable@v0.0.0-20220726211108-9bb86f92afa7/map.go:382 +0x76c
created by github.com/telepresenceio/watchable.(*Map[...]).SubscribeSubset in goroutine 63
    /home/runner/go/pkg/mod/github.com/telepresenceio/watchable@v0.0.0-20220726211108-9bb86f92afa7/map.go:281 +0x14d

...
goroutine 62 [running]:
github.com/envoyproxy/gateway/internal/metrics.(*Gauge).With(0xc0007876d0, {0xc00076bd18?, 0x0?, 0x0?})
    /home/runner/work/gateway/gateway/internal/metrics/otel_metric_gauge.go:50 +0x31b
github.com/envoyproxy/gateway/internal/message.HandleSubscription[...]({{0x28e2c5b, 0x8?}, {0x28f1715?, 0x10?}}, 0xc0006d8360?, 0xc001707fc0)
    /home/runner/work/gateway/gateway/internal/message/watchutil.go:73 +0x1033
github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus.func2()
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:55 +0x85
created by github.com/envoyproxy/gateway/internal/provider/kubernetes.(*gatewayAPIReconciler).subscribeAndUpdateStatus in goroutine 1
    /home/runner/work/gateway/gateway/internal/provider/kubernetes/status.go:54 +0xd1

The pods were in the following state:

kubectl get pods -n envoy-gateway-system
NAME                                                     READY   STATUS        RESTARTS      AGE
backend-69fcff487f-zdj8s                                 1/1     Running       0             3m22s
envoy-envoy-gateway-system-eg-5391c79d-79fdb5d7f-qv5dr   1/2     Terminating   0             12m
envoy-envoy-gateway-system-eg-5391c79d-79fdb5d7f-wzlxh   2/2     Running       0             3m22s
envoy-gateway-7ff7cffb6c-w5f7p                           1/1     Running       1 (11s ago)   3m22s

Manually removing the gateway and envoy-gateway helped.

kubectl get pods -n envoy-gateway-system
NAME                                                     READY   STATUS    RESTARTS   AGE
backend-69fcff487f-zdj8s                                 1/1     Running   0          6m16s
envoy-envoy-gateway-system-eg-5391c79d-79fdb5d7f-wzlxh   2/2     Running   0          6m16s
envoy-gateway-7ff7cffb6c-pwsbm                           1/1     Running   0          2m5s

Expected behaviour:

better error handling, and restart of the pod in case of issues.

gecube commented 1 month ago

Terminating again:

$ kubectl get pods -n envoy-gateway-system                                                   
NAME                                                      READY   STATUS        RESTARTS   AGE
backend-69fcff487f-zdj8s                                  1/1     Running       0          9m12s
envoy-envoy-gateway-system-eg-5391c79d-565cc65f6b-h4g2s   2/2     Running       0          22s
envoy-envoy-gateway-system-eg-5391c79d-79fdb5d7f-wzlxh    2/2     Terminating   0          9m12s
envoy-gateway-7ff7cffb6c-pwsbm                            1/1     Running       0          5m1s

gecube commented 1 month ago

$ kubectl describe pod -n  envoy-gateway-system envoy-envoy-gateway-system-eg-5391c79d-79fdb5d7f-wzlxh

Events:
  Type     Reason     Age               From               Message
  ----     ------     ----              ----               -------
  Normal   Scheduled  10m               default-scheduler  Successfully assigned envoy-gateway-system/envoy-envoy-gateway-system-eg-5391c79d-79fdb5d7f-wzlxh to gke-dev2-pool-1-d6247585-5pi9
  Normal   Pulling    10m               kubelet            Pulling image "envoyproxy/envoy:distroless-dev"
  Normal   Pulled     7m31s             kubelet            Successfully pulled image "envoyproxy/envoy:distroless-dev" in 2.099736442s (2m34.66287988s including waiting)
  Normal   Created    7m31s             kubelet            Created container envoy
  Normal   Started    7m31s             kubelet            Started container envoy
  Normal   Pulling    7m31s             kubelet            Pulling image "docker.io/envoyproxy/gateway-dev:latest"
  Normal   Pulled     7m1s              kubelet            Successfully pulled image "docker.io/envoyproxy/gateway-dev:latest" in 248.719122ms (30.285645955s including waiting)
  Normal   Created    7m1s              kubelet            Created container shutdown-manager
  Normal   Started    7m                kubelet            Started container shutdown-manager
  Normal   Killing    75s               kubelet            Stopping container envoy
  Normal   Killing    75s               kubelet            Stopping container shutdown-manager
  Warning  Unhealthy  6s (x7 over 66s)  kubelet            Readiness probe failed: HTTP probe failed with statuscode: 503

gecube commented 1 month ago

may be the same as #3220

gecube commented 1 month ago

target cluster is GKE

$ kubectl version
Client Version: v1.30.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.27.13-gke.1000000
WARNING: version difference between client (1.30) and server (1.27) exceeds the supported minor version skew of +/-1

Envoy is installed from the chart, and the Gateway API manifests are taken from here: https://gateway-api.sigs.k8s.io/guides/#install-experimental-channel

arkodg commented 1 month ago

Is this a GKE cluster with Autopilot enabled? Is GKE also trying to install the Gateway API manifests at the same time? https://cloud.google.com/kubernetes-engine/docs/how-to/deploying-gateways

gecube commented 1 month ago

@arkodg Good day! No, this is GKE without Autopilot. No, GKE is not trying to install any additional controllers. Interestingly, it is now working stably. So it was a transient issue, but it was very... khm... inconvenient and uncomfortable to experience.

arkodg commented 1 month ago

thanks for the update @gecube

sounds like we need to revisit the livenessProbe and readinessProbe defaults https://github.com/envoyproxy/gateway/blob/01dd7b11e7768623e801d82d80dc170a8e0572e3/internal/infrastructure/kubernetes/proxy/testdata/deployments/default.yaml#L255

cc @owenhaynes

owenhaynes commented 1 month ago

Yeah, this looks similar to issue #3220, though I never saw errors in the gateway pod like the ones shown here. In my case, I think it was related to image pulling: the envoy image took longer to pull and start than the shutdown-manager container.

I increased the failureThreshold of all the probes on the shutdown-manager container using a patch to fix this issue.

arkodg commented 1 month ago

thanks @owenhaynes, wanna share your patch here? If many people hit this, maybe we should just commit it or revisit the defaults.

gecube commented 1 month ago

@owenhaynes Hi! I don't think it is related to image pulling, as images were already present on the nodes (?). @arkodg It is relatively difficult to reproduce. I will need to make a brand new environment and check again.

owenhaynes commented 1 month ago

@gecube yeah, in my case it happened when nodes had just been started from a fresh node instance, so no images were pre-pulled.

shutdown manager patch @arkodg

  provider:
    type: "Kubernetes"
    kubernetes:
      envoyDeployment:
        patch:
          type: StrategicMerge
          value:
            spec:
              template:
                spec:
                  containers:
                  - name: shutdown-manager
                    livenessProbe:
                      failureThreshold: 10
                    readinessProbe:
                      failureThreshold: 10

arkodg commented 2 weeks ago

Moving this to v1.1. Let's relax the probe values a bit more, as I'm hearing many more users complain about GKE installs being flaky.