istio / istio

Issues while migrating to proxy-protocol pod annotation #49865

Closed: himanshujaindev closed this issue 7 months ago

himanshujaindev commented 8 months ago

Is this the right place to submit this?

[X] This is not a security vulnerability or a crashing bug
[X] This is not a question about how to use Istio

Bug Description

To avoid downtime, we need to move to the pod annotation first and then delete the EnvoyFilter behind the scenes.

When both are applied at the same time, the following error appears in the ingressgateway pod logs (with debug logging enabled):

{"level":"debug","time":"2024-03-12T11:49:14.458353Z","scope":"envoy filter","msg":"failed to read proxy protocol (exceed max v1 header len)","caller":"external/envoy/source/extensions/filters/listener/proxy_protocol/proxy_protocol.cc:519","thread":22}

Reference: https://istio.io/latest/docs/ops/common-problems/upgrade-issues/#use-gateway-topology-to-enable-proxy-protocol-on-the-ingress-gateways
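
For readers unfamiliar with the pod annotation mentioned above, the following is a minimal sketch of how it is typically set on the gateway Deployment's pod template, based on the upgrade note linked above. It is a fragment rather than a complete manifest, and where it sits in your own Deployment is an assumption about your setup.

```yaml
# Fragment of the ingress gateway Deployment (not a complete manifest).
# Annotations on the pod template apply only to newly created pods.
spec:
  template:
    metadata:
      annotations:
        # Enables PROXY protocol through gateway topology instead of an EnvoyFilter.
        proxy.istio.io/config: '{"gatewayTopology": {"proxyProtocol": {}}}'
```

If an EnvoyFilter that inserts the proxy protocol listener filter still selects the same pod, the filter is configured twice, which matches the error shown above.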

Version

$ istioctl version
client version: 1.20.3
control plane version: 1.20.3
data plane version: 1.20.3 (10 proxies)

Additional Information

No response

howardjohn commented 8 months ago

This seems expected: you are configuring the filter twice, and it should be configured only once. You can use a label selector on the EnvoyFilter to move over gracefully, so that only one config is active at a time.

himanshujaindev commented 8 months ago

@howardjohn - We notice 1-2 seconds of downtime during this transition. How can we avoid that? The downtime occurs when we change the EnvoyFilter's label selector to point at a dummy app and apply the pod annotation at the same time.

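apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: proxy-protocol
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      # Points at a label that no gateway pod carries, so the filter matches nothing.
      istio: ingressgateway-unused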

howardjohn commented 8 months ago

Have two deployments, one with the annotation and one with the label. Don't change running pods.

himanshujaindev commented 8 months ago

@howardjohn - If we want to keep using the EnvoyFilter after version 1.20 as well, will that be an issue?

howardjohn commented 8 months ago

Why would you want to use the EnvoyFilter?

Either way, you need exactly one config per pod.

himanshujaindev commented 8 months ago

Because of the downtime seen when transitioning from the EnvoyFilter to the pod annotation, we cannot move to the pod annotation.

We have also tried a canary deployment, and we see downtime with that method too: https://github.com/istio/istio/issues/46052

howardjohn commented 8 months ago

Can you post the steps you followed? It does not need to be complex.

1. Label all the gateway pods (by changing the deployment) with a label such as use-envoyfilter-proxy-protocol: true.
2. Once that rolls out, change the EnvoyFilter to select use-envoyfilter-proxy-protocol: true. This will be zero downtime.
3. Update the deployment, removing the label and adding the annotation.
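
The three steps above can be sketched roughly as the following YAML fragments. This is an illustrative sketch, not a tested configuration: the `istio: ingressgateway` pod label is assumed from the default gateway install, the Deployment fragments show only the fields that change, and the EnvoyFilter's existing `configPatches` are left out because they were not posted in this thread.

```yaml
# Step 1: Deployment fragment. Add a marker label to the pod template
# and let the rollout finish. Label values must be strings, hence the quotes.
spec:
  template:
    metadata:
      labels:
        istio: ingressgateway                    # assumed existing label
        use-envoyfilter-proxy-protocol: "true"
---
# Step 2: point the EnvoyFilter at the marker label. Every running pod
# already carries it, so the set of selected pods does not change.
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: proxy-protocol
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      use-envoyfilter-proxy-protocol: "true"
  # configPatches: keep your existing patches unchanged
---
# Step 3: Deployment fragment. Remove the marker label and add the
# annotation in the same change, which triggers a rolling update.
spec:
  template:
    metadata:
      labels:
        istio: ingressgateway                    # assumed existing label
      annotations:
        proxy.istio.io/config: '{"gatewayTopology": {"proxyProtocol": {}}}'
```

During the step 3 rollout, old pods still carry the label and get the filter from the EnvoyFilter, while new pods lack the label and get it from the annotation, so each pod has exactly one proxy protocol config at any time. Once no pods carry the label, the EnvoyFilter no longer matches anything and can be deleted.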

hzxuzhonghu commented 8 months ago

https://github.com/istio/istio/issues/49764

himanshujaindev commented 7 months ago

@howardjohn - The steps mentioned do not cause downtime. Thank you.