istio / istio

Connect, secure, control, and observe services.
https://istio.io
Apache License 2.0

proxy has trouble discovering gRPC endpoints after server pods restart #49391

Closed: owais closed this issue 5 months ago

owais commented 8 months ago


Bug Description

We have two services that communicate over gRPC. The server is deployed as a statefulset. The services are able to communicate with each other fine but they lose connectivity as soon as the server pod(s) are restarted.

My very naive guess is that when the port appProtocol is set to grpc, the client's Istio proxy fails to drop the local connection between itself and the app container even when the remote server pod is deleted or becomes unhealthy. This probably results in the connection never resolving DNS again and continuing to connect to the old IP.

The gRPC client sees the issue as a timeout instead of a dropped connection, with errors like:

io.grpc.StatusRuntimeException: UNAVAILABLE: upstream request timeout

We are running istio 1.18.7.

Version

➜ istioctl version
client version: 1.18.5
control plane version: 1.18.7
data plane version: 1.18.5 (55 proxies), 1.18.7 (389 proxies)

The services in question are running 1.18.7

Reproducible Example

https://github.com/owais/istio-grpc-headless-test
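For context, the setup is roughly of the following shape. This is only a minimal sketch: the names and fields below are illustrative, and the authoritative manifests are in the repo above.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: istio-grpc-test-server
  namespace: istio-test
spec:
  clusterIP: None            # headless service in front of the StatefulSet
  selector:
    app: istio-grpc-test-server
  ports:
  - name: grpc
    port: 9090
    appProtocol: grpc        # marking the port as gRPC is what triggers the behavior discussed below
```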

owais commented 8 months ago

I was trying to create a ServiceEntry for the server to see if it impacts anything, and while it did not fix the issue, deploying the ServiceEntry did seemingly trigger something that resulted in the proxy picking up the new pod IPs, although this worked only once. I tried deleting and re-creating the ServiceEntry and it did not trigger the "sync" again, so it was probably a coincidence.

owais commented 8 months ago

Another interesting finding. I installed the 1.20.3 control plane to test the latest version, onboarded the pods to the new control plane, and ran a quick test; it worked as expected. Unhealthy/old endpoints got removed from the proxy, and once new pods came up, new endpoints were added.

The gRPC client also threw errors that made much more sense while the pods were down:

Failed to resolve name. status=Status{code=UNAVAILABLE, description=Unable to resolve host server-0.headless-server.ns.svc.cluster.local, cause=java.lang.RuntimeException: java.net.UnknownHostException: server-0.headless-server.ns.svc.cluster.local: Name does not resolve

instead of the timeout errors it used to throw.

However, subsequent tests failed again and reverted to the same problem as on 1.18.7.

So the problem persists on both 1.18.7 and 1.20.3, but Istio occasionally does the right thing, seemingly at random.

hzxuzhonghu commented 8 months ago
  • As new server pods come up, the client proxy does NOT discover the new pod IPs as endpoints.
  • As a result it keeps trying to connect to old IPs which obviously doesn't work.

How do you see it still connecting to old IPs? Did you make use of the Istio local DNS proxy?

hzxuzhonghu commented 8 months ago

I find that the NDS generator relies on pushContext. In this case (EndpointSlice changes) we only trigger a non-full push, so we do not update the pushContext, and NDS still returns the old, stale pod IPs.

hzxuzhonghu commented 8 months ago

But in 1.18.7, we do trigger a full push: https://github.com/istio/istio/blob/1.18.7/pilot/pkg/serviceregistry/kube/controller/endpointcontroller.go#L59-L65. I have not checked the NDS generator of 1.18; it may be different.

So first I want to know if you have enabled the Istio DNS proxy @owais

owais commented 8 months ago

How do you see it still connecting to old IPs? Did you make use of the Istio local DNS proxy?

I see the old IPs in the endpoint list, but more importantly, I see them in the Istio access logs, which is a clear indication. I just tested with my reproducible example shared above and this is the state of the client proxy after the server pods were restarted:

  1. server pods have following IPs: (screenshot)
  2. client pod has following endpoints (old IPs from deleted pods): (screenshot)
  3. client app is throwing errors: (screenshot)
  4. client proxy access logs show it is trying to connect to old IPs from deleted pods: (screenshot)
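For reference, the state above can be gathered with commands along these lines (a sketch only; the pod name and app label are placeholders):

```sh
# 1. Server pod IPs as Kubernetes sees them
kubectl -n istio-test get pods -l app=istio-grpc-test-server -o wide

# 2. Endpoints the client-side proxy has for the server cluster
istioctl proxy-config endpoints <client-pod> -n istio-test \
  --cluster "outbound|9090||istio-grpc-test-server.istio-test.svc.cluster.local"

# 3./4. Sidecar access logs showing which upstream IPs are being dialed
kubectl -n istio-test logs <client-pod> -c istio-proxy --tail=50
```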

So first I want to know if you have enabled the Istio DNS proxy @owais

I do not think so but let me see what this is and if it is enabled for our setup.

owais commented 8 months ago

I confirmed that we do not have the DNS proxy enabled. There is no reference to DNS proxy config in our Helm values, and inspecting the istiod config maps shows only one place (istio-sidecar-injector) where ISTIO_META_DNS_CAPTURE is mentioned, and it is set to false.

(screenshot of the istio-sidecar-injector config with ISTIO_META_DNS_CAPTURE set to false)
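For completeness, this was checked along these lines (a sketch; assumes the default istio-system install):

```sh
# DNS capture setting in the sidecar injector template
kubectl -n istio-system get configmap istio-sidecar-injector -o yaml | grep -i ISTIO_META_DNS_CAPTURE

# If DNS proxying were enabled, it would typically also show up on the sidecar container
kubectl -n <app-namespace> get pod <client-pod> -o yaml | grep -i ISTIO_META_DNS_CAPTURE
```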

hzxuzhonghu commented 8 months ago

Thanks, will take a closer look

hzxuzhonghu commented 8 months ago

The cluster outbound|9090||istio-grpc-test-server.istio-test.svc.cluster.local type is ORIGINAL_DST; I am not sure why we can push EDS to the client.
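(For anyone reproducing this, the cluster type can be checked with something like the following; the pod name is a placeholder.)

```sh
istioctl proxy-config cluster <client-pod> -n istio-test \
  --fqdn istio-grpc-test-server.istio-test.svc.cluster.local -o json | grep -m1 '"type"'
# on the affected setup this shows: "type": "ORIGINAL_DST"
```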

hzxuzhonghu commented 8 months ago

I find that if we remove one endpoint from https://github.com/owais/istio-grpc-headless-test/blob/main/client.yaml#L39, then we only get two endpoints with istioctl proxy-config ep for the gRPC server. If we make the client never access the server, then we get no endpoints for the grpc-test server.

@ramaraochavali Do you know the magic of LbPolicy = Cluster_CLUSTER_PROVIDED?

hzxuzhonghu commented 8 months ago

@owais With a sidecar, the connection from the gRPC client terminates at Envoy, so when the server instance is restarted that connection is still there. This is different from the no-sidecar case, where the client would reconnect to the new instance.

Here the headless service uses an ORIGINAL_DST cluster, and I guess it depends on the original destination address to generate the cluster load assignment in Envoy; that is why we can get endpoints for outbound|9090||istio-grpc-test-server.istio-test.svc.cluster.local. And if the client never accesses the service, only the cluster is there, but no endpoints.

To work around this, you can recreate the connection when if r, err := c.SayHello(ctx, &pb.HelloRequest{Name: server_address}); err != nil { returns the context deadline exceeded error.
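A rough sketch of that workaround in Go, assuming the client owns its *grpc.ClientConn and uses the generated Greeter stubs from the repro (the import path and client constructor below are placeholders):

```go
package main

import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/credentials/insecure"
	"google.golang.org/grpc/status"

	pb "example.com/istio-grpc-test/proto" // placeholder import for the repro's generated gRPC stubs
)

// callWithReconnect makes one SayHello call and, if it fails with the errors seen in this
// issue (DEADLINE_EXCEEDED / UNAVAILABLE against a stale endpoint), rebuilds the connection
// and retries once, so a connection pinned to a deleted pod IP does not wedge the client.
// It returns the (possibly new) connection the caller should keep using.
func callWithReconnect(conn *grpc.ClientConn, serverAddress string) (*grpc.ClientConn, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	if _, err := pb.NewGreeterClient(conn).SayHello(ctx, &pb.HelloRequest{Name: serverAddress}); err != nil {
		s, ok := status.FromError(err)
		if !ok || (s.Code() != codes.DeadlineExceeded && s.Code() != codes.Unavailable) {
			return conn, err // not the failure mode being worked around here
		}

		log.Printf("call failed (%v); recreating connection to %s", err, serverAddress)
		conn.Close() // drop the pooled connection that still points at the old pod IP
		newConn, dialErr := grpc.Dial(serverAddress, grpc.WithTransportCredentials(insecure.NewCredentials()))
		if dialErr != nil {
			return nil, dialErr
		}

		retryCtx, retryCancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer retryCancel()
		_, err = pb.NewGreeterClient(newConn).SayHello(retryCtx, &pb.HelloRequest{Name: serverAddress})
		return newConn, err
	}
	return conn, nil
}
```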

ramaraochavali commented 8 months ago

Original DST cluster endpoints get built as you access different IPs via the cluster; unused hosts are cleaned up at periodic intervals.

This is a known problem with gRPC/HTTP clients, which may end up with stale addresses because of connection pooling in Envoy: https://github.com/envoyproxy/envoy/issues/19458. For TCP, each downstream connection has exactly one equivalent upstream connection, so the problem is not there: when the upstream connection is closed, the downstream connection is automatically closed.

owais commented 8 months ago

If we make the client never access the server, then we get no endpoints for the grpc-test server.

Right. Also, once the system goes into this state, sending a regular curl request "discovers" the new pod IP and it becomes visible in the endpoints. HTTP/2 requests, however, still continue to go to the old stale IPs.
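For the record, the "curl discovers it" observation was along these lines (a sketch; pod, container, and hostname are placeholders):

```sh
# A plain HTTP/1.1 request from the client pod toward the restarted server pod...
kubectl -n istio-test exec <client-pod> -c <app-container> -- \
  curl -sv http://server-0.istio-grpc-test-server.istio-test.svc.cluster.local:9090/

# ...after which the new pod IP shows up in the client proxy's endpoint list,
# while existing HTTP/2 (gRPC) connections keep going to the old IP.
istioctl proxy-config endpoints <client-pod> -n istio-test \
  --cluster "outbound|9090||istio-grpc-test-server.istio-test.svc.cluster.local"
```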

owais commented 8 months ago

To work around this, you can recreate the connection when if r, err := c.SayHello(ctx, &pb.HelloRequest{Name: server_address}); err != nil { returns the context deadline exceeded error.

Thanks, this was my hunch as well, but unfortunately in our case this isn't a practical solution. We expect to roll out Istio to thousands of services, and app teams expect the mesh to be more or less transparent to the applications. Moreover, we have a bunch of 3rd-party services that we'd have to fork to make such changes to their code.

Our workaround for now is to mark these ports as TCP, which makes Istio work as expected, but then we lose all the nice L7 traffic management and HTTP/gRPC-level observability.
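Concretely, that workaround is just forcing the port to be treated as opaque TCP, roughly like this (a sketch; names are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: istio-grpc-test-server
  namespace: istio-test
spec:
  clusterIP: None
  selector:
    app: istio-grpc-test-server
  ports:
  - name: tcp-grpc           # a "tcp-" port-name prefix also forces TCP when appProtocol is unset
    port: 9090
    appProtocol: tcp         # opaque TCP: per-connection proxying, but no L7 routing/telemetry
```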

owais commented 8 months ago

Thanks for the explanation @ramaraochavali and @hzxuzhonghu. This was my hunch as well, and while the explanation helps make sense of what is going on, as an end user this still feels like a bug in Istio looking in from the outside. Istio claims to support gRPC, but this actually would break a lot of gRPC services.

Maybe the bug is technically in Envoy, though, if it does not remove lost connections from the pool it holds.

Until Envoy mitigates this somehow, would it make more sense for Istio to configure Envoy differently so as not to trigger this issue? At the very least, I think Istio should document that gRPC/HTTP2 + headless services are NOT supported and that the workaround is to use TCP for these cases. Perhaps something in the FAQ, like we used to have for services like ES/ZooKeeper, would be good enough.

hzxuzhonghu commented 8 months ago

It could introduce other regressions if we make the gRPC/HTTP2 + headless service cluster type EDS rather than ORIGINAL_DST.

owais commented 8 months ago

I understand. I'll leave the technical decision making to the maintainers.

While it'd be nice to see gRPC work with headless services, I think it is much more important to call this out in the docs until it is fixed. The docs should mention that this use case is not supported by Istio and explain what the workaround is (to set appProtocol=tcp).

As a side note, it'd be amazing to have an istioctl command like analyze that would look at your deployed services and point out services that might not work with istio and what to do to fix them.

ramaraochavali commented 8 months ago

as an end user, this still feels like a bug in istio looking in from the outside. Istio claims to support gRPC but this actually would break most gRPC services.

I agree it is a bug and needs to be fixed in Envoy. Another workaround is to create a ServiceEntry for each endpoint, with a pod selector using the labels that k8s gives each headless pod. Then it automatically becomes an EDS cluster.
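For illustration, that would be roughly one ServiceEntry per StatefulSet pod, selected by the per-pod label; a sketch under that assumption (host, names, and port are illustrative):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: istio-grpc-test-server-0
  namespace: istio-test
spec:
  hosts:
  - server-0.istio-grpc-test-server.istio-test.svc.cluster.local
  location: MESH_INTERNAL
  resolution: STATIC
  ports:
  - number: 9090
    name: grpc
    protocol: GRPC
  workloadSelector:
    labels:
      statefulset.kubernetes.io/pod-name: server-0   # label Kubernetes adds to each StatefulSet pod
```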

hzxuzhonghu commented 8 months ago

Alternative: if we generate a cluster per pod, there is no need to create a ServiceEntry for each pod, and the behavior is the same.

owais commented 8 months ago

I'm pretty sure I tried creating ServiceEntries as well and it did not work, but I spent very little time with it, so I might not have tried everything or even deployed them correctly. I will try to test a ServiceEntry tomorrow if I find some cycles.

istio-policy-bot commented 5 months ago

🚧 This issue or pull request has been closed due to not having had activity from an Istio team member since 2024-02-22. If you feel this issue or pull request deserves attention, please reopen the issue. Please see this wiki page for more information. Thank you for your contributions.

Created by the issue and PR lifecycle manager.