envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0

ODCDS is unable to work with EDS when handling multiple upstream endpoints #36373

Open whutwhu opened 1 month ago

whutwhu commented 1 month ago

The first request from an upstream endpoint triggers our on-demand xDS flow (VHDS → CDS → EDS), subscribing to the necessary resources. However, when multiple upstream endpoints are involved, only the first one works correctly. For subsequent endpoints, EDS is never initiated, so the ClusterLoadAssignment resource for those clusters is not retrieved.

The issue arises in OdCdsApiImpl::updateOnDemand:

  1. Initially, the status is StartStatus::NotStarted.
  2. NewGrpcMuxImpl::addWatch is called, adding the cluster to the watch and subscription interest lists.
  3. The first endpoint successfully triggers the EDS flow.
  4. The status changes to StartStatus::InitialFetchDone.
  5. For subsequent endpoints, GrpcSubscriptionImpl::requestOnDemandUpdate is called while the status is StartStatus::InitialFetchDone, but it only updates the subscription interest list, not the watch interest list.
  6. As a result, the cluster doesn’t trigger EDS for these additional endpoints.

Instead of calling GrpcSubscriptionImpl::requestOnDemandUpdate after the first CDS request, GrpcSubscriptionImpl::updateResourceInterest should be called to ensure NewGrpcMuxImpl::addWatch updates the watch interest list and triggers the EDS flow.
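To make the sequence above easier to follow, here is a minimal toy model of it. This is only a sketch: ToyMux, ToyOdCds, clustersWithEds, and the always_add_watch flag are hypothetical names invented for illustration and are not Envoy's real classes or signatures. It also assumes, as the steps above suggest, that a cluster only reaches EDS when its name is in the watch interest list, and it sets the status to InitialFetchDone immediately rather than after the first response arrives.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical stand-ins that mirror the roles
// described in this issue. They are not Envoy's real implementation.
enum class StartStatus { NotStarted, InitialFetchDone };

struct ToyMux {
  std::set<std::string> watch_interest;        // names that have a watch
  std::set<std::string> subscription_interest; // names requested from the server

  // Models the addWatch path: both interest lists are updated.
  void addWatch(const std::string& name) {
    watch_interest.insert(name);
    subscription_interest.insert(name);
  }

  // Models the requestOnDemandUpdate path from step 5: only the
  // subscription interest list is updated.
  void requestOnDemandUpdate(const std::string& name) {
    subscription_interest.insert(name);
  }

  // Assumption of this sketch: a cluster proceeds to EDS only if it is watched.
  bool wouldTriggerEds(const std::string& name) const {
    return watch_interest.count(name) > 0;
  }
};

class ToyOdCds {
public:
  // always_add_watch = false models the reported behavior;
  // always_add_watch = true models the proposed change.
  explicit ToyOdCds(bool always_add_watch) : always_add_watch_(always_add_watch) {}

  void updateOnDemand(const std::string& cluster) {
    if (status_ == StartStatus::NotStarted || always_add_watch_) {
      mux_.addWatch(cluster);
      // Simplification: the real status change happens after the first fetch.
      status_ = StartStatus::InitialFetchDone;
    } else {
      mux_.requestOnDemandUpdate(cluster);
    }
  }

  bool wouldTriggerEds(const std::string& cluster) const {
    return mux_.wouldTriggerEds(cluster);
  }

private:
  bool always_add_watch_;
  StartStatus status_ = StartStatus::NotStarted;
  ToyMux mux_;
};

// Requests each cluster on demand, in order, and returns those that would reach EDS.
std::vector<std::string> clustersWithEds(bool always_add_watch,
                                         const std::vector<std::string>& requested) {
  ToyOdCds odcds(always_add_watch);
  for (const auto& name : requested) {
    odcds.updateOnDemand(name);
  }
  std::vector<std::string> result;
  for (const auto& name : requested) {
    if (odcds.wouldTriggerEds(name)) {
      result.push_back(name);
    }
  }
  return result;
}
```

In this model, calling clustersWithEds(false, {"a", "b", "c"}) returns only "a", matching the reported symptom, while clustersWithEds(true, {"a", "b", "c"}) returns all three. Whether the real fix should route every request through the watch path or handle it differently is a question for the PR review.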

I would appreciate feedback from the community. We are currently using Envoy version 1.25.7, but it appears this issue may also persist in the most recent code.

kyessenov commented 1 month ago

CC @adisuissa I am not sure I fully understand what "multiple upstream endpoints" means. I thought ODCDS/VHDS operate on downstream requests, not upstream endpoints.

whutwhu commented 1 month ago

Hi @kyessenov, this page explains downstream and upstream in the context of Envoy: https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/intro/terminology

Regarding "multiple upstream endpoints": we can only subscribe to the full set of resources for one service. If multiple services need to be accessed, only the first one works without issues.

whutwhu commented 1 month ago

@kyessenov @adisuissa Could you provide any guidance or share concerns and suggestions? If there are no issues, I’ll go ahead and submit a PR for your review. Let me know if anything comes up.

whutwhu commented 4 weeks ago

I've opened a PR: https://github.com/envoyproxy/envoy/pull/36527