envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0
24.82k stars 4.77k forks source link

Generic xDS resource manager #24773

Open adisuissa opened 1 year ago

adisuissa commented 1 year ago

Title: Generic xDS resource manager

Description: Currently components that use xDS resources directly interact with the subscription messages, and in part need to adhere to the xDS-protocol and the transport-layer. This could lead to inconsistencies in the ways that different xDS services behave (for example, #13009, #16350, #12061).

We propose xDS-Manager, that will be the entry point for xDS-resource subscriptions that will:

The design doc can be found here.

This proposal may address or improve the following issues (among others):

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

github-actions[bot] commented 1 year ago

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.

adisuissa commented 1 year ago

reopening an removing stale.

adisuissa commented 1 year ago

Current status: need to estimate the memory requirements when adding the cache layer for all xDS resources.

markdroth commented 1 year ago

I would not expect this approach to significantly increase memory usage. Keep in mind that Envoy must already be storing most of this data, since it needs it for the admin interface. I think it's just currently storing the data in a decentralized way (in each data plane component) instead of storing it only once in the xDS transport protocol cache. Given that, I would expect this approach would actually reduce memory usage a bit, because problems like #13009 mean that it is currently storing multiple copies of the EDS data, and that would be eliminated with this centralized cache.

kyessenov commented 1 year ago

The stress test would be something like an inline Wasm binary. I agree that a central config store should help with memory long term, but may add overhead initially.

adisuissa commented 1 year ago

Specifically for EDS, the issue is that Envoy currently doesn't keep the data, but uses it to create an instance of a mutable cluster object that each thread has its own copy of. If this stays as is, I don't think there's a way other than keeping the original EDS config, which will add require more memory.

I think that at the end of this work, the current config-providers function will be replaced from storing the resources (will be in the cache) to keeping a "consistent-view" that a worker-threads need.

markdroth commented 1 year ago

Seems like it should be possible to refactor the EDS code to keep the resource state (in whatever form we use for it) as a shared_ptr<> that is shared across the threads, to avoid storing it multiple times.