envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0

AWS STS - STS cluster is destroyed on CDS update #35022

Closed: michaelfinch closed this issue 2 months ago

michaelfinch commented 3 months ago

When following method 3 of the credentials documentation (https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/aws_request_signing_filter#credentials), the STS cluster that the filter creates is registered as a dynamic cluster. If delta xDS is not in use, each CDS update carries the full set of clusters, and any dynamic cluster missing from it is removed. The STS cluster is therefore deleted on the next CDS update that Envoy receives, which causes AWS request signing to fail.

Repro steps:

  1. Set envoy.reloadable_features.use_http_client_to_fetch_aws_credentials to true.
  2. Confirm that AWS request signing works as expected.
  3. Without using delta xDS, deliver a CDS update to Envoy.
  4. Confirm that AWS request signing is now broken.
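
To make the failure mode concrete, here is a minimal Python sketch of the behavior described above. It is a toy model, not Envoy code: `ClusterManagerModel` and its methods are hypothetical names, and `backend_a` and `xds_cluster` are made-up cluster names. It assumes only that a state-of-the-world CDS update replaces the whole set of dynamic clusters, while a delta update removes only what it names explicitly.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ClusterManagerModel:
    """Toy model of a cluster table (hypothetical; not Envoy's ClusterManager)."""

    static: set[str] = field(default_factory=set)   # from bootstrap, never touched by CDS
    dynamic: set[str] = field(default_factory=set)  # added at runtime

    def add_dynamic(self, name: str) -> None:
        """Register a cluster at runtime, as the filter does for the STS cluster."""
        self.dynamic.add(name)

    def apply_sotw_cds(self, clusters: set[str]) -> set[str]:
        """State-of-the-world update: the response is the full desired set,
        so every dynamic cluster absent from it is removed."""
        removed = self.dynamic - clusters
        self.dynamic = set(clusters)
        return removed

    def apply_delta_cds(self, added: set[str], removed: set[str]) -> set[str]:
        """Delta update: only explicitly listed clusters are removed."""
        actually_removed = self.dynamic & removed
        self.dynamic = (self.dynamic | added) - removed
        return actually_removed


# Steps 1-2: the filter has created its internal cluster at runtime.
sotw = ClusterManagerModel(static={"xds_cluster"})
sotw.add_dynamic("sts_token_service_internal")

# Steps 3-4: a full-state CDS update that does not mention the STS cluster.
print(sorted(sotw.apply_sotw_cds({"backend_a"})))  # ['sts_token_service_internal']

# Same scenario with delta semantics: the STS cluster survives.
delta = ClusterManagerModel(static={"xds_cluster"})
delta.add_dynamic("sts_token_service_internal")
delta.apply_delta_cds(added={"backend_a"}, removed=set())
print("sts_token_service_internal" in delta.dynamic)  # True
```

In this model a cluster in `static` is never affected by either update type, which is the property the STS cluster would need.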

Here are the debug logs seen when a CDS update is received:

[2024-07-02 17:38:06.614][15][debug][init] [external/envoy/source/common/init/watcher_impl.cc:31] init manager Cluster sts_token_service_internal destroyed
[2024-07-02 17:38:06.614][15][debug][upstream] [external/envoy/source/common/upstream/cluster_manager_impl.cc:859] removing cluster sts_token_service_internal
[2024-07-02 17:38:06.614][15][debug][upstream] [external/envoy/source/common/upstream/cluster_manager_impl.cc:863] removing TLS cluster sts_token_service_internal
[2024-07-02 17:38:06.614][15][debug][upstream] [external/envoy/source/common/upstream/cds_api_helper.cc:66] cds: remove cluster 'sts_token_service_internal'
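
For anyone trying to spot the same thing in a larger log capture, a small script along these lines may help. It is only a sketch: the function name `cds_removed_clusters` is hypothetical, and the regular expressions assume the exact debug log format shown above (bracketed timestamp first, then a `cds: remove cluster '<name>'` message), which may differ with a custom log format.

```python
import re

# Leading "[timestamp]" field of a log line in the format shown above.
TIMESTAMP = re.compile(r"^\[(?P<ts>[^\]]+)\]")
# Message emitted when CDS removes a cluster, as in the last log line above.
CDS_REMOVE = re.compile(r"cds: remove cluster '(?P<name>[^']+)'")


def cds_removed_clusters(lines):
    """Return (timestamp, cluster_name) pairs for each CDS removal found."""
    found = []
    for line in lines:
        line = line.strip()
        removal = CDS_REMOVE.search(line)
        if removal is None:
            continue
        ts = TIMESTAMP.match(line)
        found.append((ts.group("ts") if ts else None, removal.group("name")))
    return found


sample = [
    "[2024-07-02 17:38:06.614][15][debug][upstream] "
    "[external/envoy/source/common/upstream/cluster_manager_impl.cc:859] "
    "removing cluster sts_token_service_internal",
    "[2024-07-02 17:38:06.614][15][debug][upstream] "
    "[external/envoy/source/common/upstream/cds_api_helper.cc:66] "
    "cds: remove cluster 'sts_token_service_internal'",
]
print(cds_removed_clusters(sample))
# [('2024-07-02 17:38:06.614', 'sts_token_service_internal')]
```

If `sts_token_service_internal` shows up in the result, the signing failures line up with a CDS update rather than with the credentials themselves.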

The STS cluster is created here: https://github.com/envoyproxy/envoy/blob/main/source/extensions/common/aws/credentials_provider_impl.cc#L150. The function is named createInternalClusterStatic, but I confirmed in the config dump that the cluster is actually created as a dynamic cluster. Is there a way to create a static cluster that won't get wiped out by CDS updates?

htuch commented 3 months ago

@derekargueta @suniltheta @mattklein123 @marcomagdy @nbaws

nbaws commented 3 months ago

will grab this one

nbaws commented 2 months ago

@michaelfinch #35062 will address this issue. Thank you for reporting it :)

michaelfinch commented 2 months ago

Thank you for the quick turnaround!

michaelfinch commented 2 months ago

Addressed by https://github.com/envoyproxy/envoy/pull/35062