envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0

Supporting "Delegated Identity API" from SPIRE via SDS #19756

Open dastbe opened 2 years ago

dastbe commented 2 years ago

Title: Supporting "Delegated Identity API" from SPIRE via SDS

Description:

Hi! Recently, SPIRE added support for a Delegated Identity API, which lets a trusted workload retrieve certificates on behalf of other workloads on the host. From looking at the current XDS integration with SPIRE, the only way I can see to make delegated identity work with a shared host-level proxy would be to have one cluster per client-identity x destination pair and work backwards from that through client-aware routes, etc. While that sounds possible in theory, we don't want an online dependency between the configuration synchronized to Envoy and the current state of workloads deployed to a host. We'd also be worried about the cardinality of client-identity x destination over time.
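To make the cardinality concern concrete, here is a rough sketch of what "one cluster per client-identity x destination" could look like with SDS secrets named after SPIFFE IDs. The cluster names, hostname, SPIFFE IDs, and the `spire_agent` cluster (assumed to be defined elsewhere and to point at the SPIRE agent socket) are all illustrative assumptions, not taken from a real deployment.

```yaml
# Sketch only: names, addresses, and SPIFFE IDs are hypothetical.
# Two client workloads calling one destination already need two clusters.
clusters:
- name: payments_as_client_a
  type: STRICT_DNS
  connect_timeout: 5s
  load_assignment:
    cluster_name: payments_as_client_a
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address: { address: payments.example.internal, port_value: 8443 }
  transport_socket:
    name: envoy.transport_sockets.tls
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
      common_tls_context:
        tls_certificate_sds_secret_configs:
        - name: "spiffe://example.org/client-a"  # the only per-client difference
          sds_config:
            resource_api_version: V3
            api_config_source:
              api_type: GRPC
              transport_api_version: V3
              grpc_services:
              - envoy_grpc:
                  cluster_name: spire_agent
- name: payments_as_client_b
  type: STRICT_DNS
  connect_timeout: 5s
  load_assignment:
    cluster_name: payments_as_client_b
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address: { address: payments.example.internal, port_value: 8443 }
  transport_socket:
    name: envoy.transport_sockets.tls
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
      common_tls_context:
        tls_certificate_sds_secret_configs:
        - name: "spiffe://example.org/client-b"
          sds_config:
            resource_api_version: V3
            api_config_source:
              api_type: GRPC
              transport_api_version: V3
              grpc_services:
              - envoy_grpc:
                  cluster_name: spire_agent
```

With N client identities and M destinations this grows to N x M clusters, and the control plane has to add or remove entries every time a workload lands on or leaves the host.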

Assuming there's not something obvious we're missing, do y'all have any thoughts on how Envoy could be extended to support this use case? Or is there somebody I should bug more synchronously about this?

Relevant links:

[0] Delegated Identity API
[1] Example XDS integration with SPIRE as-is

mattklein123 commented 2 years ago

I don't think I fully understand the use case. Is the idea here that you have an Envoy level workload that is trusted to get certs for a particular client and you would want to apply that during client routing?

I vaguely feel like there is probably some way to handle this via dynamic forward proxy and/or loading custom TLS contexts based on the request, but I would need to look into this more.

cc @lizan @qiwzhang @kyessenov who might have some ideas or other people to ping.

dastbe commented 2 years ago

> Is the idea here that you have an Envoy level workload that is trusted to get certs for a particular client and you would want to apply that during client routing?

Not sure what you mean by Envoy level; I would phrase it as "...have an Envoy that is proxying outbound requests on behalf of multiple clients". Given that we have a way of identifying the client workload from the inbound connection, we would like to select the client cert dynamically based on the client identity.

mattklein123 commented 2 years ago

> we would like to select the client cert dynamically based on the client identity.

Yeah, OK, that makes sense. My intuition remains the same: somehow this could be built into the DFP (dynamic forward proxy) mechanism, where we would potentially fetch a cert the same way we do DNS resolutions. Others may have other ideas.
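For reference, this is roughly how the existing dynamic forward proxy is wired up today: an HTTP filter and a cluster share a DNS cache, and hosts are resolved on demand per request. The fragment below is a sketch of the two relevant pieces of a config (the filter chain and the cluster list), not a complete bootstrap; the cache and cluster names are arbitrary. The suggestion above would amount to fetching a cert on demand in a similar way, which this config does not do.

```yaml
# Fragment 1: HTTP filter chain. The DFP filter pauses the request while the
# Host header is resolved through the shared DNS cache.
http_filters:
- name: envoy.filters.http.dynamic_forward_proxy
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.dynamic_forward_proxy.v3.FilterConfig
    dns_cache_config:
      name: dynamic_forward_proxy_cache_config
      dns_lookup_family: V4_ONLY
- name: envoy.filters.http.router
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

# Fragment 2: cluster. Hosts are added on demand from the same DNS cache,
# so the cache config must match the one used by the filter.
clusters:
- name: dynamic_forward_proxy_cluster
  lb_policy: CLUSTER_PROVIDED
  cluster_type:
    name: envoy.clusters.dynamic_forward_proxy
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.clusters.dynamic_forward_proxy.v3.ClusterConfig
      dns_cache_config:
        name: dynamic_forward_proxy_cache_config
        dns_lookup_family: V4_ONLY
```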

kyessenov commented 2 years ago

We have a similar request: dynamically selecting a client TLS context based on per-connection metadata, to avoid the explosion of "client identity x destination" that was mentioned. There is a precedent for selecting on endpoint labels (https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/cluster/v3/cluster.proto#envoy-v3-api-msg-config-cluster-v3-cluster-transportsocketmatch). In our case, we'd want dynamic metadata provided by a custom filter to drive the selection.
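As a sketch of that precedent, `transport_socket_matches` picks a transport socket per endpoint by comparing each entry's `match` against the endpoint's `envoy.transport_socket_match` metadata. The cluster name, address, label key, SPIFFE ID, and the `sds_server` cluster below are hypothetical.

```yaml
# Sketch only: names, address, and label values are hypothetical.
clusters:
- name: backend
  type: STATIC
  connect_timeout: 5s
  load_assignment:
    cluster_name: backend
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address: { address: 10.0.0.10, port_value: 8443 }
        # Endpoint label consulted by transport_socket_matches.
        metadata:
          filter_metadata:
            envoy.transport_socket_match:
              identity: client-a
  transport_socket_matches:
  - name: tls_as_client_a
    match:
      identity: client-a
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        common_tls_context:
          tls_certificate_sds_secret_configs:
          - name: "spiffe://example.org/client-a"
            sds_config:
              resource_api_version: V3
              api_config_source:
                api_type: GRPC
                transport_api_version: V3
                grpc_services:
                - envoy_grpc:
                    cluster_name: sds_server
  # Empty match: fallback for endpoints without a matching label.
  - name: plaintext_default
    match: {}
    transport_socket:
      name: envoy.transport_sockets.raw_buffer
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.raw_buffer.v3.RawBuffer
```

Note that the selection here is keyed on metadata attached to the upstream endpoint, so it describes the destination rather than the downstream client. That is the gap: the request in this thread is for the same kind of match to run against per-connection dynamic metadata.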

dastbe commented 2 years ago

@kyessenov do you need selection from a fixed set of certs, or something fully dynamic? One of the caveats is that we need to be able to do the equivalent of SDS templating, à la:

tls_context:
  common_tls_context:
    tls_certificate_sds_secret_configs:
      - name: "spiffe://${client-identity}"
        sds_config:
          ...

where client-identity is extracted from the inbound connection, which I think is why @mattklein123 suggested DFP. I'll take a look at that and see if there's a sensible extension.

kyessenov commented 2 years ago

The set of certs is dynamic, based on the set of workloads that Envoy forwards traffic on behalf of. We're using service accounts, so it's not excessively dynamic. Instead of DFP, we rely on an ORIGINAL_DST cluster in an internal listener, with the destination populated from custom metadata. Without the selection, we need as many listeners as there are service accounts, each with its own ORIGINAL_DST cluster.
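A rough sketch of the "one ORIGINAL_DST cluster per service account" shape, showing only the cluster side. The service account names, secret names, and the `sds_server` cluster are hypothetical, and the internal listeners and the metadata-driven destination are omitted; this is an illustration of the duplication, not our actual config.

```yaml
# Sketch only: one ORIGINAL_DST cluster per service account, identical apart
# from the cluster name and the client cert secret. Names are hypothetical.
clusters:
- name: outbound_as_sa_frontend
  type: ORIGINAL_DST
  lb_policy: CLUSTER_PROVIDED
  connect_timeout: 5s
  transport_socket:
    name: envoy.transport_sockets.tls
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
      common_tls_context:
        tls_certificate_sds_secret_configs:
        - name: sa-frontend-cert
          sds_config:
            resource_api_version: V3
            api_config_source:
              api_type: GRPC
              transport_api_version: V3
              grpc_services:
              - envoy_grpc:
                  cluster_name: sds_server
- name: outbound_as_sa_checkout
  type: ORIGINAL_DST
  lb_policy: CLUSTER_PROVIDED
  connect_timeout: 5s
  transport_socket:
    name: envoy.transport_sockets.tls
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
      common_tls_context:
        tls_certificate_sds_secret_configs:
        - name: sa-checkout-cert
          sds_config:
            resource_api_version: V3
            api_config_source:
              api_type: GRPC
              transport_api_version: V3
              grpc_services:
              - envoy_grpc:
                  cluster_name: sds_server
```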

We would prefer that clusters do not drain connections when the set of selectable certs grows. The templated SDS secret name sounds appropriate, but it would also need an "if" condition.

lambdai commented 2 years ago

In addition to @kyessenov's comment:

Our case is less dynamic, because the number of unique tls_context entries is not a concern. Our concern is the flexibility of the matching mechanism that chooses the best known tls_context.

@dastbe My impression is that for you it's either impossible or not economical to enumerate client-identity, so you end up needing match criteria as well, as in our case.

mattklein123 commented 2 years ago

I would also point you to https://github.com/envoyproxy/envoy/pull/18723, which adds dynamic loading of clusters. We do similar dynamic loading of scoped routes in the on_demand filter. I think one other option is to somehow have a filter that can dynamically load a TLS context over SDS and then make that available to further filters via matching.
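For context, a minimal sketch of on-demand cluster loading via the on_demand HTTP filter, assuming ADS is configured and that the route names a cluster Envoy has not loaded yet. The 5s timeout is an arbitrary choice. A filter that loads a TLS context over SDS would presumably follow a similar pause-fetch-resume pattern, but no such filter is shown here.

```yaml
# Sketch only: the on_demand filter pauses the request, fetches the missing
# cluster through on-demand CDS, then resumes routing.
http_filters:
- name: envoy.filters.http.on_demand
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.on_demand.v3.OnDemand
    odcds:
      source:
        ads: {}
      timeout: 5s
- name: envoy.filters.http.router
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
```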