spiffe / spire

The SPIFFE Runtime Environment
https://spiffe.io
Apache License 2.0

RFC: Allowing Delegates to still use SPIRE Agent attestation #5019

Closed by bleggett 3 weeks ago

bleggett commented 5 months ago

In Istio, with the ambient mode we've added, we now have a node proxy written in Rust that will proxy traffic for workloads on that node.

This node proxy is obviously running in a different cgroup than the actual workloads it is privileged to act On-Behalf-Of (OBO).

The DelegatedIdentity API is the best fit for this - the node proxy can attest the workloads it is acting OBO, the SPIRE Agent can attest the node proxy itself, and the node proxy can shunt the attested workload properties over to the SPIRE Agent via the admin socket gRPC API. If that set of workload properties matches a workload registration, we get a cert back.
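To make the "matches a workload registration" step concrete, here is a minimal sketch of selector matching. It assumes that an entry matches when every selector on the entry is present in the set of properties attested for the workload. The `RegistrationEntry` type, the `matching_entries` helper, and the example SPIFFE IDs are hypothetical names used for illustration, not SPIRE's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistrationEntry:
    """Hypothetical stand-in for a workload registration entry."""
    spiffe_id: str
    selectors: frozenset  # e.g. {"k8s:ns:default", "k8s:sa:web"}


def matching_entries(attested_selectors, entries):
    """Return the entries whose selectors are all in the attested set.

    Assumption: an entry with no selectors never matches.
    """
    attested = frozenset(attested_selectors)
    return [e for e in entries if e.selectors and e.selectors <= attested]


entries = [
    RegistrationEntry("spiffe://example.org/web",
                      frozenset({"k8s:ns:default", "k8s:sa:web"})),
    RegistrationEntry("spiffe://example.org/db",
                      frozenset({"k8s:ns:default", "k8s:sa:db"})),
]

# Selectors the delegate attested for the workload it is acting OBO.
attested = ["k8s:ns:default", "k8s:sa:web", "k8s:pod-name:web-1"]
matched = matching_entries(attested, entries)
```

In this sketch the delegate is trusted to report `attested` honestly, which is exactly the responsibility the rest of this issue is about.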

(I am excluding node attestation entirely from this issue, and assuming that is 100% the province of the SPIRE Agent - I am only talking about workload attestation)

That's all good.

The problem there is that as a delegate, we are now responsible for 100% of the workload attestation - the SPIRE Agent no longer is, and will in fact be unable to attest the workload the authorized delegate is acting OBO.

This means that our Rust authorized delegate has to either reimplement all the attestor plugins the SPIRE Agent supports (k8s, sigstore, etc.) in a native library, as a kind of parallel implementation to SPIRE Agent's Go plugin framework, or reimplement just a subset of them (also as a library/parallel impl). Either way, we effectively lose the ability to lean on SPIRE Agent's pluggable workload attestation framework once we begin using the DelegatedIdentity API.

This makes some amount of sense from a security perspective, otherwise you end up with a circular (somewhat pointless) trust model:

  1. Why attest the workload behind the delegated authority at all?
  2. Because you don't trust the delegated authority?
  3. But who would tell you what workload to attest?
  4. The delegated authority.

But the problem is that this gets us into a spot where we cannot use the SPIRE Agent's built-in workload attestation at all. We would actually like to use SPIRE Agent's attestation framework to validate the workload behind the delegate before giving the delegate the cert. That way, we don't end up having to reimplement the entire workload attestation framework that SPIRE Agent already has.

We would be interested in adding a solution to this ourselves in SPIRE, but want to ask:

tl;dr the basic problem is that the SPIRE DelegatedIdentity API requires trusted delegates to effectively re-implement their own workload attestation stack, which roughly parallels the one SPIRE Agent already ships. We would rather not do that, and would instead like to use the SPIRE Agent workload attestation stack directly via gRPC, rather than build our own or write our own workload attestation libraries.

youngnick commented 5 months ago

Cilium's use of the DelegatedIdentity API has basically the same concerns, so I'm also interested here.

bleggett commented 5 months ago

I am going to take a stab at modifying the plugins to support this shortly, to see how much churn it might introduce.

kfox1111 commented 4 months ago

Was thinking about this some and discussing it on slack...

What if we modified the regular Unix socket Workload API to accept an optional pidfd? If a pidfd is passed, and the client attached to the Workload API socket is in the delegation list, use the pid associated with the pidfd for validation rather than the pid of the delegate. Then I think all the rest of the logic would work out?
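The decision described above could be sketched roughly as follows. This is only an illustration of the idea, not an existing SPIRE code path: `pid_to_attest`, its parameters, and the example SPIFFE IDs are hypothetical, and the sketch assumes the pid has already been resolved from the passed pidfd.

```python
def pid_to_attest(caller_pid, caller_spiffe_id, authorized_delegates,
                  target_pid=None):
    """Pick which pid the agent should run workload attestation against.

    caller_pid:           pid of the process connected to the socket
    caller_spiffe_id:     identity the agent attested for that caller
    authorized_delegates: SPIFFE IDs allowed to act on behalf of others
    target_pid:           pid resolved from an optional pidfd, if one was sent
    """
    if target_pid is None:
        # Ordinary Workload API call: attest the caller itself.
        return caller_pid
    if caller_spiffe_id not in authorized_delegates:
        # Only listed delegates may ask for another process to be attested.
        raise PermissionError("caller is not an authorized delegate")
    return target_pid


delegates = {"spiffe://example.org/node-proxy"}

# The node proxy (pid 100) asks on behalf of a workload (pid 4242).
chosen = pid_to_attest(100, "spiffe://example.org/node-proxy", delegates,
                       target_pid=4242)
```

Rejecting the request outright when a non-delegate passes a pidfd, rather than silently falling back to the caller's pid, is a design choice made for this sketch; the comment above does not specify that behavior.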

rturner3 commented 4 months ago

The maintainers have been discussing this proposal and will provide some feedback after we've had a chance to think about some of the security implications.

amartinezfayo commented 4 months ago

Thank you @bleggett for opening this issue. While we were discussing this today in our maintainers sync, we thought that it would be a good idea if you can join one instance of the SIG-SPIRE meeting to have a discussion about this. We have a lot of questions and it may be better to go through them in a meeting, where you could present your proposal and we can provide feedback. Would that work for you?

bleggett commented 4 months ago

Yep, that makes sense, and I'm getting freed up a bit in a few days and this is next in my queue, so it's good timing.

I will shoot for the one next week and try to have a doc ready.

evan2645 commented 4 months ago

Thank you @bleggett, that would be awesome. Please drop a note in the SPIRE Slack channel if/when you're ready to present, and ping @dfeldman so we can be sure the right folks are on the call and you're added to the agenda 🙏

bleggett commented 4 months ago

https://docs.google.com/document/d/1A1oQHuR6z3bvQtXN17r2EwBr5lazGGPbUPkxoURAAh4/edit

doc (mostly a codification of above) that I intend to run thru there

bleggett commented 4 months ago

SIG-SPIRE discussion outcomes here: https://docs.google.com/document/d/1A1oQHuR6z3bvQtXN17r2EwBr5lazGGPbUPkxoURAAh4/edit#heading=h.to9i1s83kgpn

I think we are all leaning towards pid as the most portable option, provided we can be explicit about the responsibilities of the delegate around validating pid consistency (SPIRE Agent should already take care of this within its boundary).
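One way a delegate might guard against pid reuse while a workload is being attested is to compare the process start time before and after the attestation. The sketch below is an assumption about how that could look on Linux, not what SPIRE or the linked doc prescribes: it reads field 22 (`starttime`) of a `/proc/<pid>/stat` line, and `read_stat` and `attest` are hypothetical callables supplied by the caller.

```python
def parse_starttime(stat_line):
    """Extract starttime (field 22) from a /proc/<pid>/stat line.

    The comm field (field 2) may contain spaces and parentheses, so split
    only the text after the last ')'. That text begins at field 3, which
    puts field 22 at index 19.
    """
    _, _, rest = stat_line.rpartition(")")
    return int(rest.split()[19])


def attest_with_pid_check(pid, read_stat, attest):
    """Run attest(pid) and fail if the pid changed owner in the meantime."""
    before = parse_starttime(read_stat(pid))
    selectors = attest(pid)
    after = parse_starttime(read_stat(pid))
    if before != after:
        raise RuntimeError("pid %d was reused during attestation" % pid)
    return selectors


def read_proc_stat(pid):
    """Read the raw stat line for a pid (Linux only)."""
    with open("/proc/%d/stat" % pid) as f:
        return f.read()
```

A caller would pass `read_proc_stat` as `read_stat` in production and a stub in tests. A pidfd, as suggested earlier in the thread, avoids this class of race more directly on kernels that support it.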

bleggett commented 4 months ago

I have started hacking at this and since the API changes are probably the most controversial bit, I threw up a WIP PR for those: https://github.com/spiffe/spire-api-sdk/pull/58

Looking for feedback/opinions there if people have any.

bleggett commented 3 weeks ago

This is merged along with doc update, so I am closing this as complete - feel free to reopen or raise issues if there are followups.