confidential-containers / trustee

Attestation and Secret Delivery Components
Apache License 2.0

Extend KBS to provide the resources required to create an encrypted overlay network #396

Open cclaudio opened 1 month ago

cclaudio commented 1 month ago

This is a follow-up on the encrypted overlay network support we proposed in the May 9, 2024 meeting to secure data in transit between CoCo PODs. We'd like to use this issue to discuss the plugin interface design and which implementation option seems more appropriate.

Background

The overlay network would be created using Nebula.

In Nebula, a node can join the overlay network only if it provides a valid credential generated by the Nebula Certificate Authority (CA). The node can request a credential from the Nebula CA by providing the IP address and the name it wants to have in the Nebula overlay network. Later, the node can use the returned certificate and key to join the Nebula overlay network.

For example, in a deployment with 10 CoCo PODs, the Nebula overlay network would have 10 CoCo PODs and one Nebula Lighthouse (assumed to already exist). Each POD would have two network interfaces: one created by Kubernetes and another one created by Nebula. The traffic sent through the Nebula interface is automatically encrypted by Nebula.

Proposed design related to KBS

Plugin interface requirements

Implementation options for the plugin interface

  1. Re-use the existing KBS resource URI /kbs/v0/resource/<repository>/<type>/<tag>

    Pros: May have less duplicated code. Re-uses the KBS resource URI.
    Cons: It may look like a hack. The "plugin" repository name may need to be reserved.

    The get-resource() function would do the RCAR dance as necessary and then, if repository=plugin, the other two fields would be interpreted as /kbs/v0/resource/plugin/<plugin_name>/<plugin_action>. The <plugin_action> would be forwarded to the appropriate <plugin_name>. The plugin interface could be implemented as RepositoryConfig::Plugin, but for that we would need to properly select between Plugin and LocalFs when a KBS get-resource call is received. As a RepositoryConfig::Plugin, it could be initialized in the same function as LocalFs.

    If the <plugin_action> requires multiple parameters:

    • 1a. We could use an HTTP query string (introduced by the question mark symbol) to expand the <plugin_action>, e.g. /kbs/v0/resource/plugin/<plugin_name>/<plugin_action>?ip=10.10.10.10&name=pod1. If that is not supported by the KBS protocol v0, we could try to replace the ? with something non-standard such as ;.
    • 1b. If option "1a" does not work, we could extend the KBS protocol v0 to support HTTP query strings and then increase the protocol version to something like v0.1.
  2. Add a new URI for plugins e.g. /kbs/v0.1/plugin/<plugin_name>/<plugin_action>

    Pros: Has a proper URI for plugins. The plugin workflow is better isolated from get-resource.
    Cons: The KBS protocol version may need to be increased to something like v0.1. May have some duplicated code compared to get-resource.

    The plugin URI is not supposed to change the KBS state, so it would be accessible through the HTTP GET method. The new plugin-handler() function would do the RCAR as needed (similar to get-resource) and then forward the <plugin_action> to the proper <plugin_name>. The plugin interface would have its own PluginConfig enum for plugin initialization rather than a new RepositoryConfig enum value. Additional parameters for <plugin_action> could be provided using an HTTP query string.

  3. Other. Any thoughts?
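
To make option 1a more concrete, here is a minimal Rust sketch of splitting such a URI into a plugin name, an action, and query parameters. It is not taken from the KBS code base: `PluginRequest` and `parse_plugin_request` are hypothetical names, the reserved `plugin` repository is the assumption from option 1, and percent-decoding is deliberately left out.

```rust
use std::collections::BTreeMap;

/// Hypothetical parsed form of a plugin request under option 1a.
#[derive(Debug, PartialEq)]
pub struct PluginRequest {
    pub plugin_name: String,
    pub plugin_action: String,
    pub params: BTreeMap<String, String>,
}

/// Parses `/kbs/v0/resource/plugin/<plugin_name>/<plugin_action>?k=v&...`.
/// Returns `None` when the URI does not use the reserved `plugin` repository
/// or does not have exactly three path segments after `resource/`.
/// Assumption: values are used as-is, without percent-decoding.
pub fn parse_plugin_request(uri: &str) -> Option<PluginRequest> {
    // Separate the path from the optional query string.
    let (path, query) = match uri.split_once('?') {
        Some((p, q)) => (p, q),
        None => (uri, ""),
    };
    let rest = path.strip_prefix("/kbs/v0/resource/")?;
    let mut segments = rest.split('/');
    let repository = segments.next()?;
    let name = segments.next()?;
    let action = segments.next()?;
    if repository != "plugin"
        || name.is_empty()
        || action.is_empty()
        || segments.next().is_some()
    {
        return None;
    }
    // Collect `key=value` pairs; pairs without `=` are ignored.
    let params = query
        .split('&')
        .filter(|pair| !pair.is_empty())
        .filter_map(|pair| pair.split_once('='))
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect();
    Some(PluginRequest {
        plugin_name: name.to_string(),
        plugin_action: action.to_string(),
        params,
    })
}
```

With a made-up plugin name and action, `parse_plugin_request("/kbs/v0/resource/plugin/nebula/credential?ip=10.10.10.10&name=pod1")` would yield the name `nebula`, the action `credential`, and the two parameters, while an ordinary resource URI would return `None` and fall through to the regular get-resource path.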

mkulke commented 1 month ago

Thanks for the writeup, the plugin architecture sounds intriguing and something that could evolve into a notion of a "virtual resource" as an abstraction over all sorts of confidential resources. At the moment, pragmatically, such a resource simply maps 1:1 to a file on a local fs, without much regard to (distributed) state management yet.

re 3) Since this is a rather invasive change and probably not something that could be introduced in the short term, have you considered implementing a specific API service to generate those ad-hoc secrets? It could be enlightened to support KBS attestation-tokens (by using KBS code as a library) to support remote attestation via RCAR.

Xynnn007 commented 1 month ago

This is an interesting point. Supporting Nebula CA seems to represent a broader need for KBS, that is, after a successful RCAR protocol, the trustee needs to return a certificate to the client. The endorsement of this certificate comes from the TCB of user/trustee. Then Nebula could use the certificate to work as key-exchange/TLS cert to encrypt data flow between pods. For other scenarios, for example, this cert could be used as a "client identity" to access resources.

fitzthum commented 1 month ago

It seems likely that we'll need to support more than just the localfs backend in the long term. Currently we allow people to swap out localfs for something else at build time, but nothing else is supported and you can only have one backend. We will probably want some framework for supporting different types of resources at the same time. As @cclaudio suggests we could also extend the KBS protocol to cover things other than resources, but I think it's probably easiest/more scalable to treat everything as a resource.

If we can settle on a framework, we could use that as a basis for building PKI. I'm not sure if Nebula will be able to piggy-back on a generic PKI implementation or if it would still need its own plugin.

portersrc commented 1 month ago

xynnn007, "For other scenarios, for example, this cert could be used as a "client identity" to access resources."

(We had a similar train of thought when looking at this; could be appealing, but we've tabled it for now.)

mkulke, "re 3) ... Implementing a specific API service to generate those ad-hoc secrets ... enlightened to support remote attestation via RCAR."

Can you clarify a bit here? This looks non-trivial to me, at least compared with cclaudio's extension of the KBS resource structs and functions. At first I imagined you meant that the same token from the KBS could be reused to fetch more resources from this new nebula service; but maybe you mean that this new nebula service would just have similar RCAR facilities (and a new session and token) when a workload wants to connect to it and join the secure mesh; or maybe it's something else you have in mind?

fitzthum, "It seems likely that we'll need to support more than just the localfs backend in the long term. Currently we allow people to swap out localfs for something else at build time, but nothing else is supported and you can only have one backend. We will probably want some framework for supporting different types of resources at the same time."

There's a curiosity in the code related to this exact comment. "Repository" takes on two different meanings. There's a "Repository", which is a trait/interface here; but there's also a "repository" passed to get_resource in the URI (see here), and which ultimately is used as part of the path in the local-fs here. fitzthum's raising a good point: We can only support local-fs right now, but how would we support different types of resources? It seems this support is almost baked into KBS already, except that the repository that's grabbed from the URI needs to actually be used to select a different repository (not some folder in the local-fs's path); and executing repository.read should read from that user-provided repository. This is in the spirit of cclaudio's implementation option (1), and potentially without the need to burn the first URI field with the string "plugin" (and instead just use "nebula" or whatever is desired there).

fitzthum commented 1 month ago

@portersrc really good point about the repository. Maybe we should just use the repository field of the resource URI to select which repository backend you want to use. We could assign default to be localfs.

If we don't do that we should rename the Repository trait because it is potentially confusing.
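
As a rough illustration of using the repository field to pick a backend, the following Rust sketch dispatches on the first element of the repository/type/tag triplet. It is only a sketch under assumptions: this `Repository` trait is a simplified stand-in whose signature may differ from the real KBS trait, `Backends`, `LocalFs`, and `NebulaPlugin` are toy types, and mapping the name `default` to the local-fs backend is one possible reading of the suggestion above.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the KBS `Repository` trait (signature assumed).
pub trait Repository {
    fn read(&self, resource_type: &str, tag: &str) -> Result<Vec<u8>, String>;
}

/// Toy local-fs backend: returns the path it would read instead of file contents.
pub struct LocalFs {
    pub root: String,
}

impl Repository for LocalFs {
    fn read(&self, resource_type: &str, tag: &str) -> Result<Vec<u8>, String> {
        Ok(format!("{}/{}/{}", self.root, resource_type, tag).into_bytes())
    }
}

/// Toy plugin backend that produces its result on demand.
pub struct NebulaPlugin;

impl Repository for NebulaPlugin {
    fn read(&self, resource_type: &str, tag: &str) -> Result<Vec<u8>, String> {
        Ok(format!("nebula:{}:{}", resource_type, tag).into_bytes())
    }
}

/// Registry that maps the repository field of the URI to a backend.
pub struct Backends {
    by_name: HashMap<String, Box<dyn Repository>>,
}

impl Backends {
    pub fn new() -> Self {
        Backends { by_name: HashMap::new() }
    }

    pub fn register(&mut self, name: &str, backend: Box<dyn Repository>) {
        self.by_name.insert(name.to_string(), backend);
    }

    /// Selects the backend by repository name, then delegates the read to it.
    pub fn get_resource(
        &self,
        repository: &str,
        resource_type: &str,
        tag: &str,
    ) -> Result<Vec<u8>, String> {
        let backend = self
            .by_name
            .get(repository)
            .ok_or_else(|| format!("unknown repository: {}", repository))?;
        backend.read(resource_type, tag)
    }
}
```

Registering `LocalFs` under `default` and `NebulaPlugin` under `nebula` would route `default/key/1` to the local file system and `nebula/...` to the plugin, without reserving a separate `plugin` keyword in the URI.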

Xynnn007 commented 1 month ago

xynnn007, "For other scenarios, for example, this cert could be used as a "client identity" to access resources."

(We had a similar train of thought when looking at this; could be appealing, but we've tabled it for now.)

Sounds good. Maybe we could start this thread again. Logically, once initdata is implemented, we could have a way to inject something into the guest to work as a client id for registering a certificate. Hopefully we could converge on the design of this : )

@portersrc really good point about the repository. Maybe we should just use the repository field of the resource URI to select which repository backend you want to use. We could assign default to be localfs.

If we don't do that we should rename the Repository trait because it is potentially confusing.

I agree. Storage, KVStore or something else might be better. Another good point is that we could add more KBS configuration so that different resource paths link to different backends.

mkulke commented 1 month ago

mkulke, "re 3) ... Implementing a specific API service to generate those ad-hoc secrets ... enlightened to support remote attestation via RCAR."

Can you clarify a bit here? This looks non-trivial to me, at least compared with cclaudio's extension of the KBS resource structs and functions. At first I imagined you meant that the same token from the KBS could be reused to fetch more resources from this new nebula service; but maybe you mean that this new nebula service would just have similar RCAR facilities (and a new session and token) when a workload wants to connect to it and join the secure mesh; or maybe it's something else you have in mind?

Ah, I meant that there could be a lightweight nebula-specific kbs that re-uses kbs_protocol and has its own notion of resources.

That said, I like the idea of making "/$resources" abstract instead of prescribing a hierarchy that attempts to map to all potential use cases. However, I don't think it's a good idea to map resource uris to implementation details like storage backend; that should be of no concern to the API client (unless this is a feature of the API to pick storage tiers, but I wouldn't recommend going there).

portersrc commented 1 month ago

mkulke, "...a lightweight nebula-specific kbs that re-uses kbs_protocol and has its own notion of resources..."

Got it. Yes, this could work. Still, KBS already has the notion of multiple backing stores (even though it only supports local-fs right now), and the integration between nebula resources and KBS' get-resource endpoint is relatively straightforward, so I feel there's a cleaner path right now that doesn't involve splitting out a separate nebula-specific KBS.

mkulke, "I like the idea of making "/$resources" abstract ... However, I don't think it's a good idea to map resource uris to implementation details like storage backend"

Good point. Does this mean that you prefer the idea of using a /plugin URI, or are you fine with using anything for the first token in the URI triplet? More importantly, I agree that coupling the URI string with the actual backing store could be bad, but it's not clear to me how one would nicely implement this.... What would you switch on? If it's not repository from repository/name/tag, then is it repository/name? Etc. Another way to phrase this: "What rust would we want to write for this?" --a function that maps request URIs to backing stores?

mkulke commented 1 month ago

Good point. Does this mean that you prefer the idea of using a /plugin URI, or are you fine with using anything for the first token in the URI triplet? More importantly, I agree that coupling the URI string with the actual backing store could be bad, but it's not clear to me how one would nicely implement this.... What would you switch on? If it's not repository from repository/name/tag, then is it repository/name? Etc. Another way to phrase this: "What rust would we want to write for this?" --a function that maps request URIs to backing stores?

If we want to include something like Nebula, we would need to cover the use case in the API at least. Currently, the path triplet /repo/name/tag is passed to rego policy evaluation in KBS, so we would have to consider dynamic parts (?ip=1.2.3.4) too or generalize the resource descriptor somehow.

Also if Nebula resources need a TTL, we need a component managing the lifecycle of resources, ...

portersrc commented 1 month ago

Currently, the path triplet /repo/name/tag is passed to rego policy evaluation in KBS, so we would have to consider dynamic parts (?ip=1.2.3.4) too or generalize the resource descriptor somehow.

Yes, I think both, maybe: pull extra params and put them into the resource descriptor. Make sure those get passed to the policy engine here. And then it's up to the user if they want a rego policy that restricts those params in some way.
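
A minimal sketch of what "pull extra params and put them into the resource descriptor" could look like in Rust follows. Everything here is an assumption for illustration: `ResourceDescriptor`, `policy_input`, and the `param.` key prefix are hypothetical, and `toy_policy_allows` is only a Rust stand-in for a decision that in KBS would be made by a user-supplied rego policy.

```rust
use std::collections::BTreeMap;

/// Hypothetical resource descriptor: the usual triplet plus extra parameters.
#[derive(Debug, Clone, PartialEq)]
pub struct ResourceDescriptor {
    pub repository: String,
    pub resource_type: String,
    pub tag: String,
    pub params: BTreeMap<String, String>,
}

impl ResourceDescriptor {
    /// Flattens the descriptor into key/value pairs that could be handed
    /// to a policy engine as input. Extra parameters get a `param.` prefix
    /// so they cannot shadow the triplet fields.
    pub fn policy_input(&self) -> BTreeMap<String, String> {
        let mut input = BTreeMap::new();
        input.insert("repository".to_string(), self.repository.clone());
        input.insert("type".to_string(), self.resource_type.clone());
        input.insert("tag".to_string(), self.tag.clone());
        for (key, value) in &self.params {
            input.insert(format!("param.{}", key), value.clone());
        }
        input
    }
}

/// Toy stand-in for a user policy: if an `ip` parameter is present,
/// it must start with the allowed textual prefix. Requests without
/// an `ip` parameter are allowed.
pub fn toy_policy_allows(input: &BTreeMap<String, String>, allowed_ip_prefix: &str) -> bool {
    match input.get("param.ip") {
        Some(ip) => ip.starts_with(allowed_ip_prefix),
        None => true,
    }
}
```

The point of the sketch is only that the parameters travel with the descriptor, so a policy that wants to restrict them can, and a policy that ignores them keeps working unchanged.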

Also if Nebula resources need a TTL, we need a component managing the lifecycle of resources, ...

I think that's right, but I didn't imagine that as (necessarily) being part of KBS or its get-resource endpoint. The nebula-lighthouse (outside of the scope of this issue, I'd say) is a service for the mesh and where cert rotation could be triggered, for example.

@mkulke just on the question of a more generic interface that "solves" the backing store question, do you have any strong opinions there? So, setting aside nebula, how would you prefer to support multiple backends/plugins? As plugins that could be added in addition to the repository backend (presumably triggered by a reserved word somewhere in the resource URI), or as different backends (where some part of the resource URI would always correspond to the backend)?

mkulke commented 1 month ago

I think we should distinguish between the API, that is conceptual considerations (e.g. kbs-prescribed hierarchies/ontologies) and storage implementation details (where/how is the data stored). I'd ignore the latter for the time being.

E.g. on S3 buckets there is a flexible notion of "path": it's just part of an object's key that happens to contain "/"s, so a user can organize their secrets in a conventional hierarchy, but we don't have to reflect that in the storage.

I would not recommend ignoring HTTP semantics. If we want to do arbitrary RPC, we should use gRPC.

E.g. in the suggestion /kbs/v0/resource/plugin/<plugin_name>/<plugin_action>?ip=10.10.10.10&name=pod1, we'd multiplex "plugin actions" over a generic GET request that produces a side effect. This might break assumptions that HTTP middlewares/clients make about RESTful APIs.

I could imagine a specific handler per store, could be /v1/resource/{store}/{accessor} which could look like:

In this model each store would implement their own parameter extractor and provide the parameters in some normalized form to the policy engine.
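
One way to picture a per-store parameter extractor is the Rust sketch below. It is an assumption-laden illustration, not a proposal for the actual KBS types: the `StoreHandler` trait, the `SecretsStore` and `NebulaStore` types, and the normalized keys (`path`, `action`, `ip`, `name`) are all hypothetical.

```rust
use std::collections::BTreeMap;

/// Normalized parameters as they might be passed to the policy engine.
pub type Params = BTreeMap<String, String>;

/// Hypothetical per-store handler: each store decides how to interpret
/// its accessor and query pairs and reports them in a normalized form.
pub trait StoreHandler {
    fn extract_params(&self, accessor: &str, query: &[(&str, &str)]) -> Result<Params, String>;
}

/// Toy secrets store: the accessor is a free-form path, no query allowed.
pub struct SecretsStore;

impl StoreHandler for SecretsStore {
    fn extract_params(&self, accessor: &str, query: &[(&str, &str)]) -> Result<Params, String> {
        if !query.is_empty() {
            return Err("secrets store takes no query parameters".to_string());
        }
        let mut params = Params::new();
        params.insert("path".to_string(), accessor.to_string());
        Ok(params)
    }
}

/// Toy Nebula store: requires the `ip` and `name` query parameters.
pub struct NebulaStore;

impl StoreHandler for NebulaStore {
    fn extract_params(&self, accessor: &str, query: &[(&str, &str)]) -> Result<Params, String> {
        let mut params = Params::new();
        params.insert("action".to_string(), accessor.to_string());
        for &required in &["ip", "name"] {
            // Take the first occurrence of each required key.
            let value = query
                .iter()
                .find(|(k, _)| *k == required)
                .map(|(_, v)| *v)
                .ok_or_else(|| format!("missing query parameter: {}", required))?;
            params.insert(required.to_string(), value.to_string());
        }
        Ok(params)
    }
}
```

Under this shape, validation of store-specific input stays inside the store, and the policy engine only ever sees a flat key/value map.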

The other option would be to generalize the current scheme:

GET|POST|DELETE /v1/resource/{store}/{path}?{query_params} {body}

We'd extract the path and have to prescribe some semantics for query_params and body, which we can then provide to the policy_engine. Not every store would need to implement all verbs.

I think I favor the first option, with specialized handlers for individual stores. Custom stores could be included at build time behind a feature flag from a kbs-contrib folder or something.
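
A hedged Rust sketch of "not every store would need to implement all verbs" is a trait whose methods default to a "method not allowed" error. The `Store` trait, `StoreError`, and the two in-memory stores are hypothetical names invented for this illustration; a real handler would also involve attestation and policy checks that are omitted here.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum StoreError {
    MethodNotAllowed,
    NotFound,
}

/// Hypothetical store interface: every verb is rejected unless overridden.
pub trait Store {
    fn get(&self, _path: &str) -> Result<Vec<u8>, StoreError> {
        Err(StoreError::MethodNotAllowed)
    }
    fn post(&mut self, _path: &str, _body: &[u8]) -> Result<(), StoreError> {
        Err(StoreError::MethodNotAllowed)
    }
    fn delete(&mut self, _path: &str) -> Result<(), StoreError> {
        Err(StoreError::MethodNotAllowed)
    }
}

/// Toy store that only supports GET.
#[derive(Default)]
pub struct ReadOnlyStore {
    pub entries: HashMap<String, Vec<u8>>,
}

impl Store for ReadOnlyStore {
    fn get(&self, path: &str) -> Result<Vec<u8>, StoreError> {
        self.entries.get(path).cloned().ok_or(StoreError::NotFound)
    }
}

/// Toy store that supports GET, POST, and DELETE in memory.
#[derive(Default)]
pub struct MemoryStore {
    pub entries: HashMap<String, Vec<u8>>,
}

impl Store for MemoryStore {
    fn get(&self, path: &str) -> Result<Vec<u8>, StoreError> {
        self.entries.get(path).cloned().ok_or(StoreError::NotFound)
    }
    fn post(&mut self, path: &str, body: &[u8]) -> Result<(), StoreError> {
        self.entries.insert(path.to_string(), body.to_vec());
        Ok(())
    }
    fn delete(&mut self, path: &str) -> Result<(), StoreError> {
        self.entries.remove(path).map(|_| ()).ok_or(StoreError::NotFound)
    }
}
```

A read-only store then only overrides `get`, and any POST or DELETE against it fails uniformly without store-specific code.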

fitzthum commented 1 month ago

E.g. in the suggestion /kbs/v0/resource/plugin/<plugin_name>/<plugin_action>?ip=10.10.10.10&name=pod1, we'd multiplex "plugin actions" over a generic GET request that produces a side effect. This might break assumptions that HTTP middlewares/clients make about RESTful APIs.

I think plugin_action might be the wrong term here, because it implies that the plugin can do more than just get a resource. I don't think that's the intention. I think using a query string with a GET request is common in REST and does not imply that the plugin is stateful.

I am not a Nebula expert, but I think this key generation functionality that the plugin in question would expose is not stateful. It simply requires more input parameters than would fit in the resource uri, hence the query string. Nebula has another component, the lighthouse, that keeps track of connections, but this plugin is just for generating certs. (There is state involved for keeping track of the private key and such but it is not updated by requests from guests)

Overall, this plugin or virtual resource idea would still need to conform to the semantics of get_resource, meaning that it would only be used for getting resources (although some would be dynamically generated). I don't think this proposal is trying to support stateful plugins. Keep in mind that today there is no support for POSTing resources to any of the backends from the guest. This sort of configuration can only be done by the KBS operator.

I could imagine a specific handler per store, could be /v1/resource/{store}/{accessor}

I'm not sure I understand the details here. You are suggesting adding an additional element to the path (after resource, but before the resource_uri) that would point to a different "store?" It seems like this would make things more complicated in the guest. How would the guest know which store contains which resources?

The resource URI is supposed to be a universal locator of a resource. It seems like the plugin/store information should be included in that.

In this model each store would implement their own parameter extractor and provide the parameters in some normalized form to the policy engine.

Integration with the policy engine seems trivial for any of the approaches suggested.

mkulke commented 1 month ago

I think plugin_action might be the wrong term here, because it implies that the plugin can do more than just get a resource. I don't think that's the intention. I think using a query string with a GET request is common in REST and does not imply that the plugin is stateful.

Ohh, thanks for clearing this up! It's actually mentioned in the original description, I somehow skipped over that. If we don't have to think about state in this context, it should be relatively straightforward: we just need a facility to parameterize a resource that's more flexible than the repository/name/tag triplet, right?

I'm not sure I understand the details here. You are suggesting adding an additional element to the path (after resource, but before the resource_uri) that would point to a different "store?" It seems like this would make things more complicated in the guest. How would the guest know which store contains which resources?

The idea was to turn the repository segment into a selector for a given store/plugin (.../resource/secrets, .../resource/nebula, ...) while the remaining path and query URL segments don't follow a strict hierarchy or schema (like paths in S3 buckets or OCI images). The alternative would be that each store/plugin maintains its own URL scheme.