Open pflynn-virtru opened 6 months ago
We should also support key splitting explicitly somehow
Are we going to change the "L1L" magic number?
@patmantru The "L1L" contains the version which will change if the specification is revised. A revision would include a binary format change, or perhaps even a reinterpretation of an existing field.
see https://github.com/opentdf/spec/tree/main/schema/nanotdf#3311-magic-number--version
@dmihalcik-virtru key splitting will not be covered by this ADR, nor by any ADR in the foreseeable future
@patmantru As a fun aside, L1L also results in a BASE64 encoding that starts with "TDF..."
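A quick way to check the aside above, using only the Python standard library. Because the magic number is exactly three bytes, it maps to exactly four Base64 characters regardless of what follows in the header.

```python
import base64

# The three-byte NanoTDF magic number + version discussed above.
magic = b"L1L"

# Three input bytes encode to exactly four Base64 characters, no padding.
encoded = base64.b64encode(magic).decode("ascii")
print(encoded)
```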
@pflynn-virtru This might be another option. Have you considered leveraging the URI or URN pattern to fetch KAS information from the platform itself? This way, an administrator could control the host, port, and path that the KAS maps to.
For example, if we keep the KAS hostname, port, and path in a NanoTDF or ZTDF, an administrator could never really tear down those records without risking breaking existing TDFs. This approach also allows the platform to dictate which KASs are trusted.
Example: urn:kas:kas-1.virtru.com:kid:123
We should also consider using this format for ZTDF to address similar issues there.
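A minimal sketch of the indirection proposed above, assuming the example URN shape `urn:kas:<name>:kid:<kid>`. The registry dictionary, its endpoint value, and the function names are hypothetical; in practice the platform would own this mapping.

```python
from typing import NamedTuple


class KasRef(NamedTuple):
    kas_name: str  # logical KAS name stored in the TDF
    kid: str       # key identifier stored in the TDF


def parse_kas_urn(urn: str) -> KasRef:
    """Split a URN of the assumed form urn:kas:<name>:kid:<kid>."""
    parts = urn.split(":")
    if len(parts) != 5 or parts[0] != "urn" or parts[1] != "kas" or parts[3] != "kid":
        raise ValueError(f"not a KAS URN: {urn!r}")
    return KasRef(kas_name=parts[2], kid=parts[4])


# Hypothetical platform-side registry. An administrator can change the
# host, port, or path here without touching TDFs that were already created.
KAS_REGISTRY = {
    "kas-1.virtru.com": "https://kas.example.com:8443/kas",
}


def resolve(urn: str, registry: dict) -> tuple:
    """Return (current endpoint, kid) for a KAS URN; unknown names are rejected."""
    ref = parse_kas_urn(urn)
    if ref.kas_name not in registry:
        raise KeyError(f"untrusted or unknown KAS: {ref.kas_name}")
    return registry[ref.kas_name], ref.kid


endpoint, kid = resolve("urn:kas:kas-1.virtru.com:kid:123", KAS_REGISTRY)
```

The TDF only ever stores the logical name, so the registry also acts as the list of trusted KASs.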
Two options exist: "Resource Locator format: URN" and "Resource Locator format: URL query parameter".
I have added the issues with the URL case. I have also added the ZTDF change that would be needed.
Would one of these approaches be better suited to situations where OpenTDF is deployed in a multi-KAS environment, or where IT wants to migrate obscure endpoints from kas[1-100].example.com to more descriptive endpoints like mkt[1-5].kas.example.com, eng[1-5].kas.example.com, exec[1-5].kas.example.com, etc.?
One bit of experience I have is with Virtru's Secure Reader product. We offered the ability to tie policy to custom domains (CNAMEs). As the product aged, we learned that customers wanted the ability to change those, whether due to a company rebrand, an acquisition, or an IT policy change that required domains to comply with a universal org policy.
Binding emails directly to the company CNAME meant that the company would have to hold that domain indefinitely or risk breaking access to old emails. I would encourage learning from this experience and making sure we reduce this risk.
For instance, managing one domain indefinitely is much less burdensome than N domains (or subdomains) per deployed KAS.
> Would one of these approaches be better suited to situations where OpenTDF is deployed in a multi-KAS environment, or where IT wants to migrate obscure endpoints from kas[1-100].example.com to more descriptive endpoints like mkt[1-5].kas.example.com, eng[1-5].kas.example.com, exec[1-5].kas.example.com, etc.?
In this case I think a URI or URN approach would be better suited, with the core platform holding the information needed to connect to KAS.
Imagine if you needed to change the KAS endpoint 5 times. I think a user would have to maintain a new CNAME record every time it changed.
@jrschumacher
> One bit of experience I have is with Virtru's Secure Reader product. We offered the ability to tie policy to custom domains (CNAMEs). As the product aged, we learned that customers wanted the ability to change those, whether due to a company rebrand, an acquisition, or an IT policy change that required domains to comply with a universal org policy.
> Binding emails directly to the company CNAME meant that the company would have to hold that domain indefinitely or risk breaking access to old emails. I would encourage learning from this experience and making sure we reduce this risk.
> For instance, managing one domain indefinitely is much less burdensome than N domains (or subdomains) per deployed KAS.
This is my main concern. We're shifting complexity to the infrastructure. For instance, if a port number other than 443 is used and then changed, does that mean those TDFs become inaccessible unless the infrastructure is constantly maintained? This could lead to significant challenges in ensuring continuous access to TDFs that were previously generated.
The current solution feels brittle the more we dig into it.
After the Architecture meeting the following has been decided:
Rough implementation impact:
Before I forget, I want to note this down. KAS is really an interface to the policy keys. There should be no reason I can't load the keys into another KAS and decrypt my data, as long as my entitlements match the resource attributes. We don't tie a key to a KAS in any way.
@pflynn-virtru this is a really well written ADR and the back and forth discussion in an open forum like this between you, @strantalis and @jrschumacher is amazing.
I will pile on and just say I agree with the comments from both Ryan and Sean. We need to be more flexible to infrastructure changes, particularly with something as important as accessing keys.
NanoTDF KAS resource locator path and key identifier
Context
Problem
The NanoTDF specification requires enhancements to support a key identifier and multiple ways to access KAS.
See https://github.com/opentdf/spec/tree/main/schema/nanotdf#3312-kas
Example body with protocol values:
How to access a KAS
Append /v2/rewrap, /kas/v2/rewrap, or /rewrap to the URL from step 2 (varies by SDK)
Which KAS key
As we introduce multiple KAS keys and perform key rotations, we need a key identifier (kid) used in creating the NanoTDF so a rewrap operation can use the same key.
Policy Key Access
This section allows for an ephemeral key other than the Payload key to encrypt the policy.
See https://github.com/opentdf/spec/tree/main/schema/nanotdf#342323-optional-policy-key-access
Goal
kid
kid or public key
kid on decryption
kid on encryption
Related
Included here because a possible version change to NanoTDF specification could influence this decision.
Decision
Add or Use Key Identifier section
See https://github.com/opentdf/platform/pull/1199
See https://github.com/opentdf/spec/tree/main/schema/nanotdf#342323-optional-policy-key-access
Rationale: Only KAS needs to know the public key or kid. It has no impact on how to access KAS.
Changes:
Add new section to Header - Key identifier 32B - 133B
Recommended name "Payload Key Access" section
🟨 Similar to Policy Key Access
🟥 NanoTDF version update
Add Key Identifier to Policy
See https://github.com/opentdf/platform/pull/1197
Rationale: KAS presents multiple keys available to a client. The client determines which key based on the use case policy.
Changes:
kid field with compute and format specification
Example attribute
urn:opentdf:kas:ec:secp256r1:43:51:43:a1:b5:fc:8b:b7:0a:3a:a9:b1:0f:66:73:a8
urn:ietf:params:oauth:jwk-thumbprint:sha-256:NzbLsXh8uDCcd-6MNwXF4W_7noWXFZAfHkxZsRGC9Xs
🟩 Policy section is relatively large
🟩 Policy section is processed by KAS
🟩 Policy section has binding
🟨 NanoTDF specification clarification
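The second example attribute above uses the JWK thumbprint URI form. As a rough illustration of how such a kid could be computed, the sketch below follows the RFC 7638 thumbprint procedure (required members only, sorted, no whitespace, SHA-256, base64url without padding). The function name and the sample key coordinates are illustrative only, and nothing here is taken from the linked pull request.

```python
import base64
import hashlib
import json

# Members that RFC 7638 requires in the hash input, per key type.
REQUIRED_MEMBERS = {
    "EC": ("crv", "kty", "x", "y"),
    "RSA": ("e", "kty", "n"),
}


def jwk_thumbprint_urn(jwk: dict) -> str:
    """Build a urn:ietf:params:oauth:jwk-thumbprint:sha-256 URI for a public JWK."""
    members = REQUIRED_MEMBERS[jwk["kty"]]
    # Canonical form: required members only, keys sorted, no whitespace.
    canonical = json.dumps(
        {name: jwk[name] for name in members},
        separators=(",", ":"),
        sort_keys=True,
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    b64url = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "urn:ietf:params:oauth:jwk-thumbprint:sha-256:" + b64url


# Sample P-256 public key; the coordinate values are placeholders for illustration.
sample_jwk = {
    "kty": "EC",
    "crv": "P-256",
    "x": "MKBCTNIcKUSDii11ySs3526iDZ8AiTo7Tu6KPAqv7D4",
    "y": "4Etl6SRW2YiLUrN5vfvVHuhp7x8PxltmWWlbbM4IFyM",
    "use": "enc",  # optional members are ignored by the thumbprint
}
kid_urn = jwk_thumbprint_urn(sample_jwk)
```

Because optional members and member order do not affect the result, KAS and any SDK would derive the same kid from the same public key.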
Add Key Identifier to KAS Resource Locator
See Specification https://github.com/opentdf/spec/pull/40
See Implementation https://github.com/opentdf/platform/pull/1222
Rationale: KAS has multiple keys available to a client. The client determines which key based on the use case policy.
Changes:
Add a recommended way to add a kid to a policy, with a default implementation in KAS and SDK
🟨 Resource Locator is 257B
🟨 NanoTDF specification clarification
🟥 KAS must parse a freeform URL fragment created by SDK
🟥 No binding, attack vector for public key thumbprint
Resource Locator format: URN
Rationale: Introducing a URN type that includes both domain and identifier enhances the ability to uniquely and efficiently reference resources.
Changes:
urn:opentdf:kas:<domain>:<identifier>
.virtru.com:01edksqtx9cfzzt1y9sm57h3yq
<identifier> in ULID 16 bytes
urn
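To make the URN option concrete, here is a small sketch that splits `urn:opentdf:kas:<domain>:<identifier>` and decodes a 26-character ULID into its 16-byte binary form. The domain `kas.example.com` is a placeholder, since the example above is truncated, and the function names are hypothetical.

```python
# Crockford Base32 alphabet used by ULID (no I, L, O, U).
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
URN_PREFIX = "urn:opentdf:kas:"


def parse_kas_urn(urn: str) -> tuple:
    """Return (domain, identifier) from urn:opentdf:kas:<domain>:<identifier>."""
    if not urn.startswith(URN_PREFIX):
        raise ValueError(f"unexpected URN prefix: {urn!r}")
    # The identifier is the last colon-separated field.
    domain, sep, identifier = urn[len(URN_PREFIX):].rpartition(":")
    if not sep or not domain or not identifier:
        raise ValueError(f"missing domain or identifier: {urn!r}")
    return domain, identifier


def ulid_to_bytes(ulid: str) -> bytes:
    """Decode a 26-character ULID string into its 16-byte (128-bit) value."""
    if len(ulid) != 26:
        raise ValueError("a ULID string is 26 characters long")
    value = 0
    for ch in ulid.upper():
        value = value * 32 + CROCKFORD.index(ch)  # ValueError on invalid characters
    if value >= 1 << 128:
        raise ValueError("ULID exceeds 128 bits")
    return value.to_bytes(16, "big")


domain, identifier = parse_kas_urn(
    "urn:opentdf:kas:kas.example.com:01edksqtx9cfzzt1y9sm57h3yq"
)
raw = ulid_to_bytes(identifier)
```

Storing the 16-byte binary form rather than the 26-character text is one way the identifier could stay compact inside a NanoTDF header.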
Resource Locator format: URL query parameter
See https://github.com/opentdf/platform/pull/1190
virtru.com/api/kas?kid=435143a1b5fc8bb70a3aa9b10f6673a8
:
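For comparison, a sketch of how a client or KAS might separate the kid from a locator in the URL query parameter form shown above. It assumes the kid is hex-encoded and that a missing scheme defaults to https; both are assumptions for illustration, not something the linked pull request is known to specify.

```python
from urllib.parse import parse_qs, urlsplit


def split_kas_locator(locator: str, default_scheme: str = "https") -> tuple:
    """Return (base URL without query, kid bytes) from a locator with ?kid=<hex>."""
    if "://" not in locator:
        # Assumption: locators stored without a scheme are treated as https.
        locator = f"{default_scheme}://{locator}"
    parts = urlsplit(locator)
    kids = parse_qs(parts.query).get("kid", [])
    if len(kids) != 1:
        raise ValueError("expected exactly one kid query parameter")
    kid = bytes.fromhex(kids[0])  # ValueError if the value is not valid hex
    base = f"{parts.scheme}://{parts.netloc}{parts.path}"
    return base, kid


base, kid = split_kas_locator("virtru.com/api/kas?kid=435143a1b5fc8bb70a3aa9b10f6673a8")
# The same 16 bytes in the colon-separated style used by the example attribute.
colon_form = ":".join(f"{b:02x}" for b in kid)
```

Note that the host and path still travel inside the TDF in this form, so the infrastructure concerns raised in the discussion apply to it.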
Decision
TBD
References