Open tombentley opened 4 months ago
Is there any inheritance type relationship between defaultPolicy
and topicPolicies[*]policy
? Like does setting the cipherSpec in defaultPolicy impact the topicPolicies? Or is it more that cipherSpec
would have a default value and doesn't need to be set in any particular policy.
Is there any inheritance type relationship between
defaultPolicy
andtopicPolicies[*]policy
? Like does setting the cipherSpec in defaultPolicy impact the topicPolicies? Or is it more thatcipherSpec
would have a default value and doesn't need to be set in any particular policy.
I was thinking that:
defaultPolicy
(inevitable anyway, when we added features requiring config parameters after adopting this schema, in order to guarantee backwards compatibility when reading older config files). As an example, if you don't specify defaultPolicy.aad
then it would be as if you had set defaultPolicy.aad
to that default. I don't think we could have a built-in default for keyEncryptionKey
, at least. topicPolicies
override what's in the defaultPolicy
. So if you don't set something at the topic level and then change the default then you'd get the changed thing for that/those topic(s).I suppose there's the possibility for some confusion, because I've said that the empty policy {}
means "no encryption", whereas
defaultPolicy
were optional then you'd expect it to mean "the built-in defaults".{}
to mean, "the default policy", and unlike the case for defaultPolicy
, {}
ought to be legal at the topic level.So maybe we should make "no encryption" more explicit, by using keyEncryptionKey: null
. When it's null then none of the other properties would be allowed.
Wdyt?
Thanks @tombentley thats quite a full and well thought through proposal.
I have a couple of high(er) level thoughts. It feels like a complicated schema which can apply to all our encryption use cases, but I'm not sure we need or even want one schema which expresses them all! While its appealing from an implementation perspective I'm not sure its helpful to users. They might be better served by having simpler schemas which are more directly connected to each of the filters. I wouldn't have a problem if we were to convert the dedicated schemas into a model like this internally
~In your model how would a user express I want a default encryption policy of X but I only want it applied to certain topics. As the default is applied everywhere not otherwise matched.~ Oh wait you do cover that in point 6.
My other thought is instead of (or maybe as well as) a defaultPolicy key we should offer a map of policy names to policies, and then allow them to be referenced in the topicPolicies
section. That would separate the concerns of policy definition from the topics it applies to and allow groups of topics to share policies. Which would help in the case of a multi tenant deployment where only specific topics would be encrypted but all would have a common prefix.
Hmm that leads to another thought (its not really new per se). There is actually an ordering requirement between recordEncryption and any filter that does topic name manipulation. Say we put the multi tenant filter first we are going to screw up all prefix matching. Does that mean we need another mapping mechanism, such as header to key? (the topic mangling filter could add a original_topic_name
header and the record encryption filter uses that to find the policy).
I can see the appeal of all the various topic matchers and expression language support but I'm a little concerned about the performance costs of all of that. Particularly for large clusters shared between multiple app teams that can self serve topics and policies. Its probably a suck it and see situation but I wanted to give voice to the thought.
Some more detail oriented points.
When there exists a startingWith that's a prefix of another startingWith then the longer (more specific) startingWith matches.
Should we also log a warn in this case as there is the same possibility for confusion as the named
& starts_with
clash?
You've used both includeTopics
and includedTopics
as keys I'm presuming includeTopics
is a typo?
Thanks for responding @SamBarker
They might be better served by having simpler schemas which are more directly connected to each of the filters
I don't quite understand what you mean. Could you elaborate?
My other thought is instead of (or maybe as well as) a defaultPolicy key we should offer a map of policy names to policies, and then allow them to be referenced in the topicPolicies section.
It's not a bad idea, but isn't that what YAML anchors and references are intended to support? There seems to be mixed reports of Jackson's support for this but it might be worth experiementing with it before deciding on this point.
Which would help in the case of a multi tenant deployment where only specific topics would be encrypted but all would have a common prefix.
I think there are different possible models for multi-tenant + encryption:
...
There is actually an ordering requirement between recordEncryption and any filter that does topic name manipulation. Say we put the multi tenant filter first we are going to screw up all prefix matching.
... Given the above I think you mean something like the latter,. To be honest I don't have a good feeling for how often real user requirements would mandate that. If it were common it's harder to say it's towards the RHS of "simple things should be simple, hard things should be possible". But it's a good example to show how the matching approach can break down when what you're trying to match depends on other filters. Hmm.
I can see the appeal of all the various topic matchers and expression language support but I'm a little concerned about the performance costs of all of that. Particularly for large clusters shared between multiple app teams that can self serve topics and policies. Its probably a suck it and see situation but I wanted to give voice to the thought.
Performance is secondary right now, to understanding how a model like this (or any other model) applies in various use cases. And the MT + encryption example is a good one.
When there exists a startingWith that's a prefix of another startingWith then the longer (more specific) startingWith matches.
Should we also log a warn in this case as there is the same possibility for confusion as the named & starts_with clash?
That feels more likely to be intentional to me. It's the mechanism for carving out an exception to a more general rule; it would apply to the defaultPolicy
and any other prefix matcher, for instance.
I'm presuming
includeTopics
is a typo?
Thanks, consistency restored.
They might be better served by having simpler schemas which are more directly connected to each of the filters
I don't quite understand what you mean. Could you elaborate?
What I'm trying to get at is we should think about specific schemas /config models for each application of encryption rather than one model to fit all use cases. Looking at your proposal with fresh eyes its maybe not quite as generically applicable as I thought last night. However to follow through on the thought:
encrypt
node in the policy being expanded to cover fields within a message. Which would absolutely make sense to do for a field level encryption filter but would just be confusing in the recordEncryption filter. keyEncryptionKey: "user-${utf8(record.headers['username'])}"
might be better served by an explicit
keyPerHeader:
header: 'username'
kekId: ''
persistIn: 'jdbc://mongoDb'
I'm not suggesting that is a real object but just that we might want different properties in different contexts. As I say I probably treated it as overly generic previously.
It's not a bad idea, but isn't that what YAML anchors and references are intended to support? There seems to be mixed reports of Jackson's support for this but it might be worth experiementing with it before deciding on this point.
YAML anchors might work well for people hand editing YAML but I don't think they provide a great option for people trying to manage the config through a higher level operator/platform. How does it know which policies need anchors? How does it generate an anchor? Its going to want to provide a set of known policies etc I think a policy map provides simpler way to interact with the config without really costing us in complexity.
To be honest I don't have a good feeling for how often real user requirements would mandate that. If it were common it's harder to say it's towards the RHS of "simple things should be simple, hard things should be possible".
I don't have a good picture either, but a lot of the interest I've seen suggests it might be quite common. As I think the multi tenant filter plays nicely into the platform engineering movement if we can address some of the noisy neighbour issues it could well be quite a common deployment pattern.
What I'm trying to get at is we should think about specific schemas /config models for each application of encryption rather than one model to fit all use cases.
OK, that makes more sense, thanks for clarifying.
The problem I have with that is it's not clear how many or few those use cases really are, and how discrete they are. I suspect they form a spectrum which isn't very amenable to specific schemas. Using your example, what if you want to select a kek based on both the user name (in a header) and something else? I'm sure we could say "people can write their own plugins", but I think the person who wants to configure this might be more ops than dev and might not relish having to whip out an IDE to write some java plugin just to be able to say kekEncryptionKey: ${topicname}-${user}
. So maybe that works as a strategy, or maybe not.
The things I liked about the expression language idea were:
record
and topic
to be used in the expression however the user wants to use them.channel
, principal
).
tenant
variable if there's a MT filter in the chain.On the other hand, I'm not aware of any off the shelf expression language with exactly the features I'd like.
I can easily imagine the encrypt node in the policy being expanded to cover fields within a message. Which would absolutely make sense to do for a field level encryption filter but would just be confusing in the recordEncryption filter.
Yes, I wrote it with that possibility in mind. But let's not get side tracked on that right now.
@tombentley thanks for putting the proposal together. I really like it.
keyEncryptionKey
- should that be keyEncryptionKeyAlias
or simply alias
to adhere to the naming scheme already used by the KMS
What if an expression such as ${utf8(record.headers['username'])}
failed to resolve at runtime? e.g. there was no username field in the header or the username resulted in invalid UTF-8. Presumable we'd fallback to the defaultPolicy or reject the batch of messages? We'd need to think about how to let the user know about the runtime failures.
If we had such flexibility in deriving the keyAlias from the record, it wouldn't be long before there was a feature request for the ability to create a key in the KMS on-the-fly. I'd see it is a separate piece of work, but I wanted to mention it here.
keyEncryptionKey
- should that bekeyEncryptionKeyAlias
or simplyalias
to adhere to the naming scheme already used by the KMS
It's an alias, so that probably deserves to be in the name. And keyEncryptionKeyAlias
is too verbose to make the user repeat. And KEK is not a very pretty acronym. So maybe alias
, or keyAlias
is a good choice.
What if an expression such as
${utf8(record.headers['username'])}
failed to resolve at runtime? e.g. there was no username field in the header or the username resulted in invalid UTF-8. Presumable we'd fallback to the defaultPolicy or reject the batch of messages? We'd need to think about how to let the user know about the runtime failures.
Exceptions are a bit tricky. If the header is missing, then I think we should reject that batch of messages, because it indicates that the expectations of the person configuring the system haven't been met (i.e. it's most likely a configuration error). Invalid record data is slightly different (the producer is producing something unexpected), but I think we should also reject.
If we had such flexibility in deriving the keyAlias from the record, it wouldn't be long before there was a feature request for the ability to create a key in the KMS on-the-fly. I'd see it is a separate piece of work, but I wanted to mention it here.
Absolutely it's a separate piece of work.
@gracegrimwood I'm interested in your PoV on this one too.
Another thing... this proposal uses ${topic.name}
and in doing so establishes a precedent which would allow us to add support for things like record.headers
, record.key
and record.value
later on. But we actually support ${topicName}
currently. We should decide whether we actually want to switch, and allow users to write topic.name
in 0.5.0.
piggybacking a small feature request onto this issue - RecordEncryption
currently hardcodes 5 million encryptions per DEK and it would be nice if that value were configurable - or, failing that, if it were set low enough not to encrypt more data than would weaken the GCM key given a suitably pessimistic average message size.
After reading that last half sentence, I think it'd be better for it to be configurable. I don't personally have a use-case for having different DEK rotation requirements for different levels of sensitive data, so a single value is enough for me.
edit by @k-wall, I think this deserves to be a separate issue. Raised as https://github.com/kroxylicious/kroxylicious/issues/1289
another other thing - given #1226 defines ways to select which filters apply, does there need to be more than one policy per encrypting filter? i.e. once topic selectors are a thing, would it not be better to define multiple encryption filters each with a disjoint set of topics it cares about?
I wanted a place to discuss the vision for how to configure the encryption filter. I'm using this issue.
Here's an annotated example to illustrate something which I think should work and be flexible:
defaultPolicy
applies to any topic that's not matched by an element intopicPolicies
(see 2). It's equivalent to specifying an element intopicPolicies
with an include prefix of "" (because the empty string is a prefix of all strings). Within this object we can specify things which are currently in theEncryptionScheme
record. We usekeyEncryptionKey
to name the key (within the KMS) to the used for encryption. If the default is not to encrypt at all you can specify an empty defaultPolicy,defaultPolicy: {}
. We can decide on sensible defaults of each of the other properties here.defaultPolicy
can be overridden for specific topics.includeTopics
object. The criteria individually select topics. Thepolicy
(see 6) is applied when a topic matches any criterion. It's possible to narrow an inclusion with an exclusion (i.e. abutExcludeTopics
property sibling toincludeTopics
; not shown).policy
object supports exactly the same schema as thedefaultPolicy
. You can saypolicy: {}
to mean "no encryption" in the same way.keyEncryptionKey
is subject to interpolation of expressions within${
and}
.utf8
. In this example, its evaluation might result in using the key encryptionuser-tombentley
. We don't have to agree on the expression language as part of this discussion.jsonpath
.jsonpath
parses its first argument as JSON and then returns the result of evaluating the jsonpath given in its 2nd argument against that JSON structure. This example demonstrates a possibly mechanism for getting KEK aliases from within fields of a record key. In this example, its evaluation might also result in using the key encryptionuser-tombentley
Some properties of this proposal to highlight:
defaultPolicy
and topicpolicy
schemas can evolve over time if we add more features. In particular they cannot collide with future changes to the topic inclusion/exclusion mechanismnamed
orstartingWith
. e.g.named: [foo, foo]
is not allowed.named
must be empty. e.g. you can't have two topic policies both withnamed: foo
.startingWith
must be empty. e.g. you can't have two topic policies both withstartingWith: foo
.named
that's equal to astartingWith
then thenamed
matches. e.g. it is allowed have onenamed: foo
and anotherstartingWith: foo
, but this should result in aWARN
being logged, because of the scope for confusion.startingWith
that's a prefix of anotherstartingWith
then the longer (more specific)startingWith
matches.user_*
topics, it's not proposedtopic
andrecord
. Doing this would allow some optimizations. If the expression doesn't depend on therecord
then there's no need to iterate and decompress the batch.