Closed apelisse closed 11 months ago
cc @thockin @jpbetz
Do we have the list of fields that fall into this "marked as listType=map but not validated as having unique keys" category?
Not really, but that's a great question:
I considered adding a new, separate tag, but that would require a KEP and make things impossible to backport. I think we need to make it clear that this is a work-around and shouldn't be used for any new list (again, only for built-ins so we have a lot more control over that).
There are two distinct failure modes. The one we have usually run into is client-side apply vs. Service.spec.ports, where the keys are looser than validation, and SSA is the SOLUTION (the keys match validation).
The more recent one is where SSA's keys are more strict than validation. So far we've found:
Pod.spec.containers[].ports
Pod.spec.containers[].env
Pod.spec.containers[].envFrom
I admit that I haven't scoured the API for this case, and I honestly don't know how I would in any reasonable amount of time. :( IOW, there are likely to be more.
I did a scan. Here's what I found:
A second pair of eyes on this would be really helpful. I only looked at the validation code. I haven't actually tried all these cases.
I wouldn't be surprised if some people think this will automatically write the validation for them. For most people authoring new resources and using this, it should, tbh. Maybe we should generate it and have an opt-out flag?
> Maybe we should generate and have an opt-out flag?
https://github.com/kubernetes/enhancements/pull/3938 proposes that we automatically validate. An opt-out flag sounds useful.
For these 4 ports cases, should we change listMapKey to 'name' to match what is validated? Can SSA handle a map key change?
> For these 4 ports cases
Which 4?
If you mean:
> Container.Ports - validates unique name, listMapKey=containerPort, protocol
I don't think that's correct: name is unique, but I don't know if that's good enough; name isn't required.
> I don't think that's correct, name is unique but I don't know if that's good enough, name isn't required.
Ah, we don't have a key then, because name is the only field that validation checks when looking for duplicates.
We might need to go with atomic?
Actually, if we're careful we might be able to make it work with name, especially if it's omitempty, we can default it to "". Then as you mentioned, we would never have two keys with the same name.
If that works, by the way, it might be a great way to fix a lot of these problems. For env, we might be able to make a key with name AND value. Then we can only have one such pair; I think that's fine, and we can still merge the list properly.
> If that works by the way, that might be a great way to fix a lot of these problems. For env, we might be able to make a key with name AND value. Then we can only have one such pair, I think that's fine, we can still merge the list properly.
I'd love to see a test suite trying out various operations on a listType=map defined this way. For CEL in CRDs we defined the map key to be the CEL expression and it works just fine...
> I'd love to see a test suite trying out various operations on a listType=map defined this way.
We have a lot of tests in this repo; what do you have in mind? We already have multi-key tests here, so that should work fine. We can reason about (and maybe fix) what happens when we change the keys; I think we had thought about this a while back.
> We can reason and see what happens (and maybe fix) when we change the keys, I think we had thought about this a while back.
Good point.
Maybe open a PR that tries to fix the broken cases and we can manually try various apply requests out?
Env could legitimately contain `A=x{$A}` twice... should we go this route or just mark it as atomic? ObjectMeta.OwnerReferences worries me...
> Actually, if we're careful we might be able to make it work with name, especially if it's omitempty, we can default it to "". Then as you mentioned, we would never have two keys with the same name.
For ContainerPort, the name is optional. If specified, it must be unique, but many ports may be named "".
Ports: Use enough fields to ensure uniqueness in each case? E.g. all the fields of ContainerPort?
Sadly, we allow total duplicates today, as long as hostPort and hostIP are both unspecified. It's meaningless, but allowed.
> PodSpec.TopologySpreadConstraints - validation missing, listMapKey=whenUnsatisfiable
I see topologyKey + whenUnsatisfiable as the keys, but I do find validation code in validateTopologySpreadConstraints().
> ServiceSpec.Ports - same as ContainerPorts (except keys are port, protocol)
This is validated in ValidateService()
```go
// Check for duplicate Ports, considering (protocol,port) pairs
portsPath = specPath.Child("ports")
ports := make(map[core.ServicePort]bool)
for i, port := range service.Spec.Ports {
	portPath := portsPath.Index(i)
	key := core.ServicePort{Protocol: port.Protocol, Port: port.Port}
	_, found := ports[key]
	if found {
		allErrs = append(allErrs, field.Duplicate(portPath, key))
	}
	ports[key] = true
}
```
> PodSpec.ImagePullSecrets - validation missing, patchMergeKey=name
Agree there's no validation - I think this should be atomic
> PodSpec.HostAliases - validation missing, patchMergeKey=ip
Agree there's no validation - I think this should be atomic
> ServiceAccount.Secrets - validation missing, patchMergeKey=name
Agree there's no validation - I think this should be atomic
> ObjectMeta.OwnerReferences - validation missing, patchMergeKey=uid
Agree this is not validated unique, but it is validated non-empty. This one is hard because I can see people wanting to co-own it.
> LabelSelectorRequirement.Key - validation missing, patchMergeKey=key
This field is a string - what does it even mean to have a merge key?
To be clear, most lists should not be atomic. Atomic should be used for things that can't be a mix of multiple things, or that are very tightly coupled, kind of like a number (wouldn't make sense to let different people change different digits), or a string (we should never let different actors change different parts of a string).
We should try really really hard to not make these lists atomic and give them a good way to be updated by multiple actors with different opinions. Here is a litmus test: if two actors (users and/or controllers) can have different opinions of a list that are not mutually exclusive, then the list shouldn't be atomic.
Can two users want different environment variables to be set? Can two users want different sets of ContainerPorts? Can they each have their secret? Can you have multiple ownerReferences, can you have multiple LabelSelector etc... None of them should be atomic.
Actually, atomic is breaking CI/CD systems so much that I'd rather find a work-around to make things not completely break than make them atomic.
> LabelSelectorRequirement.Key - validation missing, patchMergeKey=key
> This field is a string - what does it even mean to have a merge key?
That has to be a bug 😂
For StorageVersion: https://github.com/kubernetes/kubernetes/pull/118520
> That has to be a bug 😂
It is. It got added by https://github.com/kubernetes/kubernetes/pull/44121 but then got missed when the bad merge keys were cleaned up by https://github.com/kubernetes/kubernetes/pull/50257.
Here's a fix: https://github.com/kubernetes/kubernetes/pull/118522
Antoine, I hear what you are saying but look at these fields. I think most of them are exceedingly unlikely to be co-owned, but I admit some might be.
For some, like ContainerPort, we could add all the fields, if only SSA would not barf on it (but could dedup it). I don't know if that would work on any others.
Env, as a pathological example, is just NOT a map. It's not. It is sequential. We would need some sort of append-merge.
~Fix for container and ephemeralContainer ports: https://github.com/kubernetes/kubernetes/pull/118536~
EDIT: the container fix will be more challenging than I originally realized. Identical container ports are allowed unless a name or hostPort is set. Problematic example (which I just verified is accepted by a cluster):
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80
    - containerPort: 80
```
I've verified that all remaining cases (env, ports, imagePullSecrets, hostAliases, serviceAccount.secrets, ownerReferences) allow exact duplicates. This means none of these cases can be fixed solely by changing the merge keys.
This simplifies things in my mind because it means we either have to handle exact duplicates in SSA, or we have to use atomic.
It is interesting that ownerReferences is deprecating duplicates. The warning it returns for duplicates is:
```
Warning: .metadata.ownerReferences contains duplicate entries; API server dedups owner references in 1.20+, and may reject such requests as early as 1.24; please fix your requests; duplicate UID(s) observed: 421f3acc-5bfd-4007-b28a-4b0749c95888
```
ports and serviceAccount.secrets have no warnings. We should add those.
Should we also consider if some fields should have a warning that we will stop allowing exact duplicates in the future?
I am skeptical that we can or should ever reject duplicates, but yes to warnings.
> I am skeptical that we can or should ever reject duplicates, but yes to warnings.
SGTM. When adding the warnings for the fields that do not yet have them, we could also update the warning for ownerReferences to stop saying that we might start rejecting requests in 1.24 (since that was over a year ago anyway).
From the discussion above it is not entirely clear what you will consider valid for container ports on pods. Will duplicate port numbers with different port names still be considered valid? They are useful for aggregating pods into different services for different functions without having to expose a different port (which would require app changes).
That depends on the operation. create and replace will preserve pod ports with the same number but different names. apply will merge them based on port number + protocol (which is how the merge key is defined).
also related to https://github.com/kubernetes/kubernetes/issues/113482
Fixed by #254
Right now, we have some lists that are marked as `listType=map` that are not strictly maps in Kubernetes. In that case, everything goes wrong in smd because the list is invalid (and can't be handled), so it's fully rejected. We can't even track the fields, which causes us to fail and return an error which is ignored by Kubernetes, and the managed fields are entirely reset. Unfortunately, it means that objects that have such a list can't be server-side applied at all. We think it'd be simpler to merely ignore these (fairly rare) occurrences and try to do the best thing possible rather than fail.
This means: