kubernetes-sigs / structured-merge-diff

Test cases and implementation for "server-side apply"
Apache License 2.0

merge: Add tests for merging with lists that have duplicates #244

Closed · apelisse closed this 11 months ago

apelisse commented 1 year ago

Tried to capture some of the discussions we've had in the form of unit tests, to make sure we properly pin down the behavior we want. Let me know if there are use-cases I'm missing.

I have basically found three cases:

  1. Merging with an object that has a list with duplicates, where the change is unrelated to the list
  2. Merging with an object that has a list with duplicates, where the change modifies the list but not a duplicated item
  3. Merging with an object that has a list with duplicates, where the change modifies a duplicated item

I think the first two should unarguably apply the change without de-dupping, since that's clearly what people would expect (their intent is not about the items they haven't specified). The last one is a little more controversial, but since I don't think we're going to allow people to apply duplicates, I think replacing all the duplicates with the applied value is the "saner" option. Happy to have a discussion about it.
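
For concreteness, here's a hypothetical table of those three cases in the spirit of the unit tests this PR adds. The YAML snippets, field names, and values below are made up for illustration, and case 3's expected result simply encodes the "replace all duplicates with the applied value" proposal above; this is a sketch of intent, not this library's code:

```go
package main

import "fmt"

// duplicateCase is a hypothetical test-style row: a live object whose list
// has a duplicated key "a", an applied config, and the expected merge result.
type duplicateCase struct {
	name    string
	live    string
	applied string
	want    string
}

func main() {
	cases := []duplicateCase{
		{
			name:    "1. change unrelated to the list keeps the duplicates",
			live:    `{list: [{key: a, v: 1}, {key: a, v: 2}], other: old}`,
			applied: `{other: new}`,
			want:    `{list: [{key: a, v: 1}, {key: a, v: 2}], other: new}`,
		},
		{
			name:    "2. change to a non-duplicated item keeps the duplicates",
			live:    `{list: [{key: a, v: 1}, {key: a, v: 2}, {key: b, v: 3}]}`,
			applied: `{list: [{key: b, v: 4}]}`,
			want:    `{list: [{key: a, v: 1}, {key: a, v: 2}, {key: b, v: 4}]}`,
		},
		{
			name:    "3. change to the duplicated item replaces all its duplicates",
			live:    `{list: [{key: a, v: 1}, {key: a, v: 2}, {key: b, v: 3}]}`,
			applied: `{list: [{key: a, v: 9}]}`,
			want:    `{list: [{key: a, v: 9}, {key: b, v: 3}]}`,
		},
	}
	for _, c := range cases {
		fmt.Printf("%s\n  live:    %s\n  applied: %s\n  want:    %s\n", c.name, c.live, c.applied, c.want)
	}
}
```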

cc @alexzielenski @thockin

k8s-ci-robot commented 1 year ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: apelisse

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/kubernetes-sigs/structured-merge-diff/blob/master/OWNERS)~~ [apelisse]

Approvers can indicate their approval by writing `/approve` in a comment. Approvers can cancel approval by writing `/approve cancel` in a comment.
jpbetz commented 1 year ago

The last one is a little more controversial, but since i don't think we're going to allow them to apply duplicates, I think removing all the duplicates with the applied value is the "saner" option. Happy to have a discussion about it.

The last one can be further broken down into these cases:

  1. The duplicates are owned by a single field manager
     a. the apply uses the same field manager
     b. the apply uses a different field manager
  2. The duplicates are owned by different field managers
     a. the apply uses one of the field managers
     b. the apply uses a different field manager (I'll treat this the same as 1.b)

(1.a) clearly works with the proposed approach. I don't see that as being at all contentious.

For (1.b), I don't see any better alternative than the proposed solution. This is less clear-cut because it would be possible to append instead of replace, but when I try to reason through what operations are possible, append seems less flexible.
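
To make the (1.b) trade-off concrete, here's a toy Go sketch, not the library's implementation; the types, function names, and values are all made up. The live list has two duplicates of key `a`, all owned by a single manager, and a different applier writes one `{key: a, value: 9}` item; the two functions show the two candidate outcomes:

```go
package main

import "fmt"

// Item stands in for one element of a keyed (associative) list.
type Item struct {
	Key   string
	Value int
}

// replaceDuplicates collapses every live duplicate of the applied key into
// the single applied item (the "replace" option).
func replaceDuplicates(live []Item, applied Item) []Item {
	var out []Item
	done := false
	for _, it := range live {
		if it.Key == applied.Key {
			if !done {
				out = append(out, applied)
				done = true
			}
			continue // drop the other duplicates of this key
		}
		out = append(out, it)
	}
	return out
}

// appendItem keeps the live duplicates untouched and adds the applied item
// at the end (the "append" option).
func appendItem(live []Item, applied Item) []Item {
	return append(append([]Item{}, live...), applied)
}

func main() {
	live := []Item{{"a", 1}, {"a", 2}, {"b", 3}}
	applied := Item{"a", 9}
	fmt.Println(replaceDuplicates(live, applied)) // [{a 9} {b 3}]
	fmt.Println(appendItem(live, applied))        // [{a 1} {a 2} {b 3} {a 9}]
}
```

Replace leaves the applier owning one well-defined item; append preserves the existing duplicates but ends up with three entries for the same key, which is part of why it seems less flexible.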

(2.a) is the one that I'm least confident about, because the applier already owns one of the duplicates but not the other, so clobbering both makes me more uncomfortable. In the case of envs it would probably be safer to only replace the already-owned duplicate, but that wouldn't be congruent with the replacement approach taken for (1.b).

apelisse commented 1 year ago

OK, I guess I missed some context in here, but:

The last one can be further broken down into these cases:

There's only ever going to be one owner for all the duplicated fields (since the managedFields are sets, we can't really have more than one of them). Co-ownership can happen when someone applies the existing value, but I don't know about that. Also note that I'm pretty convinced we should NOT allow people to apply duplicated fields, so we can't really have the horizontal dimension 1.a/2.a (the applier can't be one of the owners: since one can't apply duplicates, a duplicate can't be owned by an applier). So it's hard to reason about your cases.

One thing that is fundamentally related, though, and I'll update the tests for it, is: what if what I apply is already set as one of the duplicated fields? e.g. current is [{key: a, value: 1},{key: a, value: 2}], I apply [{key: a, value: 1}], should we keep [{key: a, value: 2}]? The corollary is: if the object is [{key: a, value: 2}], and I update to [{key: a, value: 1},{key: a, value: 2}], should we keep the previous owners?

For (1.b), I don't see any better alternative than the proposed solution. This is less clean cut because it would be possible to append instead of replace, but when I try to reason though what operations are possible, append seems less flexible.

"Append" is the current behavior so I think we have to keep this. if we wanted to replace, then yeah, we should make this atomic, but again, I don't think that's the right answer here.

apelisse commented 1 year ago

I'm adding cases for what we should have started with ... who owns what after "simple" changes 😅

apelisse commented 1 year ago

Added these tests because basically, we'll have to make two changes:

  1. The "Compare" function, that decides what has changed, and thus who owns what and what conflicts.
  2. The "Merge" function, that decides what the final outcome of applying looks like.

We could probably treat these two discussions as separate to make things simpler; agreeing on the first will almost certainly simplify the second.
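
As a rough picture of that split, here's a purely toy sketch; the `Object`/`FieldSet` types and `compare`/`merge` functions below are invented for illustration (flat maps only) and are not this repository's API:

```go
package main

import (
	"fmt"
	"reflect"
)

// Object and FieldSet are toy stand-ins for typed objects and sets of field
// paths; this whole sketch is hypothetical, not the repo's real types.
type Object map[string]interface{}
type FieldSet map[string]bool

// compare answers "what changed?": which fields were added or modified by the
// applied config, and which live fields it does not mention. This is the kind
// of information that feeds ownership (managedFields) and conflict detection.
func compare(live, applied Object) (added, modified, unmentioned FieldSet) {
	added, modified, unmentioned = FieldSet{}, FieldSet{}, FieldSet{}
	for k, v := range applied {
		if old, ok := live[k]; !ok {
			added[k] = true
		} else if !reflect.DeepEqual(old, v) {
			modified[k] = true
		}
	}
	for k := range live {
		if _, ok := applied[k]; !ok {
			unmentioned[k] = true // whether these get removed depends on ownership
		}
	}
	return added, modified, unmentioned
}

// merge answers "what does the final object look like?": here, applied fields
// simply win over live ones. In the real discussion, this is the step where
// the duplicate-handling policy for keyed lists has to be decided.
func merge(live, applied Object) Object {
	out := Object{}
	for k, v := range live {
		out[k] = v
	}
	for k, v := range applied {
		out[k] = v
	}
	return out
}

func main() {
	live := Object{"replicas": 1, "paused": false}
	applied := Object{"replicas": 3}

	added, modified, unmentioned := compare(live, applied)
	fmt.Println("added:", added, "modified:", modified, "unmentioned:", unmentioned)
	fmt.Println("merged:", merge(live, applied))
}
```

The reason to separate them is that duplicates show up in both places: once in what "Compare" reports (and therefore in who ends up owning the duplicated items, and what conflicts), and once in what "Merge" actually produces.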

jpbetz commented 1 year ago

There's only ever going to be 1 owner for all the duplicated fields

This simplifies things in my mind. Since we don't have multiple owners of the fields, there are fewer cases to consider. I don't know if this makes the outcome better, but it means we really don't have that many options for how to approach this. I'm on board with replacing duplicates when merging for all cases, given this.

apelisse commented 1 year ago

Cool, I think I can make the code work to have these tests pass.

k8s-ci-robot commented 1 year ago

@apelisse: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-structured-merge-diff-test | 8f1b34aaa2561e36157c91e05e36b33a978b3843 | link | true | /test pull-structured-merge-diff-test |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).
apelisse commented 1 year ago

OK, I think I have a pretty exhaustive list of what I think the behavior should be. Some details are still partly unknown; I guess I'll reason about those as I write the code.

k8s-ci-robot commented 11 months ago

PR needs rebase.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
apelisse commented 11 months ago

This one can be closed now; we've added the test.