k8snetworkplumbingwg / kubemacpool

Apache License 2.0
35 stars 33 forks source link

vm-pool, Reject VMs with duplicate interfaces #358

Closed RamLavi closed 2 years ago

RamLavi commented 2 years ago

What this PR does / why we need it: When applying a VM with multiple interfaces with the same name, the VM is rejected but kubemacpool is mishandling the MACs, causing a side effect where that MAC is no longer usable (it is "taken" by the ghost VM). This commit fixes this by checking and rejecting VMs with duplicate interface names on kmp webhook context.

Special notes for your reviewer:

Release note:

Kubemacpool webhook behavior is changed: virtualMachine with duplicate interface names will be rejected from now on.
RamLavi commented 2 years ago

/hold need to check pods module as well

RamLavi commented 2 years ago

/hold cancel pods are not relevant for this issue

maiqueb commented 2 years ago

I wonder about the approach altogether; afaiu, how the mutating webhook handles failures depend on its configuration - specifically, the value of its failurePolicy.

IIUC, this approach will only work if the failure policy is fail; since we own the webhook, that should not be an issue. Never the less, this might deserve a call out in the release notes: this approach is tied to the fail failure policy - which is the mutating webhook's default.

maiqueb commented 2 years ago

I wonder about the approach altogether; afaiu, how the mutating webhook handles failures depend on its configuration - specifically, the value of its failurePolicy.

IIUC, this approach will only work if the failure policy is fail; since we own the webhook, that should not be an issue. Never the less, this might deserve a call out in the release notes: this approach is tied to the fail failure policy - which is the mutating webhook's default.

/hold

The release notes should be updated indicating what is being fixed, and under what conditions - i.e. the MutatingWebhookConfiguration cannot be updated / must feature the fail failure policy.

RamLavi commented 2 years ago

I wonder about the approach altogether; afaiu, how the mutating webhook handles failures depend on its configuration - specifically, the value of its failurePolicy.

IIUC, this approach will only work if the failure policy is fail; since we own the webhook, that should not be an issue. Never the less, this might deserve a call out in the release notes: this approach is tied to the fail failure policy - which is the mutating webhook's default.

Well this webhook implementation has always relied on the fact that the failurePolicy is fail. nothing changed in that regard. Moreover, the check we are adding is already being done, only later, in the webhook validation phase. So, considering what you say, we have 2 options here:

  1. we can fail the VM on our webhook, saying we cannot deal with duplicate interface names, due to our current implementation (this is what we do in this PR)
  2. we can ignore the VM on our webhook (without assigning MACs, thus avoiding the side effect), letting the VM fail in the validation part later on.

IMO I'm not too excited about option 2. I prefer to fail instead of ignoring the issue.. what do you think @maiqueb @qinqon ?

qinqon commented 2 years ago

I wonder about the approach altogether; afaiu, how the mutating webhook handles failures depend on its configuration - specifically, the value of its failurePolicy. IIUC, this approach will only work if the failure policy is fail; since we own the webhook, that should not be an issue. Never the less, this might deserve a call out in the release notes: this approach is tied to the fail failure policy - which is the mutating webhook's default.

Well this webhook implementation has always relied on the fact that the failurePolicy is fail. nothing changed in that regard. Moreover, the check we are adding is already being done, only later, in the webhook validation phase. So, considering what you say, we have 2 options here:

  1. we can fail the VM on our webhook, saying we cannot deal with duplicate interface names, due to our current implementation (this is what we do in this PR)
  2. we can ignore the VM on our webhook (without assigning MACs, thus avoiding the side effect), letting the VM fail in the validation part later on.

IMO I'm not too excited about option 2. I prefer to fail instead of ignoring the issue.. what do you think @maiqueb @qinqon ?

Fail fast is always better.

maiqueb commented 2 years ago

I wonder about the approach altogether; afaiu, how the mutating webhook handles failures depend on its configuration - specifically, the value of its failurePolicy. IIUC, this approach will only work if the failure policy is fail; since we own the webhook, that should not be an issue. Never the less, this might deserve a call out in the release notes: this approach is tied to the fail failure policy - which is the mutating webhook's default.

Well this webhook implementation has always relied on the fact that the failurePolicy is fail. nothing changed in that regard. Moreover, the check we are adding is already being done, only later, in the webhook validation phase. So, considering what you say, we have 2 options here:

Right. Which means the mutating webhook mutates, and the validating webhook validates. So far so good.

  1. we can fail the VM on our webhook, saying we cannot deal with duplicate interface names, due to our current implementation (this is what we do in this PR)

This is making the mutating webhook validate. Which is something that - imo - belongs on the validating webhook. I won't pretend that I am versed enough in webhook design patterns to claim that validating stuff in the mutating webhook is an anti-pattern, but it surely feels semantically weird. That's why I advocated out of band for the reconcile loop to sort this inconsistency.

  1. we can ignore the VM on our webhook (without assigning MACs, thus avoiding the side effect), letting the VM fail in the validation part later on.

And let eventually the reconcile loop correct the state. IMO, this is more aligned with the kubernetes design model.

IMO I'm not too excited about option 2. I prefer to fail instead of ignoring the issue.. what do you think @maiqueb @qinqon ?

Fail fast is always better.

... Now the key thing is (quoting @RamLavi) :

Well this webhook implementation has always relied on the fact that the failurePolicy is fail. nothing changed in that regard.

Right, meaning whatever implicit rules I'm trying to uphold were already broken. No point arguing about them now; your proposed fix seems more in-line with the current design of this component.

Please detail the fix in the release notes - afaiu, you're changing the behavior of the webhook. Furthermore ... shouldn't you also delete this particular validation from the validating webhook ? It will be redundant now.

EDIT: I now realize the validating webhook mentioned above is part of KubeVirt's virt-api - meaning it can be used without kubemacpool. As such, it is not redundant.

kubevirt-bot commented 2 years ago

@maiqueb: changing LGTM is restricted to collaborators

In response to [this](https://github.com/k8snetworkplumbingwg/kubemacpool/pull/358#pullrequestreview-905476206): >Thank you for your patience. > Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
qinqon commented 2 years ago

/lgtm /approve

kubevirt-bot commented 2 years ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: qinqon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/k8snetworkplumbingwg/kubemacpool/blob/main/OWNERS)~~ [qinqon] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
RamLavi commented 2 years ago

/hold cancel