k8snetworkplumbingwg / ovs-cni

Open vSwitch CNI plugin
Apache License 2.0

resources, Update minimum resources request #177

Closed: oshoval closed this 3 years ago

oshoval commented 3 years ago

Update ovs-cni resource requests according to empirical use cases.

See https://bugzilla.redhat.com/show_bug.cgi?id=1935218

Signed-off-by: Or Shoval oshoval@redhat.com
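
In practice the change amounts to a `resources.requests` stanza on the DaemonSet containers; a rough sketch of where it lands is below (container name, image, and exact values here are illustrative, not the literal diff):

```yaml
# Rough sketch of where the request lands in the ovs-cni DaemonSet
# (names, image, and values are illustrative, not the literal diff).
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ovs-cni-amd64
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: ovs-cni
  template:
    metadata:
      labels:
        app: ovs-cni
    spec:
      containers:
        - name: ovs-cni-marker                            # long-running container
          image: quay.io/kubevirt/ovs-cni-marker:latest   # placeholder image/tag
          resources:
            requests:
              cpu: "5m"                                   # empirical values discussed in this PR
              memory: "10Mi"
```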

kubevirt-bot commented 3 years ago

Hi @oshoval. Thanks for your PR.

I'm waiting for a k8snetworkplumbingwg member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
oshoval commented 3 years ago

@RamLavi need ok-to-test please

RamLavi commented 3 years ago

/ok-to-test

oshoval commented 3 years ago

/release-note-none

lane passed

RamLavi commented 3 years ago

/lgtm
/approve

phoracek commented 3 years ago

/hold

oshoval commented 3 years ago

Don't we need this for the init container too?

There is a bug in metrics-server that prevents measuring usage when there are initContainers. I could put the same numbers as the pod, 5m / 10Mi, which is logically fine because the pod won't schedule unless at least that much is available, but it feels redundant for this case. WDYT?

phoracek commented 3 years ago

> Don't we need this for the init container too?
>
> There is a bug in metrics-server that prevents measuring usage when there are initContainers. I could put the same numbers as the pod, 5m / 10Mi, which is logically fine because the pod won't schedule unless at least that much is available, but it feels redundant for this case. WDYT?

The requirement for the init container is static and independent of cluster size; it always just copies the binary. Could you try to figure out the minimum needed by experimenting with limits?
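
Concretely, the experiment could be a temporary `limits` stanza on the copy initContainer, tightened or loosened between runs. A rough sketch, with container names and starting values as assumptions rather than the actual template:

```yaml
# Rough sketch: probe the copy step's minimum by putting temporary limits on the
# initContainer and adjusting them between runs (names and values are assumptions).
spec:
  initContainers:
    - name: ovs-cni-plugin        # only copies the CNI binary into /opt/cni/bin
      resources:
        limits:
          cpu: "5m"               # lower/raise and watch for Init:RunContainerError
          memory: "10Mi"
  containers:
    - name: ovs-cni-marker
      resources:
        requests:
          cpu: "5m"
          memory: "10Mi"
```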

oshoval commented 3 years ago

About the initContainer:

  1. Limit lower than 5m / 10Mi (which is what the regular container uses): doesn't start (Init:RunContainerError):

    Warning  Failed     86s                kubelet            Error: failed to start container "ovs-cni-plugin": Error response from daemon: OCI runtime create failed: container_linux.go:370: starting container process caused: process_linux.go:459: container init caused: read init-p: connection reset by peer: unknown
  2. Limit exactly the same as the regular container, 5m / 10Mi: sometimes fails with Init:RunContainerError but eventually manages to start (not good enough, of course).

  3. Limit 10m / 15Mi seems OK. Also tested another time with only a 10m CPU limit (no memory limit): passed.

What bothers me is that I am not sure we really need to mess with these numbers, because once the transient phase passes, the minimums are already covered by the real container's request.

Which do you prefer?

  1. Not to put requests on the initContainer.
  2. To run it several more times to confirm that a 10m CPU limit (with or without a 10Mi or 15Mi memory limit) is good enough, and then request that on the initContainer. If so, we might also want to put it on the real container; not a must, but it keeps things simple and is safer for both containers. Of course, if we add it to the initContainer, then even though the real container doesn't need it in the long run, it becomes a mandatory condition for scheduling. (Roughly, option 2 would look like the sketch below.)
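
A sketch of what option 2 could look like in the pod spec (container names and exact numbers are placeholders pending the decision):

```yaml
# Rough sketch of option 2: request the same floor on both containers
# (names and values are placeholders, not the final numbers).
spec:
  initContainers:
    - name: ovs-cni-plugin
      resources:
        requests:
          cpu: "10m"
          memory: "10Mi"          # or 15Mi, per the limit experiments above
  containers:
    - name: ovs-cni-marker
      resources:
        requests:
          cpu: "10m"              # raised from 5m so both containers stay consistent
          memory: "10Mi"
```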
phoracek commented 3 years ago

If we need 10m for the init container, but the only thing we set is 5m on the regular container, this Pod will never be able to start on a fully utilized host, no?

Can we then set 10m on the regular container and leave the init container empty, if you believe setting it is problematic?

oshoval commented 3 years ago

> If we need 10m for the init container, but the only thing we set is 5m on the regular container, this Pod will never be able to start on a fully utilized host, no?

IIUC, on a fully utilized host setting the initContainer to 10m is not our savior: as we saw in scenario 2, given a chance with less it will recover, whereas if we enforce a 10m minimum that host won't even give it a chance.

> Can we then set 10m on the regular container and leave the init container empty, if you believe setting it is problematic?

I will set it to 10m on both. It's not that problematic, just maybe redundant; 10m / 10Mi on both gives us more safety and it's not that much.

Thanks

phoracek commented 3 years ago

Let's not depend on the recovery; we should ask for enough resources to run comfortably. Let's do 10/10?

oshoval commented 3 years ago

> Let's not depend on the recovery; we should ask for enough resources to run comfortably. Let's do 10/10?

yep on both, on it

oshoval commented 3 years ago

BTW, since multus and linux-bridge are also binary copiers and also have 5m, I will change them as well to at least 10m / 10Mi, right?

We can consider that for the macvtap initContainer as well.

oshoval commented 3 years ago

Added requests to the initContainer and aligned the regular container to 10m / 10Mi.

oshoval commented 3 years ago

About the failure:

+ ./cluster/kubectl.sh -n kube-system wait --for=condition=ready -l name=multus pod --timeout=300s
error: no matching resources found

multus is not related to this PR.

oshoval commented 3 years ago

Changes: I managed to break initialization with a 10Mi memory limit, so I extended the initContainer to 15Mi (of course a memory limit is not the same as a CPU limit: one kills and the other throttles, but still).
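
With that, the resulting stanzas would look roughly like this (a sketch; container names are assumed from the DaemonSet template, values per this thread):

```yaml
# Rough sketch of the resulting requests (container names assumed from the
# DaemonSet template; values as discussed in this thread).
spec:
  initContainers:
    - name: ovs-cni-plugin
      resources:
        requests:
          cpu: "10m"
          memory: "15Mi"          # copy step broke under a 10Mi memory limit in testing
  containers:
    - name: ovs-cni-marker
      resources:
        requests:
          cpu: "10m"
          memory: "10Mi"
```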

phoracek commented 3 years ago

/lgtm
/approve

kubevirt-bot commented 3 years ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: oshoval, phoracek, RamLavi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/k8snetworkplumbingwg/ovs-cni/blob/main/OWNERS)~~ [phoracek]

Approvers can indicate their approval by writing `/approve` in a comment. Approvers can cancel approval by writing `/approve cancel` in a comment.
oshoval commented 3 years ago

Thanks, please remove the hold when you are fine with it.

phoracek commented 3 years ago

/hold cancel

oshoval commented 3 years ago

/cherry-pick release-0.19

kubevirt-bot commented 3 years ago

@oshoval: only k8snetworkplumbingwg org members may request cherry picks. You can still do the cherry-pick manually.

In response to [this](https://github.com/k8snetworkplumbingwg/ovs-cni/pull/177#issuecomment-879135485):

> /cherry-pick release-0.19

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.