Open pacoxu opened 3 years ago
/cc @ehashman
Thanks for logging the issue.
I think it makes sense to remove the preflight check in the release when the feature goes beta. Checking the kubelet args / config for the FG is doable but a bit messy.
Actually, since we support kubelet n-1 skew, it should probably be done one release after beta.
It makes sense. Hence, if it is beta in 1.23, kubeadm may add the support in 1.24+.
For users like me who want to try the alpha feature, is the swap-off preflight check too harsh? The workaround in 1.22 is to add the ignore flag.
At least, the check should be removed in 1.23 when it's beta, in my opinion.
Or we may change the check error to a warning message?
Ok, in 1.23 we can switch it to warning. Remove it in 1.24.
looks like this is shifted to Beta for 1.24 due to some failures in CI and missing support in runtimes: https://docs.google.com/document/d/1Ne57gvidMEWXR70OxxnRkYquAoMpt56o75oZtg-OeBg/edit# (see notes for 26 Oct)
😓
However, changing `SwapOn` to be a warning rather than an error is still valid.
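To make the error-versus-warning distinction concrete, here is a minimal sketch of a swap check. It assumes swap state is read from `/proc/swaps` (one header line, then one line per active swap area); the function name `swap_check` and its `warn`/`error` mode argument are hypothetical and this is not kubeadm's code.

```shell
# Hypothetical helper, not kubeadm code: report whether swap is active.
# Usage: swap_check [warn|error] [swaps-file]
swap_check() {
  local mode="${1:-warn}" swaps_file="${2:-/proc/swaps}" entries
  # Skip the header line and count the remaining non-empty lines.
  entries=$(tail -n +2 "$swaps_file" 2>/dev/null | grep -c . || true)
  if [ "$entries" -eq 0 ]; then
    echo "ok: swap is off"
    return 0
  fi
  if [ "$mode" = "error" ]; then
    # Error mode: the caller should abort.
    echo "error: swap is enabled"
    return 1
  fi
  # Warning mode: report the condition but let the caller continue.
  echo "warning: swap is enabled"
  return 0
}
```

While the check is still an error, the ignore flag mentioned above is `kubeadm init --ignore-preflight-errors=Swap`.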
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- lifecycle/stale is applied
- after lifecycle/stale was applied, lifecycle/rotten is applied
- after lifecycle/rotten was applied, the issue is closed
You can:
- /remove-lifecycle stale
- /lifecycle rotten
- /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
kep seems tracked for beta in 1.24: https://github.com/kubernetes/enhancements/issues/2400
update: looks like it was dropped from 1.24: https://github.com/kubernetes/enhancements/issues/2400#issuecomment-1068228077
Most PRs were ready early in the v1.24 cycle. However, the e2e tests passed too late for v1.24, and some related PRs are still in review. Hopefully it can be beta in v1.25.
No update in v1.25 for the swap feature, as Elana is out of office.
Sergey will take over the swap feature in later releases. No update in v1.26 so far.
Sergey added it to the v1.27 plan, and I will work on the cgroup v2 swap support part.
/lifecycle stale
/remove-lifecycle stale
https://github.com/kubernetes/kubernetes/pull/118764/ is promoting swap to beta in v1.28 which is in review now.
https://github.com/kubernetes/kubernetes/pull/118764 is merged. Swap is beta now. However, `--fail-swap-on=false` (or `failSwapOn: false` in the kubelet configuration) and `swapBehavior` still have to be set manually.
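As an illustration of the manual settings, the sketch below writes a kubeadm config file that carries them. The field names (`failSwapOn`, `featureGates.NodeSwap`, `memorySwap.swapBehavior`) come from the `kubelet.config.k8s.io/v1beta1` API; the helper name `write_swap_kubeadm_config` and the default file name are hypothetical, and the `kubeadm.k8s.io/v1beta3` API version is an assumption that depends on the kubeadm release in use.

```shell
# Hypothetical helper: write a kubeadm config that lets the kubelet start on a
# swap-enabled node and gives workloads limited swap access.
write_swap_kubeadm_config() {
  local out="${1:-kubeadm-swap.yaml}"
  cat > "$out" <<'EOF'
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Do not refuse to start when swap is enabled on the node.
failSwapOn: false
featureGates:
  NodeSwap: true
memorySwap:
  swapBehavior: LimitedSwap
EOF
}
```

The resulting file would then be passed as `kubeadm init --config kubeadm-swap.yaml`.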
Swap Feature is beta1 in v1.28.
- Should kubeadm add a feature gate (like `DualStack`) to enable swap?
- `failSwapOn` and `swapBehavior` should be set if the FG is set to true in `kubeadm init`.
I may submit a PR in 1.29 release cycle.
https://github.com/kubernetes/kubeadm/issues/2563#issuecomment-915184761
we can drop our preflight check in 1.28. we support n-1 kubelet but hopefully the user manages this specific skew/setup if they use the older kubelet. i don't think we should add a kubeadm FG for this.
> kubernetes/kubernetes#118764 is merged. Swap is beta now. But `--fail-swap-on=false` or `failSwapOn` and `swapBehavior` should be set manually.
let's leave this manual. once the feature is GA we may need to update kubeadm docs to not mention these options, unless swap off is still recommended by default. the options will be no-op. IIUC
> we can drop our preflight check in 1.28. we support n-1 kubelet but hopefully the user manages this specific skew/setup if they use the older kubelet.
Do you mean we should drop the warning in v1.29?
> let's leave this manual. once the feature is GA we may need to update kubeadm docs to not mention these options, unless swap off is still recommended by default.
If so, todo items are dropping the warning and documenting it.
>> we can drop our preflight check in 1.28. we support n-1 kubelet but hopefully the user manages this specific skew/setup if they use the older kubelet.
> Do you mean we should drop the warning in v1.29?
i may be missing context. to my understanding it's beta in 1.28. the FG is on by default but users must still manually apply the failSwapOn=false? we can drop the warning preflight check in 1.28 if we are sure the feature will become GA... or better we can wait until .29 or later until it graduates.
what is your recommendation?
>> let's leave this manual. once the feature is GA we may need to update kubeadm docs to not mention these options, unless swap off is still recommended by default.
> If so, todo items are dropping the warning and documenting it.
we can remove the warning and clear our docs in terms of swap, but maybe keep the recommendation. for example, "swap is supported, but better keep it off".
It is still beta1, and we still have some tasks left to make it Beta and then GA. So we may remove the warning in 1.29 or even later.
`NodeSwap` is in the beta1 stage, which means it is beta but still false by default in v1.28. @iholder101 will work on a blog post about it.
> we can remove the warning and clear our docs in terms of swap, but maybe keep the recommendation. for example, "swap is supported, but better keep it off".
Please do make that change; v1.28 has already been released
We could also clarify that kubeadm doesn't yet support swap even though it's supported for manually-provisioned Linux nodes as beta.
Let me check next week. I will update it.
We have mentioned the steps to enable Swap with kubeadm in the blog: https://kubernetes.io/blog/2023/08/24/swap-linux-beta/#set-up-a-kubernetes-cluster-that-uses-swap-enabled-nodes.
Note that NodeSwap is supported for cgroup v2 only. For Kubernetes v1.28, using swap along with cgroup v1 is no longer supported.
I opened https://github.com/kubernetes/kubernetes/pull/120198 to update the warning. Since this is still disabled by default, I prefer to keep the warning.
Any thoughts on changing https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/ ? Currently it says that you MUST disable swap.
I opened https://github.com/kubernetes/website/pull/42820 to explain more about swap configurations.
/lifecycle stale
/remove-lifecycle stale
i see beta for 1.30: https://github.com/kubernetes/enhancements/issues/2400#issuecomment-1912804417
v1.28: swap is supported for cgroup v2 only
if swap is supported for cgroup v2, why does kubeadm init/join fail with obscure errors on debian 12 bookworm where cgroupv2 is active/enabled in containerd configuration ( SystemdCgroup = true ) ?
i only got a warning because of swap and i never thought that something would fail miserably on init because of swap being active (had disabled it in /etc/fstab, but didn't run `swapoff -a`)
did cost me some hours today.
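The two steps mentioned above do different things: editing `/etc/fstab` only stops swap from being activated at the next boot, while `swapoff -a` turns off swap that is already active. A minimal sketch of the fstab half follows; the helper name `comment_out_swap_entries` is hypothetical, and it only prints the edited table rather than changing the file.

```shell
# Hypothetical helper: print an fstab with every active swap entry commented out.
# An entry counts as swap when its third field (filesystem type) is "swap".
comment_out_swap_entries() {
  local fstab="${1:-/etc/fstab}"
  awk '$1 !~ /^#/ && $3 == "swap" { print "#" $0; next } { print }' "$fstab"
}

# Example (needs root, not run here):
#   comment_out_swap_entries /etc/fstab > /tmp/fstab.new   # review, then install
#   swapoff -a                                              # turn off active swap now
```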
the kubeadm setup docs mention swap and the new feature gate: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
to your comment in the other ticket:
i'm 25yrs+ into linux/unix and i have NEVER seen something fail because swap being active, so this cost me a while to find out that this has blocked kubernetes installation/configuration
you can direct your annoyance to sig node, which is the group that maintains the kubelet component, where the feature has been missing since k8s epoch.
from the kubeadm docs:
the NodeSwap feature gate of the kubelet is beta but disabled by default.
that's not a good sign - basically an indication that the feature is not very stable yet (normally k8s beta features are on by default), thus turning swap off is our (kubeadm) recommendation.
thanks for the pointer. i'm fine when this is documented / mentioned somewhere in the docs and i'm also totally fine that swap needs to be disabled.
but it would be absolutely helpful especially for newbies, when pre-flight check would give a better hint.
the existing warning gives a false impression/advice, imho. you may assume that it's not that harmful in non-production/test envs. and as said in the other ticket, i cannot remember that i have seen something fail to setup/init because swap was enabled. i have only seen the opposite, i.e. some installer complained that swap needs to be enabled, regardless if needed at installation time or not.
i would NEVER have expected that an active swap would be installation/initialization blocker. i bet that 99 out of 100 linux admins also would not expect this, too.
" [WARNING Swap]: swap is enabled; production deployments should disable swap unless testing the NodeSwap feature gate of the kubelet"
FWIW, this ticket here is tracking the removal of the kubeadm preflight warning when NodeSwap becomes enabled by default. leaving the decision to @pacoxu whether the kubeadm warning should be updated for 1.30 and yes the wording can always be better, but if 1.30 enables the feature by default we are removing the preflight check entirely, from my understanding.
What we could do is return an error if the node uses cgroup v1 and has swap on. Would that be less ambiguous?
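The proposed rule can be sketched as a small decision function. This is only an illustration of the idea in this comment, not something kubeadm implements; the function name `swap_preflight_result` and its arguments are hypothetical.

```shell
# Hypothetical sketch of the proposed rule; not what kubeadm implements.
# Usage: swap_preflight_result <cgroup-version: v1|v2> <swap-on: true|false>
swap_preflight_result() {
  local cgroup="$1" swap_on="$2"
  if [ "$swap_on" != "true" ]; then
    echo "ok"
  elif [ "$cgroup" = "v1" ]; then
    # Swap with cgroup v1 is the unsupported combination, so fail hard.
    echo "error: swap is enabled on a cgroup v1 node"
  else
    echo "warning: swap is enabled"
  fi
}
```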
yes, certainly.
but my system which cannot init or join when swap is active is on cgroup v2 (if i see this correctly) and i have also configured containerd appropriately
if i re-enable swap, drain the cluster node, reboot it and re-join, it reproducibly hangs at the stage "[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap..."
root@kube3:~# cat /etc/containerd/config.toml|grep SystemdCgroup
SystemdCgroup = true
root@kube3:~# stat -fc %T /sys/fs/cgroup/
cgroup2fs
# uname -a
Linux kube3 6.1.0-17-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.69-1 (2023-12-30) x86_64 GNU/Linux
# cat /etc/debian_version
12.4
# mount|grep cg
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime)
## /usr/sbin/execsnoop-bpfcc -n runc
PCOMM PID PPID RET ARGS
runc 2320 2301 0 /usr/sbin/runc --root /run/containerd/runc/k8s.io --log /run/containerd/io.containerd.runtime.v2.task/k8s.io/959e05c2381225e3672196925b13ddb58531b0f58f6ade9c727fb96066e711a0/log.json --log-format json --systemd-cgroup create --bundle /run/containerd/io.containerd.runtime.v2.task/k8s.io/959e05c2381225e3672196925b13ddb58531b0f58f6ade9c727fb96066e711a0 --pid-file /run/containerd/io.containerd.runtime.v2.task/k8s.io/959e05c2381225e3672196925b13ddb58531b0f58f6ade9c727fb96066e711a0/init.pid 959e05c2381225e3672196925b13ddb58531b0f58f6ade9c727fb96066e711a0
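The `stat -fc %T /sys/fs/cgroup/` check in the output above can be wrapped in a small helper. This sketch assumes the usual mapping (`cgroup2fs` for cgroup v2, `tmpfs` for cgroup v1); the function name `cgroup_version` is hypothetical.

```shell
# Hypothetical helper: map the filesystem type of /sys/fs/cgroup to a cgroup version.
# An explicit type can be passed as $1; otherwise it is read from the running system.
cgroup_version() {
  local fstype="${1:-$(stat -fc %T /sys/fs/cgroup/ 2>/dev/null)}"
  case "$fstype" in
    cgroup2fs) echo "v2" ;;
    tmpfs)     echo "v1" ;;
    *)         echo "unknown" ;;
  esac
}
```

For the node shown above, which reports `cgroup2fs`, this would print `v2`.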
Hey all!
Some clarifications regarding the current status of NodeSwap in k8s:
- The `NodeSwap` feature gate is on by default. However:
  - `--fail-swap-on=false` still needs to be provided to kubelet.
  - The default "SwapBehavior" is `NoSwap`, which means containers do not have swap access.
IOW: in order to run k8s on a swap-enabled node there's a need to provide `--fail-swap-on=false`.
In order to actually give swap access to containers, the SwapBehavior needs to be set to `LimitedSwap` (which is currently the only swap behavior supported other than `NoSwap`).
Regarding cgroups: Only cgroup v2 is supported for swap. cgroup v1 can be used with NoSwap, which explicitly sets swap limit as 0 at the cgroup level, but cannot be used with LimitedSwap (see https://github.com/kubernetes/kubernetes/pull/123738).
IMO it's safe to remove the error and not even replace it with a warning, since to actually use swap the admin would need to explicitly change the swap behavior, even if `--fail-swap-on=false` is provided to kubelet.
Please let me know if I can provide more information regarding this.
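The combinations described in this comment can be summarized as a small decision function. This is a sketch of the behavior as stated here, not kubelet code; the function name `kubelet_swap_outcome`, its argument order, and the output strings are hypothetical.

```shell
# Hypothetical summary of the behavior described above; not kubelet code.
# Usage: kubelet_swap_outcome <swap-on> [failSwapOn] [swapBehavior] [cgroup-version]
kubelet_swap_outcome() {
  local swap_on="$1" fail_swap_on="${2:-true}" behavior="${3:-NoSwap}" cgroup="${4:-v2}"
  if [ "$swap_on" != "true" ]; then
    echo "starts: node has no swap"
    return 0
  fi
  if [ "$fail_swap_on" = "true" ]; then
    echo "fails: swap is on and failSwapOn is true"
    return 0
  fi
  case "$behavior" in
    NoSwap)
      echo "starts: workloads cannot use swap" ;;
    LimitedSwap)
      if [ "$cgroup" = "v2" ]; then
        echo "starts: workloads may use limited swap"
      else
        echo "unsupported: LimitedSwap requires cgroup v2"
      fi ;;
    *)
      echo "unknown swapBehavior: $behavior" ;;
  esac
}
```

With the defaults (`failSwapOn` true, `NoSwap`), a swap-enabled node falls into the "fails" branch, which is the case the kubeadm preflight warning is meant to flag.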
> IOW: in order to run k8s on a swap-enabled node there's a need to provide `--fail-swap-on=false`.
to avoid further complaints from kubeadm users and additional logged tickets, i think we should keep the preflight check until the kubelet config is updated to not fail on swap by default.
is there a plan for that?
/lifecycle stale
/lifecycle rotten
>> IOW: in order to run k8s on a swap-enabled node there's a need to provide `--fail-swap-on=false`.
> to avoid further complaints from kubeadm users and additional logged tickets, i think we should keep the preflight check until the kubelet config is updated to not fail on swap by default.
> is there a plan for that?
Hey @neolit123! As written here, the summary is:
- The `NodeSwap` feature gate is on by default. However:
  - `--fail-swap-on=false` still needs to be provided to kubelet.
  - The default "SwapBehavior" is `NoSwap`, which means containers do not have swap access.
So `--fail-swap-on=false` is still necessary (and that's not going to change until swap GAs), but the default behavior is `NoSwap`, which means swap is inaccessible for k8s workloads by default.
Can we make sure that the installation docs here https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/ are changed so that swap doesn't have to be turned off with kubeadm?
@iholder101 would you be able to help us by PR-ing the docs?
Yeah I'd love to. I'll get to it shortly.
BTW, is there anything else required besides changing the docs?
On second look, I see that @pacoxu already updated the docs here: https://github.com/kubernetes/website/pull/42820.
@neolit123 @pacoxu So, is there anything missing?
leaving this to @pacoxu to answer. the state of the noswap FG is still confusing to me, so hope we are clear in the docs and the preflight checks about it.
My update to the website was too general at that time.
We should probably make it clearer how to enable swap and use it on the kubelet side.
In Beta2, the NodeSwap feature gate is on by default. However:
- `--fail-swap-on=false` still needs to be provided to kubelet.
- The default "SwapBehavior" is NoSwap, which means containers do not have swap access.
This should be mentioned, or we can link to a page that explains the kubelet's swap configuration, including `failSwapOn` and `swapBehavior`, and even the system-reserved support.
@iholder101 @pacoxu should https://github.com/kubernetes/website/pull/47710 have closed this k/kubeadm issue, or do we need to keep it open for longer?
/reopen
IIUC, we still need to remove the current preflight check warning in the future.
@pacoxu: Reopened this issue.
Is this a BUG REPORT or FEATURE REQUEST?
FEATURE REQUEST
/kind feature
Versions
kubeadm version (use `kubeadm version`): NodeSwap is alpha in 1.22 and will be beta1 in 1.28 (still disabled by default).
What happened?
I tested NodeSwap on my nodes, and when I reinstalled my environment, I got an error related to swap.
I think it's time to start planning for Swap enabling support on the kubeadm side.
What you expected to happen?
There should be NodeSwap support in `kubeadm init`, and we can skip the check if the feature gate is enabled. Or in 1.23, we should skip the preflight check by default, as it will be beta.
How to reproduce it (as minimally and precisely as possible)?
Enable swap (`swapon`) and run `kubeadm init`.
Anything else we need to know?
More details in https://github.com/kubernetes/enhancements/issues/2400
/assign