[Closed] dle-hpe closed this issue 2 months ago
I'm pretty sure if you try to change topology managers, you have to wipe the kubelet state? (@TimJones, is that true?)
So it might not work on the fly, but the easiest clean test is to have those kubelet extraArgs set on the initial machine creation.
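For illustration, a minimal sketch of what that might look like in the machine config (the specific flags and values below are assumptions based on the NUMA setup described in this issue, not a recommendation):

```yaml
# Illustrative machine config fragment: NUMA-related kubelet flags set at initial machine creation.
machine:
  kubelet:
    extraArgs:
      cpu-manager-policy: static
      memory-manager-policy: Static
      topology-manager-policy: restricted
      topology-manager-scope: pod
      # the static memory manager needs reserved memory that matches the kube/system reservations
      reserved-memory: "0:memory=1Gi"
      kube-reserved: "cpu=500m,memory=1Gi"
```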
I believe that's the case.
Talos handles some of that, but not all: it tracks the kubelet configuration (extraConfig in machine configuration terms), not extraArgs.
The CPU manager should be handled, but I'm not sure about the other managers.
I mean, changes to the topology managers are handled by wiping the kubelet state, but kubelet nevertheless doesn't recommend changing them on the fly, as it won't correctly affect pods that are already running.
We do not change the topology configs on the fly; it's a wipe and redeploy.
Do you mean adding both the CPUManager feature flag and CPUManager policy configs to extraConfig?
This should not make a difference, but in general kubelet deprecates flags and prefers config.
But getting back to the issue, what makes you think this is a Talos Linux bug vs. the kubelet bug/misconfiguration?
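On the flags vs. config point, for illustration, roughly the same settings expressed through extraConfig would look something like this (standard KubeletConfiguration field names; the values are again only examples):

```yaml
# Illustrative: the same NUMA-related settings as kubelet configuration instead of CLI flags.
machine:
  kubelet:
    extraConfig:
      cpuManagerPolicy: static
      memoryManagerPolicy: Static
      topologyManagerPolicy: restricted
      topologyManagerScope: pod
      reservedMemory:
        - numaNode: 0
          limits:
            memory: 1Gi
```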
Kubelet sees the available resources, matches the cpu and memory requests, and tries to reserve the resources. When the resources get reserved is when the error Resources cannot be allocated with Topology locality comes up.
This seems to be a pretty "normal" error documented in Kubernetes docs.
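For context, with the restricted policy the kubelet only admits a pod if its exclusive CPU and memory allocations can be satisfied within the preferred topology hint (e.g. a single NUMA node), so the typical case that exercises it is a Guaranteed-QoS pod with integer CPU requests, something like (values purely illustrative):

```yaml
# Illustrative Guaranteed-QoS pod: integer CPUs and equal requests/limits, so the
# static CPU manager and memory manager try to give it exclusive, NUMA-aligned resources.
apiVersion: v1
kind: Pod
metadata:
  name: numa-test
spec:
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9
      resources:
        requests:
          cpu: "8"
          memory: 16Gi
        limits:
          cpu: "8"
          memory: 16Gi
```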
We will remove Talos from the stack and use Ubuntu. Replicate the NUMA aware deployment.
Should I close this issue?
No, let's keep the issue, but my point is we can't really help just with the issue itself, as it's not fully reproducible (it depends on the hardware and workloads), but you can use the steps described in the Kubernetes docs above to troubleshoot it a bit further down the stack.
The issue, I think, is still valid for supporting changes to the other topology managers' state without wiping the node completely.
So whether something does or does not work on Ubuntu might give us some ideas, but it would probably be easier to figure out why, in your case, kubelet can't satisfy the constraints you're specifying.
E.g. you could do talosctl read /var/lib/kubelet/memory_manager_state to see the memory state.
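As a sketch of that kind of troubleshooting (the cpu_manager_state path is an assumption based on the standard kubelet layout, and the node addresses are placeholders):

```bash
# Inspect the kubelet resource-manager checkpoints on the node via the Talos API
talosctl -n <node-ip> read /var/lib/kubelet/memory_manager_state
talosctl -n <node-ip> read /var/lib/kubelet/cpu_manager_state

# Cross-check what the node reports as allocatable
kubectl describe node <node-name> | grep -A 7 Allocatable
```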
Ok cool. Will keep the issue open. We will remove Talos and use Ubuntu. Replicate everything else. Report back.
In regards to changing the topology manager settings: the host gets wiped because there is a state file with a checksum. Any topology change invalidates the checksum and kubelet will not start.
That's exactly what we have a workaround for, specifically for the CPU manager (but not the other managers).
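For reference, the CPU manager checkpoint at /var/lib/kubelet/cpu_manager_state is a small JSON file roughly of this shape (standard kubelet field names, values illustrative); changing the policy without clearing it is what trips the checksum check on kubelet start:

```json
{
  "policyName": "static",
  "defaultCpuSet": "0-7,16-23",
  "checksum": 1122334455
}
```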
@dle-hpe any updates on this one?
Still working on getting a test environment setup.
Spent time to carefully verify in a few different scenarios and can confirm the limitation is being hit at the KubeVirt layer: emulating NUMA domains is causing issues with our workloads.
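For context, assuming the workloads are KubeVirt VMs, guest NUMA alignment is usually requested along these lines (a sketch only; it needs dedicated CPUs and hugepages, and the exact spec depends on the KubeVirt version):

```yaml
# Illustrative KubeVirt VMI fragment asking for NUMA topology passthrough to the guest
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  name: numa-vmi
spec:
  domain:
    cpu:
      cores: 8
      dedicatedCpuPlacement: true
      numa:
        guestMappingPassthrough: {}
    memory:
      hugepages:
        pageSize: 2Mi
    resources:
      requests:
        memory: 16Gi
```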
Bug Report
NUMA not fully functional when running NUMA aware workloads
Description
When enabling the following kubelet feature flags: CPUManager, MemoryManager and TopologyManager, setting topology-manager-policy: restricted causes the deploy to fail with this error: Resources cannot be allocated with Topology locality.
If we change to topology-manager-policy: best-effort, the deploy works and NUMA appears enabled, but NUMA functionality does not work and blocks us from getting RDMA working with our GPUs and MLNX cards. We expected to see some PIX instead of all PHB connections.
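For reference, the PIX/PHB observation comes from the GPU interconnect topology matrix, which can be checked with (assuming the NVIDIA driver tooling is available where the GPUs and NICs are visible):

```bash
# Print the GPU/NIC interconnect matrix; in nvidia-smi's legend,
# PIX = connection traversing at most a single PCIe bridge,
# PHB = connection traversing a PCIe Host Bridge (typically crossing through the CPU).
nvidia-smi topo -m
```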
Logs
support.zip
Environment
machineconfig with the install and kubelet details.
kubectl version --short
x86_64