Open lyarwood opened 7 months ago
Just for context, guestMappingPassthrough doesn't check for this case either:
./cluster-up/kubectl.sh apply -f - <<EOF
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  name: dedicated-threads-with-numa
spec:
  domain:
    cpu:
      threads: 2
      dedicatedCpuPlacement: true
      numa:
        guestMappingPassthrough: {}
    memory:
      guest: 1Gi
      hugepages:
        pageSize: "2Mi"
    devices:
      disks:
      - disk:
          bus: virtio
        name: containerdisk
      - disk:
          bus: virtio
        name: cloudinitdisk
    resources:
      requests:
        memory: 2Gi
  volumes:
  - containerDisk:
      image: quay.io/containerdisks/fedora:39
    name: containerdisk
  - cloudInitNoCloud:
      userData: |
        #!/bin/sh
        mkdir -p /home/fedora/.ssh
        curl https://github.com/lyarwood.keys > /home/fedora/.ssh/authorized_keys
        chown -R fedora: /home/fedora/.ssh
    name: cloudinitdisk
EOF
[..]
$ ./cluster-up/kubectl.sh get vmis
NAME                          AGE   PHASE     IP               NODENAME   READY
dedicated-threads-with-numa   12s   Running   10.244.196.149   node01     True
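For completeness, the host side can be confirmed straight from sysfs. This is only a minimal sketch, assuming the standard Linux sysfs topology paths and that you have a shell on node01 (how you reach the node, e.g. via the kubevirtci helpers, depends on your environment):

# 1 = SMT enabled, 0 = SMT disabled on this node.
cat /sys/devices/system/cpu/smt/active
# With SMT disabled each CPU lists only itself here, i.e. there are no sibling
# pCPUs for the guest's two threads to share a core on.
cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list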
Thanks @lyarwood.
Indeed, this is open to interpretation. So far, the general intent was to provide dedicated CPUs to the guest; however, the topology assignment was on a best-effort basis. The only exception was guestMappingPassthrough, where we are strict about NUMA node assignment.
We can of course follow up and improve this behavior by further enhancing the dedicated CPUs API. I would start by reporting whether SMT is enabled on the nodes.
ACK, thanks for confirming this is a valid thing to fix. It should be easy enough to label a node given the value in /sys/devices/system/cpu/smt/active and to use that label when scheduling later if threads is greater than 1. I'll try to find some time to work on this in the coming weeks.
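Roughly what I have in mind, as a sketch only (the example.com/smt-enabled label key is a placeholder rather than an existing KubeVirt label, and the scheduling side is hand-waved):

# Illustrative sketch only; example.com/smt-enabled is a hypothetical label key.
# 1) On the node: read the SMT state (1 = enabled, 0 = disabled).
SMT=$(cat /sys/devices/system/cpu/smt/active)
# 2) From a machine with cluster access: publish it as a node label.
./cluster-up/kubectl.sh label node node01 example.com/smt-enabled="${SMT}" --overwrite
# 3) virt-controller could then add a matching nodeSelector to the virt-launcher pod
#    whenever spec.domain.cpu.threads > 1, e.g.:
#      nodeSelector:
#        example.com/smt-enabled: "1"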
/assign
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
/lifecycle stale
/remove-lifecycle stale
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
/lifecycle rotten
/cc @vladikr
What happened:
This is likely open to interpretation around the goals of dedicatedCpuPlacement, but at present on hosts without SMT enabled, vCPUs exposed as threads to the guest OS are pinned to non-sibling pCPUs. Users might be surprised by the performance of workloads across such threads given the request.

What you expected to happen:
The VirtualMachineInstance to not schedule until an SMT-enabled compute node is present in the environment.

How to reproduce it (as minimally and precisely as possible):
Uses https://github.com/kubevirt/kubevirtci/pull/1171

Additional context: N/A
Environment:
- KubeVirt version (virtctl version): N/A
- Kubernetes version (kubectl version): N/A
- Kernel (uname -a): N/A