gavmckee80 opened 2 days ago
Can you try "rootDevices" instead of "pciAddresses"? "pciAddresses" matches only the devices that have those specific PCI addresses.
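For reference, rootDevices selects VFs by their parent PF's PCI address rather than by netdev name. A minimal sketch of one resource entry (the PCI addresses below are placeholders for illustration, not taken from this node) might look like:

{
  "resourceName": "asap2_vf",
  "resourcePrefix": "nvidia.com",
  "selectors": {
    "vendors": ["15b3"],
    "devices": ["101e"],
    "drivers": ["mlx5_core"],
    "rootDevices": ["0000:3b:00.0", "0000:3b:00.1"]
  }
}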
@rollandf I tried using pfNames as follows:
{
  "resourceList": [
    {
      "resourceName": "asap2_vf",
      "resourcePrefix": "nvidia.com",
      "selectors": {
        "vendors": ["15b3"],
        "devices": ["101e"],
        "drivers": ["mlx5_core"],
        "pfNames": ["ens1f0npf0vf#0-23", "ens1f1npf1vf#0-23"]
      }
    },
    {
      "resourceName": "asap2_vfio",
      "resourcePrefix": "nvidia.com",
      "selectors": {
        "vendors": ["15b3"],
        "devices": ["101e"],
        "drivers": ["vfio-pci", "mlx5_core"],
        "pfNames": ["ens1f0v#0-23", "ens1f1v#0-23"]
      }
    }
  ]
}
As a test I added both vfio-pci and mlx5_core as drivers under the VF resource; all the VFs on one physical interface are bound to vfio-pci, the others are still on mlx5_core. Even with that I still don't get any devices. A log is attached, along with lspci output.
Just a quick follow-up: when I remove the pfNames selector, I then see the resources being populated.
kubectl get node vaeq-cu2a-r113-lab-staging-hv-05.vaeq-lab-staging.infra.cx -o json | jq '.status.allocatable'
{
  "cpu": "384",
  "devices.kubevirt.io/kvm": "1k",
  "devices.kubevirt.io/tun": "1k",
  "devices.kubevirt.io/vhost-net": "1k",
  "ephemeral-storage": "423821938396",
  "hugepages-1Gi": "64Gi",
  "hugepages-2Mi": "256Mi",
  "memory": "1516845912Ki",
  "nvidia.com/asap2_vf": "24",
  "nvidia.com/asap2_vfio": "24",
  "pods": "110"
}
Is it something in the match pattern? You can see from the attached logs and output that the pfNames seem to be correct.
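For comparison, the plugin's documented pfNames range syntax is <PFName>#<first VF index>-<last VF index>, i.e. the PF's own netdev name with a VF index range, rather than the VF netdev names. Assuming the PF interfaces on this node are named ens1f0np0 and ens1f1np1 (an assumption for illustration, not confirmed by the output here), the selector would read:

"pfNames": [
  "ens1f0np0#0-23",
  "ens1f1np1#0-23"
]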
What happened?
Resources failed to be discovered on a node.
Logs from the sriov plugin are attached.
What did you expect to happen?
I expected to be able to see mlnx_sriov_cx7 resources available
What are the minimal steps needed to reproduce the bug?
I suspect that the issue relates to this line in the log
Anything else we need to know?
I am using tuned to isolate CPU cores
Component Versions
Please fill in the below table with the version numbers of components used.
Config Files
Config file locations may be config dependent.
Device pool config file location (Try '/etc/pcidp/config.json')
Multus config (Try '/etc/cni/multus/net.d')
CNI config (Try '/etc/cni/net.d/')
Kubernetes deployment type ( Bare Metal, Kubeadm etc.)
Kubeconfig file
SR-IOV Network Custom Resource Definition
Logs
SR-IOV Network Device Plugin Logs (use kubectl logs $PODNAME)
Multus logs (If enabled. Try '/var/log/multus.log')
Kubelet logs (journalctl -u kubelet)