k8snetworkplumbingwg / sriov-network-device-plugin

SRIOV network device plugin for Kubernetes
Apache License 2.0

Capacity and Allocatable number shows wrong if sriov-network-device-plugin restarts #565

Closed jslouisyou closed 3 weeks ago

jslouisyou commented 5 months ago

What happened?

The node's Capacity and Allocatable counts are wrong after sriov-network-device-plugin restarts while any pods have SR-IOV IB VFs attached.

What did you expect to happen?

Every openshift.io/gpu_mlnx_ib# resource should report 8, counting all VFs.

What are the minimal steps needed to reproduce the bug?

  1. Deploy sriov-network-operator version v1.2.0
  2. Create a Pod or Deployment
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: sriov-testing-deployment-h100
    spec:
      replicas: 6
      selector:
        matchLabels:
          app: sriov-testing-h100
      template:
        metadata:
          annotations:
            k8s.v1.cni.cncf.io/networks: '[{"name": "sriov-gpu2-ib0", "interface": "net1"},
              {"name": "sriov-gpu2-ib1", "interface": "net2"}, {"name": "sriov-gpu2-ib2",
              "interface": "net3"}, {"name": "sriov-gpu2-ib3", "interface": "net4"}, {"name":
              "sriov-gpu2-ib4", "interface": "net5"}]'
          labels:
            app: sriov-testing-h100
          name: sriov-testing-pod
        spec:
          containers:
          - command:
            - sh
            - -c
            - sleep inf
            image: mellanox/tcpdump-rdma:latest
            imagePullPolicy: Always
            name: tcpdump-rdma
            resources:
              limits:
                openshift.io/gpu2_mlnx_ib0: "1"
                openshift.io/gpu2_mlnx_ib1: "1"
                openshift.io/gpu2_mlnx_ib2: "1"
                openshift.io/gpu2_mlnx_ib3: "1"
                openshift.io/gpu2_mlnx_ib4: "1"
              requests:
                openshift.io/gpu2_mlnx_ib0: "1"
                openshift.io/gpu2_mlnx_ib1: "1"
                openshift.io/gpu2_mlnx_ib2: "1"
                openshift.io/gpu2_mlnx_ib3: "1"
                openshift.io/gpu2_mlnx_ib4: "1"
            securityContext:
              capabilities:
                add:
                - IPC_LOCK
  3. Restart the sriov-device-plugin daemonset
    kubectl rollout restart -n sriov-network-operator daemonset.apps/sriov-device-plugin
  4. Check whether Capacity and Allocatable show the full capacity
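
Step 4 can be checked with a small script. This is a minimal sketch, assuming the node object has been dumped with `kubectl get node <node-name> -o json`; the function names and the sample values are illustrative only, and the expected count of 8 comes from the expectation stated above.

```python
import json


def sriov_counts(node_json, prefix="openshift.io/"):
    """Return {resource: (capacity, allocatable)} for resources under prefix."""
    status = json.loads(node_json)["status"]
    capacity = status.get("capacity", {})
    allocatable = status.get("allocatable", {})
    return {
        name: (int(capacity[name]), int(allocatable.get(name, "0")))
        for name in sorted(capacity)
        if name.startswith(prefix)
    }


def under_reported(counts, expected=8):
    """List resources whose capacity or allocatable differs from expected."""
    return sorted(
        name
        for name, (cap, alloc) in counts.items()
        if cap != expected or alloc != expected
    )


# Hypothetical node status, shaped like `kubectl get node -o json` output.
SAMPLE_NODE = json.dumps({
    "status": {
        "capacity": {
            "cpu": "224",
            "openshift.io/gpu_mlnx_ib0": "2",
            "openshift.io/gpu_mlnx_ib1": "8",
        },
        "allocatable": {
            "cpu": "224",
            "openshift.io/gpu_mlnx_ib0": "2",
            "openshift.io/gpu_mlnx_ib1": "8",
        },
    }
})

print(under_reported(sriov_counts(SAMPLE_NODE)))
```

Running it against the node before and after the daemonset restart should show whether any resource dropped below the expected count.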

Anything else we need to know?

Several related issues have already been raised and fixes were pushed, but the problem does not appear to be fixed yet. See also: https://github.com/k8snetworkplumbingwg/sriov-network-device-plugin/issues/276, https://github.com/k8snetworkplumbingwg/sriov-network-device-plugin/issues/521

After restarting sriov-device-plugin, kubelet reports that sriov-device-plugin pushed its state as shown below:

kubelet[1223475]: I0605 16:48:30.294908 1223475 manager.go:229] "Device plugin connected" resourceName="openshift.io/gpu_mlnx_ib0"
kubelet[1223475]: I0605 16:48:30.295508 1223475 client.go:91] "State pushed for device plugin" resource="openshift.io/gpu_mlnx_ib0" resourceCapacity=2
kubelet[1223475]: I0605 16:48:30.295721 1223475 http2_client.go:959] "[transport] [client-transport 0xc004920000] Closing: connection error: desc = \"error reading from server: read unix @->/var/lib/kubelet/plugins_registry/openshift.io_gpu_mlnx_ib0.sock: use of closed network connection\"\n"
kubelet[1223475]: I0605 16:48:30.298096 1223475 manager.go:278] "Processed device updates for resource" resourceName="openshift.io/gpu_mlnx_ib0" totalCount=2 healthyCount=2
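
The counts kubelet recorded can be pulled out of the journal with a short script. This is a minimal sketch that assumes the message format matches the "Processed device updates for resource" line above; the regular expression and the function name are illustrative and not part of kubelet.

```python
import re

# Matches the kubelet device manager message quoted in the log excerpt above.
UPDATE_RE = re.compile(
    r'"Processed device updates for resource" '
    r'resourceName="(?P<name>[^"]+)" '
    r'totalCount=(?P<total>\d+) healthyCount=(?P<healthy>\d+)'
)


def parse_device_update(line):
    """Return (resource, totalCount, healthyCount), or None if no match."""
    match = UPDATE_RE.search(line)
    if match is None:
        return None
    return match.group("name"), int(match.group("total")), int(match.group("healthy"))


# The line is copied from the kubelet excerpt above.
LINE = (
    'kubelet[1223475]: I0605 16:48:30.298096 1223475 manager.go:278] '
    '"Processed device updates for resource" '
    'resourceName="openshift.io/gpu_mlnx_ib0" totalCount=2 healthyCount=2'
)

print(parse_device_update(LINE))
```

Feeding it the output of `journalctl -u kubelet` line by line gives one tuple per update, which makes a drop in totalCount after a restart easy to spot.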

Even after I changed the image version of all components to latest, this issue still occurs.

I'm using A100 and H100 nodes.

Component Versions

Please fill in the below table with the version numbers of components used.

| Component | Version |
| --- | --- |
| SR-IOV Network Device Plugin | latest (I've also tested v3.5.1 and v3.7.0) |
| SR-IOV CNI Plugin | latest (I've also tested sriovCni: v2.6.3 and ibSriovCni: v1.0.2) |
| Multus | v3.8 |
| Kubernetes | v1.21.6, v1.28.3 |
| OS | Ubuntu 20.04, 22.04 |

Config Files

Config file locations may be config dependent.

Device pool config file location (Try '/etc/pcidp/config.json')
Multus config (Try '/etc/cni/multus/net.d')
CNI config (Try '/etc/cni/net.d/')
Kubernetes deployment type ( Bare Metal, Kubeadm etc.)
Kubeconfig file
SR-IOV Network Custom Resource Definition

Logs

SR-IOV Network Device Plugin Logs (use kubectl logs $PODNAME)
Multus logs (If enabled. Try '/var/log/multus.log' )
Kubelet logs (journalctl -u kubelet)

SchSeba commented 5 months ago

Hi @jslouisyou, can you share the device plugin configmap, please?

jslouisyou commented 5 months ago

Hi @SchSeba, I found 2 configmaps in the sriov-network-operator namespace, named device-plugin-config and supported-nic-ids. Here are the contents.

* `supported-nic-ids`

apiVersion: v1
data:
  Broadcom_bnxt_BCM57414_2x25G: 14e4 16d7 16dc
  Broadcom_bnxt_BCM75508_2x100G: 14e4 1750 1806
  Intel_i40e_10G_X710_SFP: 8086 1572 154c
  Intel_i40e_25G_SFP28: 8086 158b 154c
  Intel_i40e_40G_XL710_QSFP: 8086 1583 154c
  Intel_i40e_XXV710: 8086 158a 154c
  Intel_i40e_XXV710_N3000: 8086 0d58 154c
  Intel_ice_Columbiaville_E810: 8086 1591 1889
  Intel_ice_Columbiaville_E810-CQDA2_2CQDA2: 8086 1592 1889
  Intel_ice_Columbiaville_E810-XXVDA2: 8086 159b 1889
  Intel_ice_Columbiaville_E810-XXVDA4: 8086 1593 1889
  Nvidia_mlx5_ConnectX-4: 15b3 1013 1014
  Nvidia_mlx5_ConnectX-4LX: 15b3 1015 1016
  Nvidia_mlx5_ConnectX-5: 15b3 1017 1018
  Nvidia_mlx5_ConnectX-5_Ex: 15b3 1019 101a
  Nvidia_mlx5_ConnectX-6: 15b3 101b 101c
  Nvidia_mlx5_ConnectX-6_Dx: 15b3 101d 101e
  Nvidia_mlx5_ConnectX-7: 15b3 1021 101e
  Nvidia_mlx5_MT42822_BlueField-2_integrated_ConnectX-6_Dx: 15b3 a2d6 101e
  Qlogic_qede_QL45000_50G: 1077 1654 1664
  Red_Hat_Virtio_network_device: 1af4 1000 1000
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: sriov-network-operator
    meta.helm.sh/release-namespace: sriov-network-operator
  creationTimestamp: "2024-06-05T05:29:22Z"
  labels:
    app.kubernetes.io/managed-by: Helm
  name: supported-nic-ids
  namespace: sriov-network-operator
  resourceVersion: "10770"
  uid: 15d5826e-2e56-4094-8a60-1567beda154b

SchSeba commented 4 months ago

Hi @jslouisyou, if I remember right, we introduced a check for

"linkTypes":["infiniband"]

that runs against the PF, which should fix the wrong number of devices after the reboot. Can you please try the latest device plugin and let us know?

jslouisyou commented 4 months ago

Hi @SchSeba, I upgraded sriov-network-device-plugin to latest and tested again, but this issue still occurs. Please let me know if I'm missing something, such as configuration.

SchSeba commented 3 months ago

Hi @jslouisyou, can you please provide logs from the following steps:

  1. Start the device plugin with no pods running
  2. Run pods that use the device, then restart the device plugin

Thanks!

jslouisyou commented 3 months ago

Hi @SchSeba. Before this test, I changed all image tags to latest and set imagePullPolicy to Always in order to pull the latest images.

<Internal Mirror Repository>/k8snetworkplumbingwg/sriov-cni:latest
<Internal Mirror Repository>/k8snetworkplumbingwg/ib-sriov-cni:latest
<Internal Mirror Repository>/k8snetworkplumbingwg/sriov-network-device-plugin:latest
<Internal Mirror Repository>/k8snetworkplumbingwg/network-resources-injector:latest
<Internal Mirror Repository>/k8snetworkplumbingwg/sriov-network-operator-config-daemon:latest
<Internal Mirror Repository>/k8snetworkplumbingwg/sriov-network-operator-webhook:latest
<Internal Mirror Repository>/k8snetworkplumbingwg/sriov-network-operator:latest

First, here's the log from sriov-device-plugin when it starts without any pods.

I0820 02:23:13.046505       1 manager.go:57] Using Kubelet Plugin Registry Mode
I0820 02:23:13.046555       1 main.go:46] resource manager reading configs
I0820 02:23:13.046575       1 manager.go:86] raw ResourceList: {"resourceList":[{"resourceName":"gpu2_mlnx_ib2","selectors":{"vendors":["15b3"],"devices":["101e"],"pfNames":["ibp157s0"],"linkTypes":["infiniband"],"IsRdma":true,"NeedVhostNet":false},"SelectorObj":null},{"resourceName":"gpu2_mlnx_ib3","selectors":{"vendors":["15b3"],"devices":["101e"],"pfNames":["ibp211s0"],"linkTypes":["infiniband"],"IsRdma":true,"NeedVhostNet":false},"SelectorObj":null}]}
I0820 02:23:13.046649       1 factory.go:211] *types.NetDeviceSelectors for resource gpu2_mlnx_ib2 is [0xc0004d6120]
I0820 02:23:13.046660       1 factory.go:211] *types.NetDeviceSelectors for resource gpu2_mlnx_ib3 is [0xc0004d6480]
I0820 02:23:13.046663       1 manager.go:106] unmarshalled ResourceList: [{ResourcePrefix: ResourceName:gpu2_mlnx_ib2 DeviceType:netDevice ExcludeTopology:false Selectors:0xc000324318 AdditionalInfo:map[] SelectorObjs:[0xc0004d6120]} {ResourcePrefix: ResourceName:gpu2_mlnx_ib3 DeviceType:netDevice ExcludeTopology:false Selectors:0xc000324330 AdditionalInfo:map[] SelectorObjs:[0xc0004d6480]}]
I0820 02:23:13.046698       1 manager.go:217] validating resource name "openshift.io/gpu2_mlnx_ib2"
I0820 02:23:13.046709       1 manager.go:217] validating resource name "openshift.io/gpu2_mlnx_ib3"
I0820 02:23:13.046712       1 main.go:62] Discovering host devices
I0820 02:23:13.129037       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:2a:00.0 02              Intel Corporation       Ethernet Controller X710 for 10GBASE-T  
I0820 02:23:13.129264       1 utils.go:494] excluding interface eno12399:  default route found: {Ifindex: 2 Dst: <nil> Src: <nil> Gw: 10.113.240.1 Flags: [] Table: 254 Realm: 0}
I0820 02:23:13.129298       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:2a:00.1 02              Intel Corporation       Ethernet Controller X710 for 10GBASE-T  
I0820 02:23:13.129407       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:2a:00.2 02              Intel Corporation       Ethernet Controller X710 for 10GBASE-T  
I0820 02:23:13.129507       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:2a:00.3 02              Intel Corporation       Ethernet Controller X710 for 10GBASE-T  
I0820 02:23:13.129591       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:41:00.0 02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:23:13.129699       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:54:00.0 02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:23:13.129823       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:9d:00.0 02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:23:13.131493       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:9d:00.1 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.131590       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:9d:00.2 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.131673       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:9d:00.3 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.131769       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:9d:00.4 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.131862       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:9d:00.5 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.131945       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:9d:00.6 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.132031       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:9d:00.7 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.132121       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:9d:01.0 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.132208       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:c1:00.0 02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:23:13.132298       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:d3:00.0 02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:23:13.133917       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:d3:00.1 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134009       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:d3:00.2 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134099       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:d3:00.3 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134184       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:d3:00.4 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134264       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:d3:00.5 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134338       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:d3:00.6 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134420       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:d3:00.7 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134480       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:d3:01.0 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134545       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:e5:00.0 02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:23:13.134644       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:2a:00.0   02              Intel Corporation       Ethernet Controller X710 for 10GBASE-T  
I0820 02:23:13.134651       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:2a:00.1   02              Intel Corporation       Ethernet Controller X710 for 10GBASE-T  
I0820 02:23:13.134654       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:2a:00.2   02              Intel Corporation       Ethernet Controller X710 for 10GBASE-T  
I0820 02:23:13.134656       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:2a:00.3   02              Intel Corporation       Ethernet Controller X710 for 10GBASE-T  
I0820 02:23:13.134658       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:41:00.0   02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:23:13.134661       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:54:00.0   02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:23:13.134665       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:9d:00.0   02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:23:13.134668       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:9d:00.1   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134670       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:9d:00.2   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134672       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:9d:00.3   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134675       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:9d:00.4   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134678       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:9d:00.5   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134681       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:9d:00.6   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134683       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:9d:00.7   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134686       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:9d:01.0   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134688       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:c1:00.0   02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:23:13.134691       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:d3:00.0   02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:23:13.134694       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:d3:00.1   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134697       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:d3:00.2   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134699       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:d3:00.3   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134702       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:d3:00.4   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134705       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:d3:00.5   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134708       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:d3:00.6   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134710       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:d3:00.7   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134712       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:d3:01.0   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:23:13.134714       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:e5:00.0   02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:23:13.134718       1 main.go:68] Initializing resource servers
I0820 02:23:13.134723       1 manager.go:117] number of config: 2
I0820 02:23:13.134732       1 manager.go:121] Creating new ResourcePool: gpu2_mlnx_ib2
I0820 02:23:13.134736       1 manager.go:122] DeviceType: netDevice
W0820 02:23:13.134750       1 pciNetDevice.go:74] RDMA resources for 0000:2a:00.1 not found. Are RDMA modules loaded?
W0820 02:23:13.134933       1 pciNetDevice.go:74] RDMA resources for 0000:2a:00.2 not found. Are RDMA modules loaded?
W0820 02:23:13.135054       1 pciNetDevice.go:74] RDMA resources for 0000:2a:00.3 not found. Are RDMA modules loaded?
I0820 02:23:13.136791       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:00.1. <nil>
I0820 02:23:13.137550       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:00.2. <nil>
I0820 02:23:13.138329       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:00.3. <nil>
I0820 02:23:13.138973       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:00.4. <nil>
I0820 02:23:13.139728       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:00.5. <nil>
I0820 02:23:13.140412       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:00.6. <nil>
I0820 02:23:13.141141       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:00.7. <nil>
I0820 02:23:13.141829       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:01.0. <nil>
I0820 02:23:13.143062       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:00.1. <nil>
I0820 02:23:13.143575       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:00.2. <nil>
I0820 02:23:13.144136       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:00.3. <nil>
I0820 02:23:13.144803       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:00.4. <nil>
I0820 02:23:13.145521       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:00.5. <nil>
I0820 02:23:13.146206       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:00.6. <nil>
I0820 02:23:13.147007       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:00.7. <nil>
I0820 02:23:13.147747       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:01.0. <nil>
I0820 02:23:13.148651       1 manager.go:138] initServers(): selector index 0 will register 8 devices
I0820 02:23:13.148659       1 factory.go:124] device added: [identifier: 0000:9d:00.1, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:23:13.148662       1 factory.go:124] device added: [identifier: 0000:9d:00.2, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:23:13.148665       1 factory.go:124] device added: [identifier: 0000:9d:00.3, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:23:13.148667       1 factory.go:124] device added: [identifier: 0000:9d:00.4, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:23:13.148669       1 factory.go:124] device added: [identifier: 0000:9d:00.5, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:23:13.148671       1 factory.go:124] device added: [identifier: 0000:9d:00.6, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:23:13.148673       1 factory.go:124] device added: [identifier: 0000:9d:00.7, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:23:13.148675       1 factory.go:124] device added: [identifier: 0000:9d:01.0, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:23:13.148687       1 manager.go:156] New resource server is created for gpu2_mlnx_ib2 ResourcePool
I0820 02:23:13.148692       1 manager.go:121] Creating new ResourcePool: gpu2_mlnx_ib3
I0820 02:23:13.148694       1 manager.go:122] DeviceType: netDevice
W0820 02:23:13.148705       1 pciNetDevice.go:74] RDMA resources for 0000:2a:00.1 not found. Are RDMA modules loaded?
W0820 02:23:13.148848       1 pciNetDevice.go:74] RDMA resources for 0000:2a:00.2 not found. Are RDMA modules loaded?
W0820 02:23:13.148968       1 pciNetDevice.go:74] RDMA resources for 0000:2a:00.3 not found. Are RDMA modules loaded?
I0820 02:23:13.150473       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:00.1. <nil>
I0820 02:23:13.151119       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:00.2. <nil>
.....
I0820 02:23:13.160741       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:00.7. <nil>
I0820 02:23:13.161432       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:01.0. <nil>
I0820 02:23:13.162404       1 manager.go:138] initServers(): selector index 0 will register 8 devices
I0820 02:23:13.162414       1 factory.go:124] device added: [identifier: 0000:d3:00.1, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:23:13.162419       1 factory.go:124] device added: [identifier: 0000:d3:00.2, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:23:13.162422       1 factory.go:124] device added: [identifier: 0000:d3:00.3, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:23:13.162424       1 factory.go:124] device added: [identifier: 0000:d3:00.4, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:23:13.162426       1 factory.go:124] device added: [identifier: 0000:d3:00.5, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:23:13.162427       1 factory.go:124] device added: [identifier: 0000:d3:00.6, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:23:13.162429       1 factory.go:124] device added: [identifier: 0000:d3:00.7, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:23:13.162431       1 factory.go:124] device added: [identifier: 0000:d3:01.0, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:23:13.162446       1 manager.go:156] New resource server is created for gpu2_mlnx_ib3 ResourcePool
I0820 02:23:13.162451       1 main.go:74] Starting all servers...
I0820 02:23:13.162705       1 server.go:254] starting gpu2_mlnx_ib2 device plugin endpoint at: openshift.io_gpu2_mlnx_ib2.sock
I0820 02:23:13.162997       1 server.go:254] starting gpu2_mlnx_ib3 device plugin endpoint at: openshift.io_gpu2_mlnx_ib3.sock
I0820 02:23:13.163019       1 main.go:79] All servers started.
I0820 02:23:13.163023       1 main.go:80] Listening for term signals
I0820 02:23:13.871003       1 server.go:116] Plugin: openshift.io_gpu2_mlnx_ib3.sock gets registered successfully at Kubelet
I0820 02:23:13.870993       1 server.go:116] Plugin: openshift.io_gpu2_mlnx_ib2.sock gets registered successfully at Kubelet
I0820 02:23:13.871008       1 server.go:157] ListAndWatch(gpu2_mlnx_ib3) invoked
I0820 02:23:13.870984       1 server.go:157] ListAndWatch(gpu2_mlnx_ib2) invoked
I0820 02:23:13.871026       1 server.go:170] ListAndWatch(gpu2_mlnx_ib3): send devices &ListAndWatchResponse{Devices:[]*Device{&Device{ID:0000:d3:00.7,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:d3:01.0,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:d3:00.1,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:d3:00.2,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:d3:00.3,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:d3:00.4,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:d3:00.5,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:d3:00.6,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},},}
I0820 02:23:13.871061       1 server.go:170] ListAndWatch(gpu2_mlnx_ib2): send devices &ListAndWatchResponse{Devices:[]*Device{&Device{ID:0000:9d:00.7,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:9d:01.0,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:9d:00.1,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:9d:00.2,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:9d:00.3,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:9d:00.4,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:9d:00.5,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:9d:00.6,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},},}
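
To compare this startup log with a later one, the devices registered per resource pool can be counted with a short script. This is a minimal sketch that assumes the "Creating new ResourcePool" and "device added" messages keep the format shown above; the regular expressions and the function name are illustrative and not part of the plugin.

```python
import re

POOL_RE = re.compile(r"Creating new ResourcePool: (\S+)")
DEVICE_RE = re.compile(r"device added: \[identifier: ([0-9a-fA-F:.]+),")


def devices_per_pool(log_text):
    """Map each resource pool name to the PCI addresses added to it."""
    pools = {}
    current = None
    for line in log_text.splitlines():
        pool = POOL_RE.search(line)
        if pool:
            current = pool.group(1)
            pools.setdefault(current, [])
            continue
        device = DEVICE_RE.search(line)
        if device and current is not None:
            pools[current].append(device.group(1))
    return pools


# A few lines copied from the log above, shortened for illustration.
SAMPLE_LOG = """\
I0820 02:23:13.134732       1 manager.go:121] Creating new ResourcePool: gpu2_mlnx_ib2
I0820 02:23:13.148659       1 factory.go:124] device added: [identifier: 0000:9d:00.1, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:23:13.148662       1 factory.go:124] device added: [identifier: 0000:9d:00.2, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:23:13.148692       1 manager.go:121] Creating new ResourcePool: gpu2_mlnx_ib3
I0820 02:23:13.162414       1 factory.go:124] device added: [identifier: 0000:d3:00.1, vendor: 15b3, device: 101e, driver: mlx5_core]
"""

for name, devices in devices_per_pool(SAMPLE_LOG).items():
    print(name, len(devices), devices)
```

Running the same function over the log captured after a restart and diffing the two results should show which VFs, if any, are no longer registered.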

Second, here's the log from sriov-device-plugin when a pod that uses the device is started.

I0820 02:24:47.022655       1 server.go:125] Allocate() called with &AllocateRequest{ContainerRequests:[]*ContainerAllocateRequest{&ContainerAllocateRequest{DevicesIDs:[0000:9d:00.7],},},}
I0820 02:24:47.022684       1 pool_stub.go:108] GetEnvs(): for devices: [0000:9d:00.7]
I0820 02:24:47.022729       1 netResourcePool.go:49] GetDeviceSpecs(): for devices: [0000:9d:00.7]
I0820 02:24:47.022737       1 pool_stub.go:141] GetMounts(): for devices: [0000:9d:00.7]
I0820 02:24:47.022740       1 server.go:151] AllocateResponse send: &AllocateResponse{ContainerResponses:[]*ContainerAllocateResponse{&ContainerAllocateResponse{Envs:map[string]string{PCIDEVICE_OPENSHIFT_IO_GPU2_MLNX_IB2: 0000:9d:00.7,PCIDEVICE_OPENSHIFT_IO_GPU2_MLNX_IB2_INFO: {"0000:9d:00.7":{"generic":{"deviceID":"0000:9d:00.7"},"rdma":{"issm":"/dev/infiniband/issm12","rdma_cm":"/dev/infiniband/rdma_cm","umad":"/dev/infiniband/umad12","uverbs":"/dev/infiniband/uverbs12"}}},},Mounts:[]*Mount{},Devices:[]*DeviceSpec{&DeviceSpec{ContainerPath:/dev/infiniband/issm12,HostPath:/dev/infiniband/issm12,Permissions:rw,},&DeviceSpec{ContainerPath:/dev/infiniband/umad12,HostPath:/dev/infiniband/umad12,Permissions:rw,},&DeviceSpec{ContainerPath:/dev/infiniband/uverbs12,HostPath:/dev/infiniband/uverbs12,Permissions:rw,},&DeviceSpec{ContainerPath:/dev/infiniband/rdma_cm,HostPath:/dev/infiniband/rdma_cm,Permissions:rw,},},Annotations:map[string]string{},CDIDevices:[]*CDIDevice{},},},}
I0820 02:24:47.023028       1 server.go:125] Allocate() called with &AllocateRequest{ContainerRequests:[]*ContainerAllocateRequest{&ContainerAllocateRequest{DevicesIDs:[0000:d3:01.0],},},}
I0820 02:24:47.023070       1 pool_stub.go:108] GetEnvs(): for devices: [0000:d3:01.0]
I0820 02:24:47.023106       1 netResourcePool.go:49] GetDeviceSpecs(): for devices: [0000:d3:01.0]
I0820 02:24:47.023115       1 pool_stub.go:141] GetMounts(): for devices: [0000:d3:01.0]
I0820 02:24:47.023121       1 server.go:151] AllocateResponse send: &AllocateResponse{ContainerResponses:[]*ContainerAllocateResponse{&ContainerAllocateResponse{Envs:map[string]string{PCIDEVICE_OPENSHIFT_IO_GPU2_MLNX_IB3: 0000:d3:01.0,PCIDEVICE_OPENSHIFT_IO_GPU2_MLNX_IB3_INFO: {"0000:d3:01.0":{"generic":{"deviceID":"0000:d3:01.0"},"rdma":{"issm":"/dev/infiniband/issm21","rdma_cm":"/dev/infiniband/rdma_cm","umad":"/dev/infiniband/umad21","uverbs":"/dev/infiniband/uverbs21"}}},},Mounts:[]*Mount{},Devices:[]*DeviceSpec{&DeviceSpec{ContainerPath:/dev/infiniband/issm21,HostPath:/dev/infiniband/issm21,Permissions:rw,},&DeviceSpec{ContainerPath:/dev/infiniband/umad21,HostPath:/dev/infiniband/umad21,Permissions:rw,},&DeviceSpec{ContainerPath:/dev/infiniband/uverbs21,HostPath:/dev/infiniband/uverbs21,Permissions:rw,},&DeviceSpec{ContainerPath:/dev/infiniband/rdma_cm,HostPath:/dev/infiniband/rdma_cm,Permissions:rw,},},Annotations:map[string]string{},CDIDevices:[]*CDIDevice{},},},}

Third, here's the log from sriov-device-plugin when it restarts.

I0820 02:25:23.763525       1 main.go:87] Received signal "terminated", shutting down.
I0820 02:25:23.764141       1 server.go:308] stopping gpu2_mlnx_ib2 device plugin server...
I0820 02:25:23.764178       1 server.go:182] ListAndWatch(gpu2_mlnx_ib2): terminate signal received
I0820 02:25:23.764504       1 server.go:308] stopping gpu2_mlnx_ib3 device plugin server...
I0820 02:25:23.764859       1 server.go:182] ListAndWatch(gpu2_mlnx_ib3): terminate signal received
--- restarts ---
I0820 02:25:24.945536       1 manager.go:57] Using Kubelet Plugin Registry Mode
I0820 02:25:24.945578       1 main.go:46] resource manager reading configs
I0820 02:25:24.945598       1 manager.go:86] raw ResourceList: {"resourceList":[{"resourceName":"gpu2_mlnx_ib2","selectors":{"vendors":["15b3"],"devices":["101e"],"pfNames":["ibp157s0"],"linkTypes":["infiniband"],"IsRdma":true,"NeedVhostNet":false},"SelectorObj":null},{"resourceName":"gpu2_mlnx_ib3","selectors":{"vendors":["15b3"],"devices":["101e"],"pfNames":["ibp211s0"],"linkTypes":["infiniband"],"IsRdma":true,"NeedVhostNet":false},"SelectorObj":null}]}
I0820 02:25:24.945656       1 factory.go:211] *types.NetDeviceSelectors for resource gpu2_mlnx_ib2 is [0xc0004a17a0]
I0820 02:25:24.945666       1 factory.go:211] *types.NetDeviceSelectors for resource gpu2_mlnx_ib3 is [0xc0004a1b00]
I0820 02:25:24.945669       1 manager.go:106] unmarshalled ResourceList: [{ResourcePrefix: ResourceName:gpu2_mlnx_ib2 DeviceType:netDevice ExcludeTopology:false Selectors:0xc00032e348 AdditionalInfo:map[] SelectorObjs:[0xc0004a17a0]} {ResourcePrefix: ResourceName:gpu2_mlnx_ib3 DeviceType:netDevice ExcludeTopology:false Selectors:0xc00032e360 AdditionalInfo:map[] SelectorObjs:[0xc0004a1b00]}]
I0820 02:25:24.945701       1 manager.go:217] validating resource name "openshift.io/gpu2_mlnx_ib2"
I0820 02:25:24.945712       1 manager.go:217] validating resource name "openshift.io/gpu2_mlnx_ib3"
I0820 02:25:24.945714       1 main.go:62] Discovering host devices
I0820 02:25:25.031561       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:2a:00.0   02              Intel Corporation       Ethernet Controller X710 for 10GBASE-T  
I0820 02:25:25.031592       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:2a:00.1   02              Intel Corporation       Ethernet Controller X710 for 10GBASE-T  
I0820 02:25:25.031596       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:2a:00.2   02              Intel Corporation       Ethernet Controller X710 for 10GBASE-T  
I0820 02:25:25.031599       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:2a:00.3   02              Intel Corporation       Ethernet Controller X710 for 10GBASE-T  
I0820 02:25:25.031602       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:41:00.0   02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:25:25.031605       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:54:00.0   02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:25:25.031611       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:9d:00.0   02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:25:25.031614       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:9d:00.1   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.031617       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:9d:00.2   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.031619       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:9d:00.3   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.031621       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:9d:00.4   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.031623       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:9d:00.5   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.031625       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:9d:00.6   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.031629       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:9d:00.7   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.031630       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:9d:01.0   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.031633       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:c1:00.0   02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:25:25.031635       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:d3:00.0   02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:25:25.031637       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:d3:00.1   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.031639       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:d3:00.2   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.031641       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:d3:00.3   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.031644       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:d3:00.4   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.031645       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:d3:00.5   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.031648       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:d3:00.6   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.031650       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:d3:00.7   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.031653       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:d3:01.0   02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.031655       1 auxNetDeviceProvider.go:84] auxnetdevice AddTargetDevices(): device found: 0000:e5:00.0   02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:25:25.031660       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:2a:00.0 02              Intel Corporation       Ethernet Controller X710 for 10GBASE-T  
I0820 02:25:25.031871       1 utils.go:494] excluding interface eno12399:  default route found: {Ifindex: 2 Dst: <nil> Src: <nil> Gw: 10.113.240.1 Flags: [] Table: 254 Realm: 0}
I0820 02:25:25.031893       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:2a:00.1 02              Intel Corporation       Ethernet Controller X710 for 10GBASE-T  
I0820 02:25:25.032001       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:2a:00.2 02              Intel Corporation       Ethernet Controller X710 for 10GBASE-T  
I0820 02:25:25.032102       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:2a:00.3 02              Intel Corporation       Ethernet Controller X710 for 10GBASE-T  
I0820 02:25:25.032184       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:41:00.0 02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:25:25.032296       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:54:00.0 02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:25:25.032404       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:9d:00.0 02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:25:25.034023       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:9d:00.1 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.034118       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:9d:00.2 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.034209       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:9d:00.3 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.034309       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:9d:00.4 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.034408       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:9d:00.5 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.034497       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:9d:00.6 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.034584       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:9d:00.7 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.034599       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:9d:01.0 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.034688       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:c1:00.0 02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:25:25.034782       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:d3:00.0 02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:25:25.036327       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:d3:00.1 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.036415       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:d3:00.2 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.036514       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:d3:00.3 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.036596       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:d3:00.4 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.036673       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:d3:00.5 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.036757       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:d3:00.6 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.036838       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:d3:00.7 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.036907       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:d3:01.0 02              Mellanox Technolo...    ConnectX Family mlx5Gen Virtual Function
I0820 02:25:25.036922       1 netDeviceProvider.go:67] netdevice AddTargetDevices(): device found: 0000:e5:00.0 02              Mellanox Technolo...    MT2910 Family [ConnectX-7]              
I0820 02:25:25.037021       1 main.go:68] Initializing resource servers
I0820 02:25:25.037026       1 manager.go:117] number of config: 2
I0820 02:25:25.037033       1 manager.go:121] Creating new ResourcePool: gpu2_mlnx_ib2
I0820 02:25:25.037036       1 manager.go:122] DeviceType: netDevice
W0820 02:25:25.037049       1 pciNetDevice.go:74] RDMA resources for 0000:2a:00.1 not found. Are RDMA modules loaded?
W0820 02:25:25.037222       1 pciNetDevice.go:74] RDMA resources for 0000:2a:00.2 not found. Are RDMA modules loaded?
W0820 02:25:25.037341       1 pciNetDevice.go:74] RDMA resources for 0000:2a:00.3 not found. Are RDMA modules loaded?
I0820 02:25:25.039103       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:00.1. <nil>
I0820 02:25:25.039869       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:00.2. <nil>
I0820 02:25:25.040615       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:00.3. <nil>
I0820 02:25:25.041274       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:00.4. <nil>
I0820 02:25:25.042018       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:00.5. <nil>
I0820 02:25:25.042695       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:00.6. <nil>
W0820 02:25:25.042971       1 pciNetDevice.go:74] RDMA resources for 0000:9d:00.7 not found. Are RDMA modules loaded?
I0820 02:25:25.043028       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:00.7. <nil>
I0820 02:25:25.044580       1 pciNetDevice.go:106] getPKey(): unable to get PKey for device 0000:9d:00.7 : "infiniband directory is empty for device: 0000:9d:00.7"
I0820 02:25:25.044978       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:01.0. <nil>
I0820 02:25:25.046205       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:00.1. <nil>
I0820 02:25:25.046720       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:00.2. <nil>
I0820 02:25:25.047275       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:00.3. <nil>
I0820 02:25:25.047925       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:00.4. <nil>
I0820 02:25:25.048634       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:00.5. <nil>
I0820 02:25:25.049280       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:00.6. <nil>
I0820 02:25:25.050058       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:00.7. <nil>
W0820 02:25:25.050348       1 pciNetDevice.go:74] RDMA resources for 0000:d3:01.0 not found. Are RDMA modules loaded?
I0820 02:25:25.050405       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:01.0. <nil>
I0820 02:25:25.052074       1 pciNetDevice.go:106] getPKey(): unable to get PKey for device 0000:d3:01.0 : "infiniband directory is empty for device: 0000:d3:01.0"
I0820 02:25:25.052681       1 manager.go:138] initServers(): selector index 0 will register 7 devices
I0820 02:25:25.052688       1 factory.go:124] device added: [identifier: 0000:9d:00.1, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:25:25.052691       1 factory.go:124] device added: [identifier: 0000:9d:00.2, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:25:25.052694       1 factory.go:124] device added: [identifier: 0000:9d:00.3, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:25:25.052696       1 factory.go:124] device added: [identifier: 0000:9d:00.4, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:25:25.052698       1 factory.go:124] device added: [identifier: 0000:9d:00.5, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:25:25.052700       1 factory.go:124] device added: [identifier: 0000:9d:00.6, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:25:25.052702       1 factory.go:124] device added: [identifier: 0000:9d:01.0, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:25:25.052713       1 manager.go:156] New resource server is created for gpu2_mlnx_ib2 ResourcePool
I0820 02:25:25.052717       1 manager.go:121] Creating new ResourcePool: gpu2_mlnx_ib3
I0820 02:25:25.052719       1 manager.go:122] DeviceType: netDevice
W0820 02:25:25.052728       1 pciNetDevice.go:74] RDMA resources for 0000:2a:00.1 not found. Are RDMA modules loaded?
W0820 02:25:25.052875       1 pciNetDevice.go:74] RDMA resources for 0000:2a:00.2 not found. Are RDMA modules loaded?
W0820 02:25:25.052991       1 pciNetDevice.go:74] RDMA resources for 0000:2a:00.3 not found. Are RDMA modules loaded?
I0820 02:25:25.054482       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:00.1. <nil>
I0820 02:25:25.055114       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:00.2. <nil>
.....
W0820 02:25:25.058073       1 pciNetDevice.go:74] RDMA resources for 0000:9d:00.7 not found. Are RDMA modules loaded?
I0820 02:25:25.058129       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:9d:00.7. <nil>
I0820 02:25:25.059696       1 pciNetDevice.go:106] getPKey(): unable to get PKey for device 0000:9d:00.7 : "infiniband directory is empty for device: 0000:9d:00.7"
.....
I0820 02:25:25.066066       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:00.6. <nil>
I0820 02:25:25.066898       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:00.7. <nil>
W0820 02:25:25.067208       1 pciNetDevice.go:74] RDMA resources for 0000:d3:01.0 not found. Are RDMA modules loaded?
I0820 02:25:25.067271       1 utils.go:82] Devlink query for eswitch mode is not supported for device 0000:d3:01.0. <nil>
I0820 02:25:25.069033       1 pciNetDevice.go:106] getPKey(): unable to get PKey for device 0000:d3:01.0 : "infiniband directory is empty for device: 0000:d3:01.0"
I0820 02:25:25.069706       1 manager.go:138] initServers(): selector index 0 will register 7 devices
I0820 02:25:25.069716       1 factory.go:124] device added: [identifier: 0000:d3:00.1, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:25:25.069720       1 factory.go:124] device added: [identifier: 0000:d3:00.2, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:25:25.069723       1 factory.go:124] device added: [identifier: 0000:d3:00.3, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:25:25.069725       1 factory.go:124] device added: [identifier: 0000:d3:00.4, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:25:25.069727       1 factory.go:124] device added: [identifier: 0000:d3:00.5, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:25:25.069730       1 factory.go:124] device added: [identifier: 0000:d3:00.6, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:25:25.069733       1 factory.go:124] device added: [identifier: 0000:d3:00.7, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:25:25.069751       1 manager.go:156] New resource server is created for gpu2_mlnx_ib3 ResourcePool
I0820 02:25:25.069756       1 main.go:74] Starting all servers...
I0820 02:25:25.069999       1 server.go:254] starting gpu2_mlnx_ib2 device plugin endpoint at: openshift.io_gpu2_mlnx_ib2.sock
I0820 02:25:25.070288       1 server.go:254] starting gpu2_mlnx_ib3 device plugin endpoint at: openshift.io_gpu2_mlnx_ib3.sock
I0820 02:25:25.070310       1 main.go:79] All servers started.
I0820 02:25:25.070315       1 main.go:80] Listening for term signals
I0820 02:25:25.953494       1 server.go:157] ListAndWatch(gpu2_mlnx_ib3) invoked
I0820 02:25:25.953515       1 server.go:116] Plugin: openshift.io_gpu2_mlnx_ib3.sock gets registered successfully at Kubelet
I0820 02:25:25.953527       1 server.go:157] ListAndWatch(gpu2_mlnx_ib2) invoked
I0820 02:25:25.953533       1 server.go:116] Plugin: openshift.io_gpu2_mlnx_ib2.sock gets registered successfully at Kubelet
I0820 02:25:25.953523       1 server.go:170] ListAndWatch(gpu2_mlnx_ib3): send devices &ListAndWatchResponse{Devices:[]*Device{&Device{ID:0000:d3:00.4,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:d3:00.5,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:d3:00.6,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:d3:00.7,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:d3:00.1,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:d3:00.2,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:d3:00.3,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},},}
I0820 02:25:25.953557       1 server.go:170] ListAndWatch(gpu2_mlnx_ib2): send devices &ListAndWatchResponse{Devices:[]*Device{&Device{ID:0000:9d:00.4,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:9d:00.5,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:9d:00.6,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:9d:01.0,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:9d:00.1,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:9d:00.2,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:9d:00.3,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},},}

After sriov-device-plugin restarts, kubelet reports the wrong Capacity and Allocatable: 7 instead of the expected 8.

Capacity:
.....
  nvidia.com/gpu:              8
  openshift.io/gpu2_mlnx_ib2:  7
  openshift.io/gpu2_mlnx_ib3:  7
Allocatable:
.....
  nvidia.com/gpu:              8
  openshift.io/gpu2_mlnx_ib2:  7
  openshift.io/gpu2_mlnx_ib3:  7
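
The mismatch can be read off the plugin log above: each pool registers only 7 devices, and the missing VF is one of those flagged with "RDMA resources ... not found". Below is a minimal Python sketch for summarizing such a log. It assumes the log lines look like the ones quoted in this comment; `summarize` and `sample` are illustrative names, not part of the plugin.

```python
import re

# Patterns follow the log lines quoted above (format assumed, not guaranteed).
POOL = re.compile(r"Creating new ResourcePool: (\S+)")
ADDED = re.compile(r"device added: \[identifier: ([0-9a-f:.]+),")
NO_RDMA = re.compile(r"RDMA resources for ([0-9a-f:.]+) not found")


def summarize(log_text):
    """Return {pool: {"added": [...], "no_rdma": [...]}} from plugin log text."""
    pools = {}
    current = None
    for line in log_text.splitlines():
        m = POOL.search(line)
        if m:
            # Every following line belongs to this pool until the next one starts.
            current = pools.setdefault(m.group(1), {"added": [], "no_rdma": []})
            continue
        if current is None:
            continue
        m = ADDED.search(line)
        if m:
            current["added"].append(m.group(1))
            continue
        m = NO_RDMA.search(line)
        if m and m.group(1) not in current["no_rdma"]:
            current["no_rdma"].append(m.group(1))
    return pools


# A few lines copied from the log above.
sample = """\
I0820 02:25:25.037033       1 manager.go:121] Creating new ResourcePool: gpu2_mlnx_ib2
W0820 02:25:25.042971       1 pciNetDevice.go:74] RDMA resources for 0000:9d:00.7 not found. Are RDMA modules loaded?
I0820 02:25:25.052700       1 factory.go:124] device added: [identifier: 0000:9d:00.6, vendor: 15b3, device: 101e, driver: mlx5_core]
I0820 02:25:25.052702       1 factory.go:124] device added: [identifier: 0000:9d:01.0, vendor: 15b3, device: 101e, driver: mlx5_core]
"""

for name, info in summarize(sample).items():
    print(name, len(info["added"]), "registered; no RDMA resources:", info["no_rdma"])
```

On the full log the "no RDMA resources" list also contains the Intel X710 functions (0000:2a:00.x) and VFs that belong to the other PF, because the plugin checks every discovered device for each pool; only the VFs under the pool's own PF explain the missing count.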

I hope this helps!

adrianchiris commented 3 months ago

@jslouisyou I see the following in the device plugin logs after the kubelet restart:

W0820 02:25:25.058073       1 pciNetDevice.go:74] RDMA resources for 0000:9d:00.7 not found. Are RDMA modules loaded?
W0820 02:25:25.067208       1 pciNetDevice.go:74] RDMA resources for 0000:d3:01.0 not found. Are RDMA modules loaded?

Are these VFs currently assigned to pods?

What is the SriovIBNetwork you have defined? Can you also provide the matching network-attachment-definition used for the workload pods?

On the worker node, can you run the following command as root: rdma system
What is the output?

jslouisyou commented 3 months ago

Hi @adrianchiris !

  1. Yes. After the sriov-device-plugin pod restarts, the network interfaces are still attached in the Pods. Here's the output of ifconfig in a Pod:
    
    net3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 4092
        inet 192.168.224.2  netmask 255.255.240.0  broadcast 192.168.239.255
        inet6 fe80::1281:33fc:ce4f:96e  prefixlen 64  scopeid 0x20<link>
        unspec 00-00-01-AF-FE-80-00-00-00-00-00-00-00-00-00-00  txqueuelen 256  (UNSPEC)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 11  bytes 852 (852.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

    net4: flags=4099<UP,BROADCAST,MULTICAST>  mtu 4092
        inet 192.168.240.2  netmask 255.255.240.0  broadcast 192.168.255.255
        unspec 00-00-01-6F-FE-80-00-00-00-00-00-00-00-00-00-00  txqueuelen 256  (UNSPEC)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

And this is the sample manifest (a DaemonSet):

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: sriov-test
      labels:
        app: sriov-test
    spec:
      selector:
        matchLabels:
          app: sriov-test
      template:
        metadata:
          labels:
            app: sriov-test
          annotations:
            k8s.v1.cni.cncf.io/networks: '[ {"name": "sriov-gpu2-ib2", "interface": "net3"}, {"name": "sriov-gpu2-ib3", "interface": "net4"} ]'
        spec:
          nodeSelector:
            nvidia.com/gpu.product: NVIDIA-H100-80GB-HBM3
          tolerations:

  2. BTW There aren't any resources for sriovibnetworks.sriovnetwork.openshift.io. Are these essential? It's been working fine without this until now.
    $ k get sriovibnetworks.sriovnetwork.openshift.io -A
    No resources found

    Here's the network-attachment-definition resources for above 2 IB VFs:

    - apiVersion: k8s.cni.cncf.io/v1
    kind: NetworkAttachmentDefinition
    metadata:
    annotations:
      k8s.v1.cni.cncf.io/resourceName: openshift.io/gpu2_mlnx_ib2
    name: sriov-gpu2-ib2
    namespace: default
    spec:
    config: |-
      {
        "cniVersion": "0.3.1",
        "name": "sriov_gpu2_ib2",
        "plugins": [
          {
            "type": "ib-sriov",
            "link_state": "enable",
            "rdmaIsolation": true,
            "ibKubernetesEnabled": false,
            "ipam": {
              "datastore": "kubernetes",
              "kubernetes": {
                "kubeconfig": "/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig"
              },
              "log_file": "/tmp/whereabouts.log",
              "log_level": "debug",
              "type": "whereabouts",
              "range": "192.168.224.0/20"
            }
          }
        ]
      }
    ---
    - apiVersion: k8s.cni.cncf.io/v1
    kind: NetworkAttachmentDefinition
    metadata:
    annotations:
      k8s.v1.cni.cncf.io/resourceName: openshift.io/gpu2_mlnx_ib3
    name: sriov-gpu2-ib3
    namespace: default
    spec:
    config: |-
      {
        "cniVersion": "0.3.1",
        "name": "sriov_gpu2_ib3",
        "plugins": [
          {
            "type": "ib-sriov",
            "link_state": "enable",
            "rdmaIsolation": true,
            "ibKubernetesEnabled": false,
            "ipam": {
              "datastore": "kubernetes",
              "kubernetes": {
                "kubeconfig": "/etc/cni/net.d/whereabouts.d/whereabouts.kubeconfig"
              },
              "log_file": "/tmp/whereabouts.log",
              "log_level": "debug",
              "type": "whereabouts",
              "range": "192.168.240.0/20"
            }
          }
        ]
      }
  3. Here's the output of rdma system on the worker nodes (same result on all of them):
    $ rdma system
    netns exclusive copy-on-fork on
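
For checking many nodes, the output above can be parsed mechanically. This is a minimal Python sketch that assumes `rdma system` prints a flat sequence of "key value" pairs, as in the line above; `parse_rdma_system` is a hypothetical helper name, not an existing tool.

```python
def parse_rdma_system(output):
    """Parse `rdma system` output into a dict of setting name -> value.

    Assumes the output is a flat sequence of "key value" pairs,
    e.g. "netns exclusive copy-on-fork on".
    """
    tokens = output.split()
    if len(tokens) % 2:
        raise ValueError(f"unexpected rdma system output: {output!r}")
    return dict(zip(tokens[0::2], tokens[1::2]))


settings = parse_rdma_system("netns exclusive copy-on-fork on")
print(settings["netns"])  # the RDMA network namespace mode of this node
```

A `netns` value of `exclusive` means RDMA devices follow their network interface into the pod's network namespace, which is the mode reported on these nodes.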

adrianchiris commented 3 months ago

BTW There aren't any resources for sriovibnetworks.sriovnetwork.openshift.io. Are these essential? It's been working fine without this until now.

If you define the network-attachment-definition separately, it's not required.

I see the system is configured with RDMA in exclusive mode (rdma system), so sriov-device-plugin will not find the RDMA resources of a VF that is assigned to a container, and therefore it will not register that device (VF) in the pool.

We will need to modify the device plugin to handle this case:

  1. compute RDMA resources on Allocate() call
  2. understand if device is RDMA enabled during device discovery, with fallback to checking RDMA on PF.
     2.1 or use devlink dev param to check if the enable_rdma value is true <- my preference, as it could be possible to disable RDMA for specific devices
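
To make option 2 concrete, here is a rough Python sketch of the "fallback to checking RDMA on PF" idea. It is only an illustration of the proposal, not the plugin's code or the eventual fix: the `Device` class, `is_rdma_capable`, and the RDMA device names (`mlx5_9`, `mlx5_21`) are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Device:
    """Hypothetical stand-in for a discovered PCI device (not the plugin's type)."""
    address: str
    # RDMA devices visible for this PCI device in the host network namespace.
    rdma_devs: List[str] = field(default_factory=list)
    # Parent PF for a VF; None for a PF or a device without SR-IOV.
    pf: Optional["Device"] = None


def is_rdma_capable(dev):
    """Treat a device as RDMA-capable if it, or failing that its PF, exposes RDMA devices."""
    if dev.rdma_devs:
        return True
    return dev.pf is not None and bool(dev.pf.rdma_devs)


pf = Device("0000:d3:00.0", rdma_devs=["mlx5_9"])
free_vf = Device("0000:d3:00.1", rdma_devs=["mlx5_21"], pf=pf)
# In exclusive mode, a VF moved into a pod's netns shows no RDMA devices on the host.
busy_vf = Device("0000:d3:01.0", rdma_devs=[], pf=pf)

pool = [d.address for d in (free_vf, busy_vf) if is_rdma_capable(d)]
print(pool)
```

With the fallback, the VF that is already attached to a pod stays in the pool, so the advertised device count would not drop after a plugin restart. The concrete RDMA device specs for that VF would still have to be computed later, which is what option 1 (computing them on Allocate()) is about.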

For now, perhaps use RDMA in shared mode if possible.

jslouisyou commented 3 months ago

Thanks @adrianchiris for your quick response!

I didn't configure the RDMA mode myself, and since a large number of GPU devices are currently using RDMA, changing the mode without complications would be challenging.

At this point, can I consider this an issue in sriov-device-plugin?

adrianchiris commented 3 months ago

At this point, can I consider this an issue in sriov-device-plugin?

yes

jslouisyou commented 3 months ago

Thanks!

It might be very early to ask, but are there any future plans to resolve this issue?

adrianchiris commented 3 months ago

Yes, it will be addressed in the near future. I don't have an ETA at the moment.

rollandf commented 3 months ago

I will take a look at it.

SchSeba commented 1 month ago

Hi @rollandf any update on this one?

jslouisyou commented 3 weeks ago

Thanks @rollandf for resolving this issue! @SchSeba Is there a plan for the next release?

SchSeba commented 3 weeks ago

Hi @jslouisyou, do you only need the sriov-network-device-plugin, or do you use it via the sriov-network-operator?

jslouisyou commented 3 weeks ago

Hi @SchSeba, I'm using sriov-network-device-plugin along with sriov-network-operator, but upgrading only the sriov-network-device-plugin works for me.

SchSeba commented 3 weeks ago

Hi @jslouisyou, here is the new tag https://github.com/k8snetworkplumbingwg/sriov-network-device-plugin/releases/tag/v3.8.0