There are two situations that can cause the behavior mentioned in the issue.
(1) The master device cannot be resolved from the PCI address by reading the file in /sys/bus/pci/devices.
2023/09/25 00:14:01 resource map: xxx <--- see target mapping pod to pci address
2023/09/25 00:14:01 nameNetMap: xxx <-- see target mapping from device name to network address
2023/09/25 00:14:01 deviceMap: map[] <-- cannot see target mapping from network address to device ID
(2) The dedicated interfaces have not been returned to the hostinterface info yet, although some shared interfaces exist.
2023/09/25 00:14:01 resource map: xxx <--- see target mapping pod to pci address
2023/09/25 00:14:01 nameNetMap: map of some interfaces <-- cannot see mapping of target devices
This PR includes:
a cache of the mapping from PCI address to network device name, to avoid the failed access in situation (1).
a retry that refreshes the interfaces when a device is not found in the nameNetMap cache, then updates nameNetMap and masterNameMap.
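The two changes can be illustrated with a small sketch. It is not the code in this PR: the `deviceCache` type, its fields, and the `lookup` and `refresh` callbacks are hypothetical stand-ins, and masterNameMap is left out for brevity. It only shows the idea of falling back to a cached PCI-address-to-name entry when the live lookup fails, and of refreshing nameNetMap once when a device is missing from it.

```go
package main

import "sync"

// deviceCache is a hypothetical stand-in for the caches described above.
type deviceCache struct {
	mu         sync.Mutex
	pciToName  map[string]string               // PCI address -> device name
	nameNetMap map[string]string               // device name -> network address
	lookup     func(pci string) (string, bool) // live lookup; may fail
	refresh    func() map[string]string        // re-lists host interfaces
}

// deviceName prefers the live lookup and falls back to the cached name.
func (c *deviceCache) deviceName(pci string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if name, ok := c.lookup(pci); ok {
		c.pciToName[pci] = name // remember the last successful result
		return name, true
	}
	name, ok := c.pciToName[pci]
	return name, ok
}

// networkOf returns the network address of a device, refreshing
// nameNetMap once if the device is not in the cache.
func (c *deviceCache) networkOf(name string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if net, ok := c.nameNetMap[name]; ok {
		return net, true
	}
	for k, v := range c.refresh() {
		c.nameNetMap[k] = v
	}
	net, ok := c.nameNetMap[name]
	return net, ok
}
```

A single refresh per miss keeps the common path cheap while still picking up a device that has just returned to the host namespace.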
Log with the change:
2023/09/25 05:59:41 GetDeviceMap of xxx
2023/09/25 05:59:41 resource map: xxx
2023/09/25 05:59:41 nameNetMap map: map[ens4:xxx ens5:xxx]
2023/09/25 05:59:41 set deviceMapCache xxx=yyy
2023/09/25 05:59:41 cannot list address on ens3: <nil>
2023/09/25 05:59:41 updated nameNetMap map: map[<target device>:xxx ens4:xxx ens5:xxx] <--- see update log here
2023/09/25 05:59:41 set deviceMapCache xxx=target device
2023/09/25 05:59:41 deviceMap: map[<target device> net:xxx ]
2023/09/25 05:59:41 GetMultiNicNetwork elapsed: 2804 us
2023/09/25 05:59:41 select by net <target device>
2023/09/25 05:59:41 select by net xxx (ens4)
2023/09/25 05:59:41 select by net xxx (ens5)
2023/09/25 05:59:41 xxx SelectNic elapsed: 98420 us
2023/09/25 05:59:41 return: {[ xxx] [ens4 ens5 <target device>]} <--- see target device added here
This PR fixes the issue reported in https://github.com/foundation-model-stack/multi-nic-cni/issues/152. The bug is limited to the host-dedicated CNI, where the network device is moved back and forth between the host and pod namespaces.
The problem can be traced in the daemon log.
Signed-off-by: Sunyanan Choochotkaew <sunyanan.choochotkaew1@ibm.com>