k8snetworkplumbingwg / sriov-cni

DPDK & SR-IOV CNI plugin
Apache License 2.0

sriov-cni plugin silently throws away issues with LoadConfFromCache #275

Closed andreaskaris closed 8 months ago

andreaskaris commented 1 year ago

What happened?

The sriov-cni plugin silently discards errors returned by LoadConfFromCache: https://github.com/k8snetworkplumbingwg/sriov-cni/blob/c44726a013e0b87470151bebe6a87034f181fa95/cmd/sriov/main.go#L201

What did you expect to happen?

Discarding the error is intentional, and that makes sense. However, by doing so we lose insight into valid errors. There should be a logging mechanism that at least writes the failure to the CRI-O logs (via stderr).

What are the minimal steps needed to reproduce the bug?

I saw this in a customer environment, and I don't know how best to reproduce it with minimal steps.

Anything else we need to know?

I think it'd be nice to have optional debug logging (loglevel and logfile) for sriov-cni. That would make troubleshooting in the field (e.g. of race conditions) a lot easier.

adrianchiris commented 8 months ago

@andreaskaris can we close this one?

andreaskaris commented 8 months ago

Of course!