k8snetworkplumbingwg / sriov-network-operator

Operator for provisioning and configuring SR-IOV CNI plugin and device plugin
Apache License 2.0

feat: Update controller logic to handle stale SriovNetworkNodeState CRs with delay #798

Open ykulazhenkov opened 3 weeks ago

ykulazhenkov commented 3 weeks ago

Update controller logic to handle stale SriovNetworkNodeState CRs with delay

This functionality is especially useful when the OFED container is in use. While the OFED driver loads on the host, the sriov-config-daemon is removed from the node (this is achieved with the configDaemon node selector). Because loading the driver can take a considerable amount of time, we want to ensure that the SriovNetworkNodeState is not lost during this process.

github-actions[bot] commented 3 weeks ago

Thanks for your PR. To run vendor CIs, maintainers can use one of:

coveralls commented 3 weeks ago

Pull Request Test Coverage Report for Build 11576179409

Details


| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| api/v1/helper.go | 14 | 23 | 60.87% |
| controllers/sriovnetworknodepolicy_controller.go | 33 | 47 | 70.21% |
| Total | 47 | 70 | 67.14% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 11576125023 | 0.2% |
| Covered Lines | 6708 |
| Relevant Lines | 14863 |

ykulazhenkov commented 3 weeks ago

The CI failure is not related to this change. The same failure occurs on a PR with dummy changes: https://github.com/k8snetworkplumbingwg/sriov-network-operator/pull/800

ykulazhenkov commented 3 weeks ago

@e0ne @adrianchiris I addressed your comments. I also changed the behavior slightly so that there is no delay at all when the STALE_NODE_STATE_CLEANUP_DELAY_MINUTES env variable is explicitly set to 0.