cilium / cilium

eBPF-based Networking, Security, and Observability
https://cilium.io
Apache License 2.0

datapath/node: avoid fatal on daemon init when node routes fail. #33219

Open tommyp1ckles opened 1 week ago

tommyp1ckles commented 1 week ago

Prior to 9486e7b731903eb949cb2fcd70f08e1da386dc5d, errors from the call that reconciles the node (i.e. nodeUpdated) were not handled. As a result, (*linuxNodeManager).NodeConfigurationChanged(...) did not return an error when the underlying node reconciliation failed.

Prior to v1.15, the assumption was that node reconciliation tasks log errors in-line and continue execution without returning early. Following these changes, NodeConfigurationChanged now returns an error if the underlying nodeUpdated call fails.

Issue #31843 results from code that depended on execution not terminating when nodeUpdated fails. Failures to install node routes for (unreachable) remote nodes were previously only logged; they now cause the agent to terminate on init. There may be other regressions built on the same expectation.

This change reverts to the behavior found in v1.14: nodeUpdated failures are logged and execution continues.

Fixes: #31843

tommyp1ckles commented 1 week ago

/test

tommyp1ckles commented 1 week ago

/test