open-horizon / anax

Horizon agent control system
https://open-horizon.github.io/docs/anax/docs/
Apache License 2.0

Bug: When node policy is updated from the exchange while the node is offline, the node does not get the update after it comes back online a few hours later #3936

Closed hanicornelia closed 3 weeks ago

hanicornelia commented 10 months ago

Describe the bug.

We have an offline node, node-1. We updated the node policy for node-1 through the exchange hub. Approximately 5 hours later, node-1 came back online, but it did not receive the updated node policy.

Describe the steps to reproduce the behavior.

  1. Cut off the internet from node-1 to make sure it is offline.
  2. From the exchange hub, retrieve the node policy of node-1 and save it to a JSON file: hzn exchange node listpolicy node-1 | tee node.policy.json >> /dev/null
  3. Update one of the deployment properties in node.policy.json.
  4. Publish the updated node policy to node-1: hzn exchange node addpolicy -f node.policy.json node-1
  5. Wait around 5 hours, then restore the internet connection to node-1.
  6. Check the eventlog of the node: nothing new has been recorded.
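Step 3 can be scripted rather than edited by hand. The following is a minimal sketch, assuming the saved file has a layout like {"deployment": {"properties": [{"name": ..., "value": ...}]}}; that layout and the helper names set_deployment_property and update_policy_file are assumptions for illustration, not part of the hzn CLI.

```python
import json
from pathlib import Path


def set_deployment_property(policy: dict, name: str, value) -> dict:
    """Return a copy of `policy` with one deployment property set to `value`.

    Assumes (not confirmed by this issue) that deployment properties live
    under policy["deployment"]["properties"] as {"name": ..., "value": ...}.
    """
    updated = json.loads(json.dumps(policy))  # deep copy via JSON round-trip
    props = updated.setdefault("deployment", {}).setdefault("properties", [])
    for prop in props:
        if prop.get("name") == name:
            prop["value"] = value  # overwrite the existing property
            break
    else:
        # Property not present yet: append it.
        props.append({"name": name, "value": value})
    return updated


def update_policy_file(path: str, name: str, value) -> None:
    """Rewrite the policy file at `path` in place with the changed property."""
    file = Path(path)
    policy = json.loads(file.read_text())
    file.write_text(json.dumps(set_deployment_property(policy, name, value), indent=2))
```

For example, update_policy_file("node.policy.json", "my.property", "new-value") (the property name is hypothetical) would prepare the file for the addpolicy command in step 4.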

Expected behavior.

The eventlog of the node should record an entry indicating that the node has received the update ("Node policy updated with the Exchange copy"). The node should then terminate any previous agreements in accordance with the new deployment properties specified in the updated node policy.
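One way to check for the expected entry is to scan a saved copy of the eventlog. This is a rough sketch, assuming the output of hzn eventlog list has been saved to a file and is a JSON array of entries; the function name policy_update_logged is hypothetical.

```python
import json

# Message quoted in the expected behavior above.
EXPECTED_MESSAGE = "Node policy updated with the Exchange copy"


def policy_update_logged(eventlog_json: str, needle: str = EXPECTED_MESSAGE) -> bool:
    """Return True if any eventlog entry contains `needle`.

    Assumes `eventlog_json` is a JSON array; each entry is converted to a
    string before matching, so plain strings and objects are both accepted.
    """
    entries = json.loads(eventlog_json)
    return any(needle in str(entry) for entry in entries)
```

With the bug described here, this check would be expected to return False even after the node has been back online for a while.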

Screenshots.

No response

Operating Environment

Horizon CLI and agent version: 2.30.0-1435
Exchange version: 2.110.3

Additional Information

No response

MaxMcAdam commented 9 months ago

Hi, we are looking into solutions for this issue now. In the meantime, this issue will not occur if the new node policy does not include the built-in properties. So instead of copying the existing policy from the exchange, please make the change in the original user-created policy, or remove the built-in ("openhorizon...") properties from the exchange copy.
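The second workaround could be scripted along these lines. This is only a sketch: it assumes built-in properties are exactly those whose name starts with "openhorizon", and that properties may appear at the top level as well as under "deployment" and "management" sections. Both points are assumptions about the policy layout, and strip_built_in_properties is a hypothetical helper.

```python
import json

# Assumption: built-in properties are identified by this name prefix.
BUILT_IN_PREFIX = "openhorizon"


def _without_built_ins(props: list) -> list:
    """Keep only properties whose name does not start with the prefix."""
    return [p for p in props
            if not str(p.get("name", "")).startswith(BUILT_IN_PREFIX)]


def strip_built_in_properties(policy: dict) -> dict:
    """Return a copy of `policy` with built-in properties removed."""
    cleaned = json.loads(json.dumps(policy))  # deep copy via JSON round-trip
    if isinstance(cleaned.get("properties"), list):
        cleaned["properties"] = _without_built_ins(cleaned["properties"])
    # Assumed optional sub-sections that carry their own property lists.
    for section in ("deployment", "management"):
        sub = cleaned.get(section)
        if isinstance(sub, dict) and isinstance(sub.get("properties"), list):
            sub["properties"] = _without_built_ins(sub["properties"])
    return cleaned
```

The cleaned dictionary can be written back to node.policy.json before publishing it with hzn exchange node addpolicy.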

hanicornelia commented 9 months ago

Hello, I tried again by removing the built-in ("openhorizon...") properties from the exchange copy, but the node still does not get the update after it comes back online.

For example, this is the node policy retrieved from the exchange. (Screenshot: 2023-11-23_10-07)

This is the node policy I updated and published again. (Screenshot: 2023-11-24_09-38)

After around 6 hours, I restored the internet connection to the node, but the node still does not get updated. This is the node policy when I retrieved it again from the exchange. (Screenshot: 2023-11-24_09-43)

LiilyZhang commented 3 weeks ago

The agent now periodically checks its policy in the exchange, so this should no longer be an issue.
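To illustrate the idea of a periodic check, here is a conceptual sketch. It is not the agent's actual implementation (anax is written in Go, and this thread does not describe the interval or comparison it uses); the callbacks fetch_remote, get_local and apply_remote and the 60-second default are all hypothetical.

```python
import json
import time
from typing import Callable, Optional


def policies_differ(local: dict, remote: dict) -> bool:
    """Compare two policies by their canonical (key-sorted) JSON form."""
    return json.dumps(local, sort_keys=True) != json.dumps(remote, sort_keys=True)


def poll_policy(fetch_remote: Callable[[], dict],
                get_local: Callable[[], dict],
                apply_remote: Callable[[dict], None],
                interval_s: float = 60.0,
                max_checks: Optional[int] = None,
                sleep: Callable[[float], None] = time.sleep) -> int:
    """Periodically pull the exchange copy and apply it when it differs.

    Returns the number of times the remote copy was applied. Because the
    check compares current state instead of relying on a change
    notification, an update made while the node was offline is picked up
    on the first check after it reconnects.
    """
    applied = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        remote = fetch_remote()
        if policies_differ(get_local(), remote):
            apply_remote(remote)
            applied += 1
        checks += 1
        if max_checks is None or checks < max_checks:
            sleep(interval_s)  # wait before the next check
    return applied
```

The design point is that polling compares state, so it does not depend on the node having been online when the change was made.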