CMCSiemens commented 4 years ago

Describe the bug

A short time after on-boarding, the MCnodes stop working and give the following message:

"Error occurred during keep alive Error the device was not on-boarded or the response was deleted."

Same for SharedSecret and RSA_3072 tokens.

Oddly, this only happens with newly on-boarded nodes; my older flows are still working.

To Reproduce

Steps to reproduce the behavior: on-board a new device and keep it pushing data for more than an hour.

This issue can be seen on the mindconnect playground as well, specifically in WaterPump-EnvData:

https://playground.mindconnect.rocks/

Expected behavior

The node should keep pushing data to MindSphere.


sn0wcat commented 4 years ago

@CMCSiemens this seems to be the same problem as #39.

Could you recreate the docker container if you have one and delete the ./mc/.json file for your node and then re-onboard the agent?

https://opensource.mindsphere.io/docs/node-red-contrib-mindconnect/getting-started.html#tab1anchor5

If you have problems with your agent: Stop the agent. Move or delete the content of the .mc folder (the json files with configuration and authentication settings). Offboard the agent. Create new settings for the mindconnect library. Copy the new settings to the node.

sn0wcat commented 4 years ago

Bug Analysis

A probable cause of this bug was the keep alive functionality. Prior to version 3.7.0, the node wasn't clearing the old interval_id timer after redeployment. This could leave two or more concurrent timers trying to renew the agent token, and sometimes the agent would get tangled in the competing renewals.
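
The timer leak described above can be modelled in a few lines. This is only a sketch under stated assumptions: `countLiveTimers` and `renewToken` are hypothetical names invented for illustration, not part of node-red-contrib-mindconnect, and the 60 second period is arbitrary. It counts how many keep alive timers are still scheduled after several redeployments, with and without clearing the previous one.

```javascript
// Illustrative model only: countLiveTimers and renewToken are hypothetical
// names, not part of node-red-contrib-mindconnect.
function countLiveTimers(deployments, clearOnRedeploy) {
    const live = new Set();                  // timers that are still scheduled
    const node = { interval_id: undefined };
    const renewToken = () => {};             // placeholder for the token renewal

    for (let i = 0; i < deployments; i++) {
        if (clearOnRedeploy && node.interval_id !== undefined) {
            clearInterval(node.interval_id); // what the fix does on redeploy
            live.delete(node.interval_id);
        }
        // Each deployment starts a new keep alive timer and overwrites the id.
        node.interval_id = setInterval(renewToken, 60 * 1000);
        live.add(node.interval_id);
    }

    const count = live.size;
    live.forEach((id) => clearInterval(id)); // tidy up so the process can exit
    return count;
}

console.log(countLiveTimers(3, false)); // 3: every deployment left a timer running
console.log(countLiveTimers(3, true));  // 1: only the latest timer survives
```

Without the clearing step, each redeployment overwrites `interval_id` but leaves the earlier timer running, which is the situation in which several renewals could compete for the same agent token.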

Mitigation for versions before 3.7.0

Restarting Node-RED after deploying a new configuration should mitigate the issue.

Fix in version 3.7.0

Version 3.7.0 clears the timers for async logging and token renewal on close, similar to the function node of Node-RED.

(The fix has been included since alpha version 3.7.0-4.)

this.on("close", () => {
    // stop the keep alive (token renewal) timer
    clearInterval(node.interval_id);
    // stop the async duration timer
    clearInterval(node.await_id);
    node.log("cleared keep alive and async duration interval");
    node.status({});
});
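
To try the handler outside Node-RED, a plain `EventEmitter` can stand in for the node. This is a rough sketch, not the library's code: `createFakeNode`, the `messages` array, `lastStatus`, and the timer periods are assumptions made for illustration; in the real runtime `on`, `log` and `status` are supplied by Node-RED, which raises `close` when a node is redeployed or shut down.

```javascript
const { EventEmitter } = require("events");

// Hypothetical stand-in for a Node-RED node; the real runtime provides
// on(), log() and status() itself.
function createFakeNode() {
    const node = new EventEmitter();
    node.messages = [];                                   // collected log lines
    node.log = (msg) => node.messages.push(msg);
    node.status = (s) => { node.lastStatus = s; };

    // Timer periods are arbitrary values chosen for the sketch.
    node.interval_id = setInterval(() => {}, 60 * 1000);  // keep alive
    node.await_id = setInterval(() => {}, 1000);          // async duration

    node.on("close", () => {
        clearInterval(node.interval_id);
        clearInterval(node.await_id);
        node.log("cleared keep alive and async duration interval");
        node.status({});
    });
    return node;
}

const fakeNode = createFakeNode();
fakeNode.emit("close");          // simulate a redeploy or shutdown
console.log(fakeNode.messages);  // one log line from the close handler
```

Because both timers are cleared in the handler, the script exits immediately after emitting `close`; if the handler is commented out, the process keeps running, which mirrors the stale timers of the older versions.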

Additional features:

Version 3.7.0 adds a new button that can help mitigate agent configuration problems in the future.

@CMCSiemens

mindsphere / node-red-contrib-mindconnect

Error occurred during keep alive #76
