Azure / Industrial-IoT

Azure Industrial IoT Platform
MIT License

OPC UA Heartbeats appear to be lost when large number of nodes are configured #133

Closed jmbrunskill closed 4 years ago

jmbrunskill commented 5 years ago

Describe the bug When a large number of nodes is configured in our OPC Publisher module, heartbeats aren't reliably published for all nodes.

To Reproduce Steps to reproduce the behavior:

  1. Configure a pn.json file with a large number of nodes that don't change values often and all on the same publishing frequency.
  2. Observe lack of publishing heartbeats for some of these nodes
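For anyone trying to reproduce this, here is a minimal sketch that generates such a configuration. The endpoint URL, node names and node count are hypothetical, and the field names (`OpcSamplingInterval`, `OpcPublishingInterval`, `HeartbeatInterval`) are assumed to match the published-nodes schema of the OPC Publisher version in use, so check them against its documentation.

```python
import json


def build_published_nodes(endpoint_url, node_ids, publishing_interval_ms=1000,
                          sampling_interval_ms=1000, heartbeat_interval_s=60):
    """Return a published-nodes structure with identical settings for every node."""
    return [{
        "EndpointUrl": endpoint_url,
        "UseSecurity": False,
        "OpcNodes": [
            {
                "Id": node_id,
                # Intervals are in milliseconds, the heartbeat in seconds (assumed units).
                "OpcSamplingInterval": sampling_interval_ms,
                "OpcPublishingInterval": publishing_interval_ms,
                "HeartbeatInterval": heartbeat_interval_s,
            }
            for node_id in node_ids
        ],
    }]


# Hypothetical endpoint and tag names, purely for illustration.
node_ids = [f"ns=2;s=Plant.Area.SlowTag{i:04d}" for i in range(2000)]
config = build_published_nodes("opc.tcp://opcserver.example:4840", node_ids)
pn_json = json.dumps(config, indent=2)
```

Writing `pn_json` to `pn.json` and pointing the publisher at it gives a large set of rarely changing nodes that all share one publishing interval.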

Expected behavior Heartbeats should be published consistently regardless of the number of nodes published.

Screenshots

image

jmbrunskill commented 5 years ago

Based on the theory that the OPC UA server's change buffer might be filling up before the OPC Publisher has time to read all the values, I just set the sampling interval to 40000 ms and the publishing interval to 1000 ms for the node affected in the screenshot above. This seems to have had the desired effect, with the heartbeat being set up correctly for this node.

In the situation where the very first message for a node is missed, it appears that the heartbeat is not set up properly for that node.

iotedge logs opcpublisher --follow | grep 112d6b4-0e4-3f803
[08:58:31 DBG]    DisplayName: ns=2;s=NZ025_NZS025FTGW01X_T100.T100_Milk_Treatment.NZ025_M1Pas1C01_CIP_SanitiserUsed\112d6b4-0e4-3f803
[08:58:31 DBG] Setting up 60 sec heartbeat for node 'ns=2;s=NZ025_NZS025FTGW01X_T100.T100_Milk_Treatment.NZ025_M1Pas1C01_CIP_SanitiserUsed\112d6b4-0e4-3f803'.
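Checking nodes one at a time with `grep` gets tedious with a large configuration. The sketch below scans publisher log lines for the "Setting up N sec heartbeat" message shown above and lists the configured nodes that never got one. The message format is taken from the log excerpt here and may differ between publisher versions; the sample node names are hypothetical.

```python
import re

# Matches the debug message format seen in the log excerpt above.
HEARTBEAT_RE = re.compile(r"Setting up (\d+) sec heartbeat for node '([^']+)'")


def nodes_with_heartbeat(log_lines):
    """Map node id -> heartbeat interval (seconds) for every setup message found."""
    found = {}
    for line in log_lines:
        match = HEARTBEAT_RE.search(line)
        if match:
            found[match.group(2)] = int(match.group(1))
    return found


def nodes_missing_heartbeat(configured_ids, log_lines):
    """Return configured node ids that have no heartbeat setup message in the log."""
    seen = nodes_with_heartbeat(log_lines)
    return [node_id for node_id in configured_ids if node_id not in seen]


# Hypothetical sample: TagA got a heartbeat, TagB only shows up as a DisplayName.
sample_log = [
    "[08:58:31 DBG]    DisplayName: ns=2;s=Demo.TagB",
    "[08:58:31 DBG] Setting up 60 sec heartbeat for node 'ns=2;s=Demo.TagA'.",
]
missing = nodes_missing_heartbeat(["ns=2;s=Demo.TagA", "ns=2;s=Demo.TagB"], sample_log)
```

In practice the log lines could come from the output of `iotedge logs opcpublisher` saved to a file, and the configured ids from the `pn.json` in use.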

image

To me it would be ideal if the heartbeat process triggered a 're-read' of the latest value from the OPC UA server. That way it could re-validate that the data is readable from the OPC UA server rather than simply relying on a stale subscription state.

That being said, I think it might be appropriate to have a smaller publishing interval than sampling interval in our environment. Is there a minimum interval we should consider when specifying a publishing interval?
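To make the re-read suggestion above concrete, here is a small simulation contrasting the two approaches. It does not use any OPC UA library and is not how OPC Publisher is implemented; the class names, the `read_value` callback and the node id are all hypothetical stand-ins.

```python
from typing import Callable, Dict, Optional


class CachedHeartbeat:
    """Resends the last value delivered by the subscription, if any arrived."""

    def __init__(self) -> None:
        self.last_value: Dict[str, object] = {}

    def on_data_change(self, node_id: str, value: object) -> None:
        # Called whenever the subscription delivers a notification.
        self.last_value[node_id] = value

    def tick(self, node_id: str) -> Optional[object]:
        # Nothing to send if the first notification was never received.
        return self.last_value.get(node_id)


class RereadHeartbeat:
    """Reads the current value from the server on every heartbeat tick."""

    def __init__(self, read_value: Callable[[str], object]) -> None:
        # read_value stands in for a real OPC UA read call (hypothetical).
        self.read_value = read_value

    def tick(self, node_id: str) -> object:
        return self.read_value(node_id)


# Simulated server state; the first data-change notification is assumed lost.
server_values = {"ns=2;s=Demo.SlowTag": 17}
node = "ns=2;s=Demo.SlowTag"

cached_result = CachedHeartbeat().tick(node)
reread_result = RereadHeartbeat(server_values.__getitem__).tick(node)
```

With the first notification lost, the cached variant has nothing to publish, while the re-read variant still returns the server's current value and would also surface a read failure if the node became unreadable.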

markusstuhler commented 4 years ago

Hi @jmbrunskill

We will fix it as described here: https://github.com/Azure/Industrial-IoT/issues/135 The current implementation does not trigger a "re-read", so you're right that if the first message is missed for some reason, heartbeats cannot be sent for the affected node(s). We will soon offer an updated implementation of the heartbeat mechanism that uses OPC UA server capabilities to generate heartbeats.

We'll close this issue as we use the other issue mentioned above to track.

Thanks!