thingsboard / thingsboard-gateway

Open-source IoT Gateway - integrates devices connected to legacy and third-party systems with ThingsBoard IoT Platform using Modbus, CAN bus, BACnet, BLE, OPC-UA, MQTT, ODBC and REST protocols
https://thingsboard.io/docs/iot-gateway/what-is-iot-gateway/
Apache License 2.0

[BUG] Ever increasing data latency when connecting to multiple OPC-UA servers #992

Closed AndreSensaway closed 1 year ago

AndreSensaway commented 1 year ago

In my setup, the ThingsBoard gateway connects to 12 separate OPC-UA servers (Siemens S7-1200), monitoring around 150 tags on each.

In the tb_gateway_service.py file, the self._event_storage queue was continuously increasing in size, and I was receiving data on the ThingsBoard application side with an ever-increasing delay.

I came to the conclusion that the section of code that processes the self._published_events queue was taking too long because it processes one event at a time.

I made the following code change to the tb_gateway_service.py that solved my problem. From this:

https://github.com/thingsboard/thingsboard-gateway/blob/114a33dff709e94b52e96e74d2b7f15ef9f73501/thingsboard_gateway/gateway/tb_gateway_service.py#L845-L865

To this:

while not self._published_events.empty():
    if (self.__remote_configurator is not None and self.__remote_configurator.in_process) or \
            not self.tb_client.is_connected() or \
            self._published_events.empty() or \
            self.__rpc_reply_sent:
        success = False
        break

    # "event" variable became "events" list (at most 50 items, fetched without blocking)
    events = [self._published_events.get_nowait() for _ in range(min(50, self._published_events.qsize()))]

    # added this for loop
    for event in events: 
        try:
            if self.tb_client.is_connected() and (
                    self.__remote_configurator is None or not self.__remote_configurator.in_process):
                if self.tb_client.client.quality_of_service == 1:
                    success = event.get() == event.TB_ERR_SUCCESS
                else:
                    success = True
            else:
                break

        except Exception as e:
            log.exception(e)
            success = False
    sleep(0.2)

This solved my latency issues, and all data is now pushed to the ThingsBoard application in real time. Is this something that can be fixed in a new release? (I'm using the gateway 3.1 release, but I checked the 3.2 release and it seems to have the same problem.)

I'm also open to suggestions for a better fix! Thanks.


samson0v commented 1 year ago

Hi @AndreSensaway, thanks for your interest in ThingsBoard IoT Gateway! Thanks for your solution, but the constant 50 used when generating the list looks strange (what if the queue does not hold 50 packs?).

I suggest changing the delay in the sleep call in every while loop from .2 to self.__min_pack_send_delay_ms. You can configure self.__min_pack_send_delay_ms in the thingsboard section of the tb_gateway.yaml file (minPackSendDelayMS: 200). So simply decrease this value to 50 or 100 ms, but note that doing so increases CPU usage.
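As a rough illustration of this suggestion, the sketch below converts a millisecond setting into the seconds that Python's time.sleep expects. The helper name pack_send_delay_seconds, the plain dict standing in for the parsed thingsboard section, and the fallback of 200 ms (the value shown above) are assumptions for illustration, not the gateway's actual code.

```python
from time import sleep

# Assumed fallback, taken from the example value quoted in this thread.
DEFAULT_MIN_PACK_SEND_DELAY_MS = 200


def pack_send_delay_seconds(thingsboard_config: dict) -> float:
    """Return the send-loop delay in seconds.

    `thingsboard_config` is a hypothetical dict standing in for the parsed
    `thingsboard` section of tb_gateway.yaml.
    """
    delay_ms = thingsboard_config.get("minPackSendDelayMS",
                                      DEFAULT_MIN_PACK_SEND_DELAY_MS)
    # time.sleep takes seconds, so convert from milliseconds; clamp at zero.
    return max(float(delay_ms), 0.0) / 1000.0


delay = pack_send_delay_seconds({"minPackSendDelayMS": 50})
sleep(delay)  # pauses for roughly 50 ms between iterations
```

A shorter delay means the loop wakes up more often, which is where the extra CPU usage comes from.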

AndreSensaway commented 1 year ago

Thanks for your answer.

The constant 50 used in generating the events list limits the number of elements retrieved from the queue each time. It gets either all elements available in the queue or, if the queue size is larger than 50, the first 50 elements. (This value was chosen because it suits my setup; ideally it should be parameterized.)
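A parameterized version of that batching step could look like the sketch below. It uses only the standard queue module; the function name drain_batch and the max_batch parameter are hypothetical and not part of the gateway. Unlike the qsize()-based list comprehension, it stops cleanly if the queue runs empty mid-batch.

```python
import queue


def drain_batch(q: queue.Queue, max_batch: int = 50) -> list:
    """Remove and return up to `max_batch` items from `q` without blocking."""
    batch = []
    while len(batch) < max_batch:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break  # fewer items than max_batch were available
    return batch


q = queue.Queue()
for i in range(120):
    q.put(i)

sizes = []
while not q.empty():
    sizes.append(len(drain_batch(q, max_batch=50)))
print(sizes)  # [50, 50, 20]
```

Catching queue.Empty matters when more than one thread reads the queue, because qsize() is only approximate in that case.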

I've tried changing the sleep time, as you mentioned, but it still doesn't completely fix the problem. For example, if I have 100 elements in the self._published_events queue and I change the sleep time to 50 ms, it will still take more than 5 seconds to empty the queue, and during this time the queue is filling up with other events. This section of the code is a bottleneck, especially during startup, when all tags from the OPC-UA servers are updated at once.
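The arithmetic behind that estimate can be checked with a small sketch. It counts only the sleep between loop iterations and ignores the time spent publishing, so the results are lower bounds; the function name drain_time_seconds is hypothetical.

```python
import math


def drain_time_seconds(n_events: int, sleep_s: float, batch_size: int = 1) -> float:
    """Lower bound on the time to empty a queue of `n_events` items when the
    loop sleeps `sleep_s` seconds after handling `batch_size` events."""
    iterations = math.ceil(n_events / batch_size)
    return iterations * sleep_s


# One event per iteration, 50 ms sleep: 100 iterations, about 5 s.
one_at_a_time = drain_time_seconds(100, 0.05)

# Batches of 50, original 200 ms sleep: 2 iterations, about 0.4 s.
batched = drain_time_seconds(100, 0.2, batch_size=50)

print(one_at_a_time, batched)
```

Under these assumptions, batching shortens the drain time even with the longer sleep, because the number of iterations drops from 100 to 2.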

I don't know whether this solution will have unintended consequences for the gateway overall, so I understand if it is not adopted in future releases. In my use case, each connected OPC-UA server has hundreds of tags that go to a couple dozen devices in the ThingsBoard application, and this solution worked well for me.

Thanks.

samson0v commented 1 year ago

@AndreSensaway OK, we will add this parameter.