NubeIO / driver-bacnet


No response from target device when device is accessible #121

Closed RaiBnod closed 2 weeks ago

RaiBnod commented 3 weeks ago

Version of driver-bacnet: v1.0.0-rc.5

1. See the MQTT request/response with errors

https://github.com/NubeIO/driver-bacnet/assets/6800775/187e41a2-fcd1-44b4-adf7-46fb4db9d07d

2. Mosquitto restart didn't help

https://github.com/NubeIO/driver-bacnet/assets/6800775/62ed8858-31d6-4b18-9fb8-c68e18e08163

3. driver-bacnet restart did the trick

https://github.com/NubeIO/driver-bacnet/assets/6800775/5fadef92-6467-4eb9-b01e-de84d0146a83

shomaglasang commented 3 weeks ago

Thanks @RaiBnod. Can you try reading any object/properties from the target device, or from another device as well, to see if you get the same "no response" error? And one last question: how long after restarting driver-bacnet did you encounter the "no response" error?

RaiBnod commented 3 weeks ago

As it was the client's device, I can't run commands on it.

Shiny380 commented 3 weeks ago

@RaiBnod do you have some ideas on how to replicate this? Like disconnecting the slave device several times or for some period of time? Or is there any indication that the slave device's connection is very intermittent?

NubeDev commented 3 weeks ago

@shomaglasang

Do you have an auto reconnect loop if you lose connection to the broker?

For example, if the broker is disabled for 5 minutes and then restarted, do you have logic to handle this?
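To make the question concrete, here is a minimal sketch of the kind of reconnect loop meant here, with exponential backoff. It is not taken from driver-bacnet: `reconnect_with_backoff`, the `connect` callable, and the delay values are all hypothetical names and numbers chosen for illustration.

```python
from typing import Callable, Optional


def reconnect_with_backoff(
    connect: Callable[[], bool],          # hypothetical: returns True once the broker accepts us
    sleep: Callable[[float], None],       # injected so the loop can be tested without waiting
    initial_delay_s: float = 1.0,         # assumed starting delay
    max_delay_s: float = 60.0,            # assumed cap on the delay
    max_attempts: Optional[int] = None,   # None means retry forever
) -> int:
    """Call connect() until it succeeds; return the number of attempts made."""
    delay = initial_delay_s
    attempts = 0
    while True:
        attempts += 1
        if connect():
            return attempts
        if max_attempts is not None and attempts >= max_attempts:
            raise ConnectionError(f"broker still unreachable after {attempts} attempts")
        sleep(delay)
        # Double the wait after each failure, but never exceed the cap.
        delay = min(delay * 2, max_delay_s)
```

In a real service `sleep` would be `time.sleep` and `connect` would wrap whatever MQTT client is in use; with a 60-second cap, a broker that is down for 5 minutes is picked up again within a minute of coming back.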

RaiBnod commented 3 weeks ago

> @RaiBnod do you have some ideas on how to replicate this? Like disconnecting the slave device several times or for some period of time? Or is there any indication that the slave device's connection is very intermittent?

If the slave device experiences intermittent connection issues, communication with it stops working, but it starts working again immediately after a driver-bacnet restart. This suggests that something is causing the connections to be held up when we communicate with the slave device. For instance, if we are creating a new connection each time and the number of connections grows too large, this issue might occur.

As for replicating the issue, I’ve been trying to reproduce it with our office device but haven’t been successful so far. However, Michael suggested that it might occur when he frequently switches between discovering points.

> @shomaglasang
>
> Do you have an auto reconnect loop if you lose connection to the broker?
>
> For example, if the broker is disabled for 5 minutes and then restarted, do you have logic to handle this?

It's responding with "No response from target device" over MQTT (the who API is also returning values), which strongly suggests that this is not an MQTT issue. Please confirm this from your side, @shomaglasang.

shomaglasang commented 3 weeks ago

@NubeDev @RaiBnod yeah, it reconnects when the MQTT connection goes down. But in this case the MQTT connection is good, because the driver is able to read messages over MQTT and respond with the "no response" error. The problem looks to be in the bacnet protocol: the driver didn't receive any reply from the specific MAC. It looks like its internal routing engine is messed up. I'm checking this on 192.168.15.17 and in my devel env, but I need to replicate it first.

RaiBnod commented 3 weeks ago

@Ronnel I’ve been able to replicate it on our office device (192.168.15.17, video sent in Slack):

After leaving it running for a while (about 1 hour), it starts responding with "No response from target device".

Impression: When we request values from offline devices constantly, after a certain period it shows those issues.

shomaglasang commented 3 weeks ago

@NubeDev @RaiBnod The fix is included in v1.0.1. The bug was caused by un-freed transactions. driver-bacnet supports a pool of up to 255 transactions. When a bacnet message (e.g. a read or write request) is sent out to a device, one transaction is taken from the pool and returned (freed) when the device replies. The problem is when there is no reply: the transaction then stays in use indefinitely, and the pool is eventually depleted. Once it is depleted, no more messages can be sent until driver-bacnet is restarted. I fixed this by freeing the transaction when the request times out.
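For anyone who wants to see the failure mode in isolation, here is a rough simulation of the pool behaviour described above. It is only a sketch, not the driver's code: `TransactionPool`, `send_to_offline_device`, the 3-second timeout, and the one-request-per-second rate are all made up for illustration. Only the pool size of 255 comes from this thread.

```python
from typing import Dict, Optional


class TransactionPool:
    """Fixed-size pool of transaction IDs (illustrative, not the driver's code)."""

    def __init__(self, size: int = 255, timeout_s: float = 3.0,
                 free_on_timeout: bool = True) -> None:
        self.size = size
        self.timeout_s = timeout_s              # assumed request timeout
        self.free_on_timeout = free_on_timeout  # False reproduces the old behaviour
        self._in_use: Dict[int, float] = {}     # transaction ID -> deadline

    def acquire(self, now: float) -> Optional[int]:
        """Take a free transaction ID, or return None if the pool is depleted."""
        if self.free_on_timeout:
            self.free_expired(now)
        for tid in range(1, self.size + 1):
            if tid not in self._in_use:
                self._in_use[tid] = now + self.timeout_s
                return tid
        return None

    def release(self, tid: int) -> None:
        """Return a transaction to the pool when the device replies."""
        self._in_use.pop(tid, None)

    def free_expired(self, now: float) -> int:
        """Free every transaction whose request has timed out; return the count."""
        expired = [tid for tid, deadline in self._in_use.items() if deadline <= now]
        for tid in expired:
            del self._in_use[tid]
        return len(expired)

    def in_use(self) -> int:
        return len(self._in_use)


def send_to_offline_device(pool: TransactionPool, requests: int,
                           interval_s: float = 1.0) -> int:
    """Send requests that never get a reply; return how many were refused."""
    refused = 0
    for i in range(requests):
        if pool.acquire(now=i * interval_s) is None:
            refused += 1
    return refused


leaky = TransactionPool(free_on_timeout=False)
fixed = TransactionPool(free_on_timeout=True)
print(send_to_offline_device(leaky, 300), leaky.in_use())
print(send_to_offline_device(fixed, 300), fixed.in_use())
```

With `free_on_timeout=False`, every unanswered request keeps its transaction, so the pool fills after 255 requests and everything after that is refused until the pool object is recreated, which is the equivalent of restarting the driver. With `free_on_timeout=True`, timed-out transactions are reclaimed and the pool never runs out at this request rate.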

shomaglasang commented 3 weeks ago

Will keep this open for a few days while v1.0.1 runs on a couple of devices in production.

shomaglasang commented 2 weeks ago

Closing this now after a few days of testing the fix on production devices.