xZetsubou / hass-localtuya

🔹 A Home Assistant integration to handle Tuya devices locally "fork from localtuya"
https://xzetsubou.github.io/hass-localtuya/
GNU General Public License v3.0
284 stars 30 forks source link

[Bug]: False alarms about sub-device disconnected from a gateway #272

Open Lurker00 opened 2 weeks ago

Lurker00 commented 2 weeks ago

LocalTuya Version

2024.6.0

Home Assistant Version

2024.6.1

Environment

What happened?

In #253 I asked for a warning when a sub-device was detected offline. You've implemented it a different way, whcih leads to false alarms.

Consider the first block of log records ("Disconnected" and "Disconnected gateway" are produced from def disconnected(self), also some debug messages I changed to info, and added "Sub-devices heartbeat started"). There are no messages from this code piece: https://github.com/xZetsubou/hass-localtuya/blob/50e56ff87ead0477dd438e16084b21aa4627e85c/custom_components/localtuya/core/pytuya/__init__.py#L846-L849 (in my mod, it is info rather than warning). Obviously, it was just a disconnect that happens time to time, and the connection was restored in 5 seconds. See the second block how it looks like when the gateway reports devices offline.

So, I'd ask you to return to #253 and implement it the way to produce the warning only when the gateway lists a sub-device offline (line 848 above). Warnings produced here: https://github.com/xZetsubou/hass-localtuya/blob/d48ee13cebbeaef14b9f9914f48140ef985795ac/custom_components/localtuya/coordinator.py#L439-L442 don't work as they should.

E.g. it can be implemented as an optional second parameter to def disconnected(self) method, with a special value to call from async def _action() to produce such a warning. Or, is it possible to call there from device rather than self:

 for cid, device in listener.sub_devices.items(): 
     if cid not in on_devs: 
         device.warning(f"Sub-device disconnected: {cid}") 
         device.disconnected() 

somehow? The sub_devices dictionary contains TuyaListener objects, but here they are, actually, also ContextualLogger objects. I'm still not a Python expert.

Steps to reproduce.

Don't know

Relevant log output

2024-06-09 12:02:05.217 INFO (MainThread) [custom_components.localtuya.coordinator] [bf8...m91 - К: T&H Сенсор] Disconnected gateway
2024-06-09 12:02:05.217 INFO (MainThread) [custom_components.localtuya.coordinator] [bf8...m91 - К: T&H Сенсор] Disconnected
2024-06-09 12:02:05.217 INFO (MainThread) [custom_components.localtuya.coordinator] [bf5...daj - КЛ1: LED BULB B5K] Disconnected
2024-06-09 12:02:05.218 INFO (MainThread) [custom_components.localtuya.coordinator] [bff...mj7 - КЛ2: LED BULB B5K] Disconnected
2024-06-09 12:02:05.218 INFO (MainThread) [custom_components.localtuya.coordinator] [bff...jl9 - КЛ3: LED BULB B5K] Disconnected
2024-06-09 12:02:05.225 WARNING (SyncWorker_41) [custom_components.localtuya.coordinator] [bf8...m91 - К: T&H Сенсор] Sub-device disconnected from: 192.168.0.33
2024-06-09 12:02:05.226 WARNING (SyncWorker_41) [custom_components.localtuya.coordinator] [bf5...daj - КЛ1: LED BULB B5K] Sub-device disconnected from: 192.168.0.33
2024-06-09 12:02:05.227 WARNING (SyncWorker_11) [custom_components.localtuya.coordinator] [bff...mj7 - КЛ2: LED BULB B5K] Sub-device disconnected from: 192.168.0.33
2024-06-09 12:02:05.228 WARNING (SyncWorker_20) [custom_components.localtuya.coordinator] [bff...jl9 - КЛ3: LED BULB B5K] Sub-device disconnected from: 192.168.0.33
2024-06-09 12:02:08.222 WARNING (SyncWorker_47) [custom_components.localtuya.coordinator] [bf8...m91 - К: T&H Сенсор] Disconnected: waiting for discovery broadcast
2024-06-09 12:02:09.879 INFO (MainThread) [custom_components.localtuya.coordinator] [bf8...m91 - К: T&H Сенсор] Success: connected to: 192.168.0.33
2024-06-09 12:02:09.880 INFO (MainThread) [custom_components.localtuya.coordinator] [bf8...m91 - К: T&H Сенсор] Sub-devices heartbeat started 192.168.0.33
2024-06-09 12:02:09.899 INFO (MainThread) [custom_components.localtuya.coordinator] [bf8...m91 - К: T&H Сенсор] Success: connected to: 192.168.0.33
2024-06-09 12:02:10.082 INFO (MainThread) [custom_components.localtuya.coordinator] [bf5...daj - КЛ1: LED BULB B5K] Success: connected to: 192.168.0.33
2024-06-09 12:02:10.086 INFO (MainThread) [custom_components.localtuya.coordinator] [bff...mj7 - КЛ2: LED BULB B5K] Success: connected to: 192.168.0.33
2024-06-09 12:02:10.097 INFO (MainThread) [custom_components.localtuya.coordinator] [bff...jl9 - КЛ3: LED BULB B5K] Success: connected to: 192.168.0.33

2024-06-09 16:54:45.460 INFO (MainThread) [custom_components.localtuya.core.pytuya] [bf5...whd - КЛ: Zigbee Bulb] Sub-device disconnected: a4c1389aeca68fbf
2024-06-09 16:54:45.461 INFO (MainThread) [custom_components.localtuya.coordinator] [bf5...whd - КЛ: Zigbee Bulb] Disconnected
2024-06-09 16:54:45.462 INFO (MainThread) [custom_components.localtuya.core.pytuya] [bf5...whd - КЛ: Zigbee Bulb] Sub-device disconnected: a4c138f46f838ee0
2024-06-09 16:54:45.462 INFO (MainThread) [custom_components.localtuya.coordinator] [bfe...7yz - ГС: Motion Sensor] Disconnected
2024-06-09 16:54:45.463 INFO (MainThread) [custom_components.localtuya.core.pytuya] [bf5...whd - КЛ: Zigbee Bulb] Sub-device disconnected: a4c138a112e22ebe
2024-06-09 16:54:45.463 INFO (MainThread) [custom_components.localtuya.coordinator] [bfa...hoi - К: 3x Переключатель] Disconnected
2024-06-09 16:54:45.464 INFO (MainThread) [custom_components.localtuya.core.pytuya] [bf5...whd - КЛ: Zigbee Bulb] Sub-device disconnected: a4c1380f8554dc9d
2024-06-09 16:54:45.465 INFO (MainThread) [custom_components.localtuya.coordinator] [bf0...tdq - ГС: Выключатель] Disconnected
2024-06-09 16:54:45.465 INFO (MainThread) [custom_components.localtuya.core.pytuya] [bf5...whd - КЛ: Zigbee Bulb] Sub-device disconnected: a4c1384bb9cd4e1d
2024-06-09 16:54:45.466 INFO (MainThread) [custom_components.localtuya.coordinator] [bf1...xw5 - К: Выключатель] Disconnected
2024-06-09 16:54:45.468 INFO (MainThread) [custom_components.localtuya.core.pytuya] [bf5...whd - КЛ: Zigbee Bulb] Sub-device disconnected: a4c138717b6a9928
2024-06-09 16:54:45.468 INFO (MainThread) [custom_components.localtuya.coordinator] [bf0...hwo - К: M&L Сенсор] Disconnected
2024-06-09 16:54:45.469 INFO (MainThread) [custom_components.localtuya.core.pytuya] [bf5...whd - КЛ: Zigbee Bulb] Sub-device disconnected: 00124b00272127b7
2024-06-09 16:54:45.470 INFO (MainThread) [custom_components.localtuya.coordinator] [bf8...mlc - К: Роутер] Disconnected

Diagnostics information.

No response

Lurker00 commented 2 weeks ago

It seems it should work:

                for cid, device in listener.sub_devices.items():
                    if cid not in on_devs:
                        if isinstance(device, ContextualLogger):
                            contextualLogger: ContextualLogger = device
                            contextualLogger.warning(f"Sub-device is offline: {cid}")
                        else:
                            self.warning(f"Sub-device disconnected: {cid}")
                        device.disconnected()

but I didn't try yet (not at home). The else: should never happen, though.

Lurker00 commented 2 weeks ago

But a parameter "offline" seems a better solution, because it makes possible here: https://github.com/xZetsubou/hass-localtuya/blob/d48ee13cebbeaef14b9f9914f48140ef985795ac/custom_components/localtuya/coordinator.py#L480-L483 change to something like this:

    def disconnected(self, offline=False):
...
            delay = (0 if offline else 3) + sleep_time

and call in _action()

         device.disconnected(offline=True)
xZetsubou commented 2 weeks ago

The "ContextualLogger" is Sub Class inside Listener Class so isinstance(device, ContextualLogger) will always be true.

What do you exactly mean by false alarms can you explain more? does it disconnect a device that is online? can you also put the logs comes from _action() "Sub-Devices States Update ..."

Lurker00 commented 2 weeks ago

What do you exactly mean by false alarms

It produces a warning when the device is not reported as offline by the gateway. The feature request #253 was to warn about a sub-device went offline as reported by its gateway. But, you see, currently, disconnected() is called from different places, without any knowledge about disconnect reason, while disconnects may happen even when all the devices are online.

can you also put the logs comes from _action() "Sub-Devices States Update ..."

That's what I've already done in the second log lines block, after 2 empty lines. They are from another case, when _action() detected offline state. The first block does not contain lines from _action(), because there was no an offline report from the gateway.

Lurker00 commented 2 weeks ago

Probably, I need to explain:

When a sub-device is reported offline by a gateway, it usualy means dead battery or weak signal (Zigbee or Bluetooth). It requires an action from the user.

Devices disconnect for many reasons, and usualy re-connect is successful in some seconds, or even fraction of a second. Only if the subsequent reconnect was not successful (_shutdown_entities was called), it worth to pay attention. But, with the current implementation, the warning was produced with subsequent successful reconnect.

Lurker00 commented 2 weeks ago

Not only delay calculation, but also, the line https://github.com/xZetsubou/hass-localtuya/blob/d48ee13cebbeaef14b9f9914f48140ef985795ac/custom_components/localtuya/coordinator.py#L476 should be changed to

        if not self._is_closing and not offline and not self._quick_retry:

Instead, the attempt to reconnect may happen only after 5 seconds (RECONNECT_INTERVAL). Before 2024.6.0, I saw successful reconnects in less than a second.

Lurker00 commented 2 weeks ago

Here is how it works with the suggested changes (in https://github.com/xZetsubou/hass-localtuya/issues/272#issuecomment-2158247518 and https://github.com/xZetsubou/hass-localtuya/issues/272#issuecomment-2158551629):

Log ``` 2024-06-10 22:21:00.695 INFO (MainThread) [custom_components.localtuya.coordinator] [bff...hco - ВС: Выключатель] Disconnected as gateway 2024-06-10 22:21:00.695 INFO (MainThread) [custom_components.localtuya.coordinator] [bff...hco - ВС: Выключатель] Disconnected 2024-06-10 22:21:00.697 INFO (MainThread) [custom_components.localtuya.coordinator] [bf0...f1a - ВС: T&H Сенсор] Disconnected 2024-06-10 22:21:00.698 INFO (MainThread) [custom_components.localtuya.coordinator] [bf4...huv - ВС: Zigbee Bulb] Disconnected 2024-06-10 22:21:00.698 INFO (MainThread) [custom_components.localtuya.coordinator] [bf9...5wg - ВС: Чёрная кнопка] Disconnected 2024-06-10 22:21:00.699 INFO (MainThread) [custom_components.localtuya.coordinator] [bfe...edr - ВС: Реле 4х] Disconnected 2024-06-10 22:21:00.701 INFO (MainThread) [custom_components.localtuya.coordinator] [bf4...l4c - ВЛ1: Beacon RGBCW] Disconnected 2024-06-10 22:21:00.703 INFO (MainThread) [custom_components.localtuya.coordinator] [bf4...sqd - ВЛ2: Beacon RGBCW] Disconnected 2024-06-10 22:21:00.707 INFO (MainThread) [custom_components.localtuya.coordinator] [bf6...hwj - ВЛ3: LED BULB B509Z2] Disconnected 2024-06-10 22:21:00.918 INFO (MainThread) [custom_components.localtuya.coordinator] [bff...hco - ВС: Выключатель] Connected as gateway to: 192.168.0.32 2024-06-10 22:21:00.934 INFO (MainThread) [custom_components.localtuya.coordinator] [bff...hco - ВС: Выключатель] Sub-devices heartbeat started 192.168.0.32 2024-06-10 22:21:01.081 INFO (MainThread) [custom_components.localtuya.coordinator] [bff...hco - ВС: Выключатель] Success: connected to: 192.168.0.32 2024-06-10 22:21:01.099 INFO (MainThread) [custom_components.localtuya.coordinator] [bf0...f1a - ВС: T&H Сенсор] Success: connected to: 192.168.0.32 2024-06-10 22:21:01.105 INFO (MainThread) [custom_components.localtuya.coordinator] [bf4...huv - ВС: Zigbee Bulb] Success: connected to: 192.168.0.32 2024-06-10 22:21:01.119 INFO (MainThread) [custom_components.localtuya.coordinator] [bf9...5wg - ВС: Чёрная кнопка] Success: connected to: 192.168.0.32 2024-06-10 22:21:01.123 INFO (MainThread) [custom_components.localtuya.coordinator] [bfe...edr - ВС: Реле 4х] Success: connected to: 192.168.0.32 2024-06-10 22:21:01.128 INFO (MainThread) [custom_components.localtuya.coordinator] [bf4...l4c - ВЛ1: Beacon RGBCW] Success: connected to: 192.168.0.32 2024-06-10 22:21:01.134 INFO (MainThread) [custom_components.localtuya.coordinator] [bf4...sqd - ВЛ2: Beacon RGBCW] Success: connected to: 192.168.0.32 2024-06-10 22:21:01.211 INFO (MainThread) [custom_components.localtuya.coordinator] [bf6...hwj - ВЛ3: LED BULB B509Z2] Success: connected to: 192.168.0.32 ```

Reconnect in < 300 ms. _shutdown_entities wasn't called. No a warning meaning sub-device went offline.

coordinator.py/def disconnected(self, offline): ```python @callback def disconnected(self, offline): """Device disconnected.""" sleep_time = self._device_config.sleep_time self._interface = None if self.sub_devices: self.warning(f"Disconnected as gateway{", offline" if offline else ""}") for sub_dev in self.sub_devices.values(): sub_dev.disconnected(offline=False) elif offline: self.warning(f"Disconnected, offline") else: self.info(f"Disconnected") if self._connect_task is not None: self._connect_task.cancel() self._connect_task = None # If it disconnects unexpectedly. if not self._is_closing and not self.is_subdevice and not self._quick_retry: self._quick_retry = True asyncio.run_coroutine_threadsafe(self.async_connect(), self._hass.loop) if not self._is_closing: delay = (0 if offline else 3) + sleep_time self._unsub_on_close.append( async_call_later(self._hass, delay, self._shutdown_entities) ) ```
xZetsubou commented 2 weeks ago

So in short, there is no issue with the warnings it just not specified if it's because it's offline or not? that's is it?

Lurker00 commented 2 weeks ago

Not so easy.

For me, it seems the whole logic of handling different reasons of disconnects (sub-device went offline, connection was lost for whatever reason, or LocalTuya closes the connection itself for whatever reason) is broken. _shutdown_entities shall not be called at all when quick retry is possible. Also, the warning in _shutdown_entities produces too many useless messages when the connection with a gateway can't be re-established via quick retry.

I insist that the warning about a sub-device went offilne shall be issued in the place of the event: in _action.

Also, I believe that quick retry attempt (both delay calculation and call to self.async_connect() in disconnected()) shall be made with a knowledge of the disconnect reason. My current implementation can be incorrect, but it works better than your's.

My current code with warnings produced in disconnected is good for my current tests, to receive Telegram notifications about such events without additional automation involved. It is not required in the release version.

xZetsubou commented 2 weeks ago

tbh I don't prefer sending the warnings from the core module pytuya rather then doing that we can just add exec arg for disconnected as and pass the reason within it. I also don't like warnings at all. The core shouldn't act as a class for Home Assistant.

If _shutdown_entities haven't been called and the quick retry fails then this will only break the entities as it shows that it's available but it's not. this what is the delay for as it will shut them down if the device doesn't reconnected within 3 seconds after the quick retry.

Also the reconnect task I'm going to actually refactor it and the should be inside the TuyaDevice object. and in LocalTuya HASS Object.

I'll make the changes within a few days.

Lurker00 commented 2 weeks ago

The core shouldn't act as a class for Home Assistant

Thank you for explanation! Now I understand your point.

I'm going to actually refactor

Thanks a lot for your efforts! LocalTuya is already much better than a couple of months ago, when I started to use it.