zigpy / zha-device-handlers

ZHA device handlers bridge the functionality gap created when manufacturers deviate from the ZCL specification, handling deviations and exceptions by parsing custom messages to and from Zigbee devices.
Apache License 2.0
734 stars 673 forks source link

[BUG] Since 2023.2.0, Linkind motion sensors reset during motion #2225

Closed jasongabler closed 9 months ago

jasongabler commented 1 year ago

Describe the bug My Linkind PIR Zigbee motion sensors stopped working as expected since 2023.2.0. HA will no longer recognize continued motion from these sensors and seems generally confused until I at least wait for the sensor to go idle. I have concluded that the sensors are not at fault because, when I down grade to 2023.1.7, they perform as expected. I use ZHA with a Conbee II. I have a 2 year old Zigbee mesh that has worked flawlessly until now.

It was recommended I post here from the following official FB group post: https://www.facebook.com/groups/HomeAssistant/permalink/3449772355294110/ Note that the post comments have a number of concerns about ZHA since HA 2023.2.0 was released.

Steps to reproduce the behavior: Using a simple, native HA automation that has worked flawlessly (Lights on with motion; Lights off after 2min of idle)

  1. Walk into sensor zone, lights turn on
  2. Keep moving in sensors zone for over 2min, lights will turn off. I've tested this up to 4 minutes
  3. Leave zone for 1min, walking back into zone and it may turn on. How long I have to wait for HA to recognize idle seems to vary

Again, this not a matter of poor automation configuration. The same, exact automations are in pay for 2023.1.7 and 2023.2.0. I also watch the motion entity and it fails to detect idle or motion according to the above experience with the automation.

Expected behavior

Screenshots N/A

I don't really have much debug details because I really don't even know how to begin debugging this. I will update with more information as I continue to test and investigate. Device signature The only Zigbee related info I see on the device is as follows: Zigbee info IEEE: cc:86:ec:ff:fe:cd:54:35 Nwk: 0x4e5b Device Type: EndDevice LQI: 127 RSSI: -78 Last Seen: 2023-02-25T15:04:06 Power Source: Battery or Unknown ```yaml Paste the device signature here. Don't remove the extra line breaks outside the ``` marks. ```
Diagnostic information ```yaml Paste the diagnostic information here. Don't remove the extra line breaks outside the ``` marks. ```
Additional logs ``` Paste any additional debug logs here. Don't remove the extra line breaks outside the ``` marks. ```

Additional context Add any other context about the problem here.

TheJulianJES commented 1 year ago

Please provide the device signature and diagnostic information for the device. You'll find them if you go to the device page of the motion sensor, then click the three dots on the left. You can download the diagnostic file there and attach it here.

(It'll also include the signature but you can find just that part in the same menu by clicking "Manage Zigbee device" -> "Signature" tab)

The only change I see is that support was added for some kind of Linkind motion sensor by a community member: https://github.com/zigpy/zha-device-handlers/pull/2130 It's possible that this change interferes with your sensors that might be running a different firmware (or similar).

We can check that if you provide signature/diagnostic information. For now, you can download that from the working 2023.1.7 version.

jasongabler commented 1 year ago

Please provide the device signature and diagnostic information for the device. ... The only change I see is that support was added for some kind of Linkind motion sensor by a community member: #2130 It's possible that this change interferes with your sensors that might be running a different firmware (or similar).

We can check that if you provide signature/diagnostic information. For now, you can download that from the working 2023.1.7 version.

As for change #2130, I do know that Linkind has recently changed their firmware which made (at least) their PIR devices non-integratabtle with HA (or at least ZHA). If you go the Amazon ratings page for these devices, starting around 9/2022 Home Assistant users began reporting that the PIR sensors with new firmware no longer worked well with HA. As my devices are considerably older and I have never used the Linkind hub, I have not had these issues. But it would make sense that someone with only the newer firmware would submit a fix that unwittingly had a bad effect on the older firmware'd devices.

Within "Manage Zigbee Device" -> "SIGNTURE" tab:

{
  "node_descriptor": "NodeDescriptor(logical_type=<LogicalType.EndDevice: 2>, complex_descriptor_available=0, user_descriptor_available=0, reserved=0, aps_flags=0, frequency_band=<FrequencyBand.Freq2400MHz: 8>, mac_capability_flags=<MACCapabilityFlags.AllocateAddress: 128>, manufacturer_code=4456, maximum_buffer_size=82, maximum_incoming_transfer_size=82, server_mask=11264, maximum_outgoing_transfer_size=82, descriptor_capability_field=<DescriptorCapability.NONE: 0>, *allocate_address=True, *is_alternate_pan_coordinator=False, *is_coordinator=False, *is_end_device=True, *is_full_function_device=False, *is_mains_powered=False, *is_receiver_on_when_idle=False, *is_router=False, *is_security_capable=False)",
  "endpoints": {
    "1": {
      "profile_id": 260,
      "device_type": "0x0402",
      "in_clusters": [
        "0x0000",
        "0x0001",
        "0x0003",
        "0x0020",
        "0x0500",
        "0x0b05",
        "0xfc81"
      ],
      "out_clusters": [
        "0x0019"
      ]
    }
  },
  "manufacturer": "lk",
  "model": "ZB-MotionSensor-D0003",
  "class": "zigpy.device.Device"
}

The diagnostic file is attached.

Thank you.

zha-3c156512e016bff5e5871b2312226cb3-lk ZB-MotionSensor-D0003-3fe117de98e699e28248dd150cee6a5f.json.txt

TheJulianJES commented 1 year ago

Your signature seems like it would match the quirk introduced in 2023.2.0. The original issue there (https://github.com/zigpy/zha-device-handlers/issues/1291) was that motion sensors would basically never reset. Normally, a compliant motion sensor sends a "detected" and "no longer detected" report. Some Linkind sensors only seem to send a "detected" report, but repeat that like every minute, as long as motion is still detected. The quirk implemented functionality to support that.

For now, you can enable custom quirks (https://github.com/zigpy/zha-device-handlers/discussions/693#discussioncomment-857274)

  1. Create a folder called custom_zha_quirks in your /config directory.

  2. Then, add something like this to your HA configuration.yaml (this assumes you're running Home Assistant OS and specifies the path for the custom quirks. If it doesn't work, try just custom_zha_quirks (so remove the /config/ and trailing slash)

    zha:
    custom_quirks_path: /config/custom_zha_quirks/
  3. Download this file linkind_revert_motion.py.zip, extract the ZIP, and put it into your custom_zha_quirks directory.

  4. Then upgrade to HA 2023.2.5 (or 2023.3.0 betas if you want) and see if the issue still happens.

The custom quirks should basically revert to previous behavior.

javicalle commented 1 year ago

I fell like I come to this kind of issue again and again over time...

And I think that the current implementation must be wrong. Let me to explain myself.

The current implementation for the MotionWithReset listen to the 0x00:status_change_notification command_id (incorrectly identified as the ZONE_STATE attribute): https://github.com/zigpy/zha-device-handlers/blob/f6ab87e1c48ccb6db262e8732ca6b63d67fb6252/zhaquirks/__init__.py#L267-L270

So, every time that a status_change_notification command is received, the code reset the counter (the _timer_handle) and start it again for the reset_s time. So, if a device stops to call the command, the _timer_handle will reset the ZONE_STATE to 0 (and I believe that must be the ZONE_STATUS but that's another issue (or not)): https://github.com/zigpy/zha-device-handlers/blob/f6ab87e1c48ccb6db262e8732ca6b63d67fb6252/zhaquirks/__init__.py#L242-L246

But there is no validation for what value is calling the command, which zone_status value is in the call: https://github.com/zigpy/zigpy/blob/be5b2914a12ac6a19af80c5a83dab40d7957aa01/zigpy/zcl/clusters/security.py#L102-L112

This is not the issue itself, but could affect to the resolution.

So, my guess is that some devices (old firmware?) are sending only the occupied/unoccupied command when the presence/absence is triggered. These were working fine without any quirk but will not with a quirk and the reset_s. Other devices (new firmware?) are calling the occupied command while occupied and stop calling once unoccupied. But it can be also that is calling the command with an 'unoccupied' value, but the implementation handle it as a 'occupied' command.

Then, how can we fix both behaviors? I don't know. We can try to see which are the attributes in the new firmware devices hopping that there is a unoccupied value when no motion detected. If the value is not there I don't know how can we handle both behaviors.

TheJulianJES commented 1 year ago

Regarding the ZONE_STATE vs ZONE_STATUS part, there does seem to be some kind of issue.

This is the cluster_command part that parses the 0x00 "Zone Status Change Notification" command: https://github.com/home-assistant/core/blob/c9dfa15ed611340ed30045486e207ee93247b890/homeassistant/components/zha/core/channels/security.py#L342-L349 From what I can see, that part never actually changes the ZONE_STATUS attribute though. It does correctly call async_set_state on the binary sensor which correctly triggers the sensor entity in HA.

Not sure if/when this is even called, but this looks like it would break for all devices that use MotionWithReset: https://github.com/home-assistant/core/blob/c9dfa15ed611340ed30045486e207ee93247b890/homeassistant/components/zha/binary_sensor.py#L170-L175

So, the _update_attribute(ZONE_STATE, <ON/OFF>) methods look like they should really use ZONE_STATUS. The ZONE_STATE (0x00) constant is also misused for the "Zone Status Change Notification" (0x00). We should probably have a constant for those. I can look at all those changes in a bit probably.


Regarding the problem with this Linkind sensor, that won't help much I guess. It would help if we have the sw_build_id (or app_version or some kind of version identifier) to see if we are really dealing with two "separate sensors" (or if the quirk was created because of network problems ("no motion" message not received properly?)).

jasongabler commented 1 year ago

For now, you can enable custom quirks (#693 (comment))

  1. Create a folder called custom_zha_quirks in your /config directory.

@TheJulianJES,

Is there a way to confirm that this has properly loaded other than the imprecise method of seeing if the symptoms are resolved? Is there some dev tool I can use to inspect?

Thanks.

TheJulianJES commented 1 year ago

@jasongabler Yes, it'll show on the device page if you expand "Zigbee info". You can click the three dots at the bottom there -> Manage Zigbee device -> Signature tab. At the bottom of the signature for "class", you want to see something with "custom" (depends on how the quirk is called).

jasongabler commented 1 year ago

For now, you can enable custom quirks (#693 (comment))

@TheJulianJES,

@TheJulianJES , I can confirm that this quirk has fixed the issue. I can see in the signature that the quirk is in effect:

"class": "linkind_revert_motion.LinkindD0003" and the related automations are behavior as expected.

Thank you!

jasongabler commented 1 year ago

@TheJulianJES ,

Last thought... if I end up buying newer firmware'd versions of these sensors, I'd be stuck again as this quirk would render them useless (and if I remove it, it renders my old sensors useless.). So, will this be integrated into HA/ZHA on a device basis, such that there would be a switch in the UI? Something like "Old Firmware? [ ]"

Thanks!

TheJulianJES commented 1 year ago

Yeah, we need to fix this properly. Keeping it open for now.

(Quirks could match by IEEE as a workaround for you)

github-actions[bot] commented 1 year ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

LAHA2 commented 9 months ago

Yeah, we need to fix this properly. Keeping it open for now.

(Quirks could match by IEEE as a workaround for you)

Was this fix implemented in a later version of

Yeah, we need to fix this properly. Keeping it open for now.

(Quirks could match by IEEE as a workaround for you)

Hi there - do you know if this was implemented in a later version? I'm in the same boat with a handful of these sensors that worked very well through 2022 versions. Upon upgrading to 2023 version, exactly 60 seconds after motion is detected, HA always shows motion 'Cleared' regardless of ongoing motion during that 60 seconds. Only after 60 additional seconds after 'motion cleared', with no motion occurring, will it recognize a motion detected event. Example - I walk in a room - lights on; I continue moving for 60 seconds, lights go off every time. Remain in the room moving for 60+ more seconds, lights still will not come back on / motion remains 'cleared'; I must cease motion for full 60 seconds (ie, leave the room) before the lights then come back on with motion detected event.

TheJulianJES commented 9 months ago

Looking at https://github.com/zigpy/zha-device-handlers/issues/1291 with @1jk's explanation, it seems like the initial fix with resetting after a timeout wasn't really correct. The sensor uses the Alarm_1 bits and Alarm_2 bits of the IasZone status. If one of these bits is 1, ZHA considers the sensor to be "on". This is why it was always on previously (dark or motion).

We should filter out Alarm_2 for now and revert the timeout changes IMO.