zigpy / zha-device-handlers

ZHA device handlers bridge the functionality gap created when manufacturers deviate from the ZCL specification, handling deviations and exceptions by parsing custom messages to and from Zigbee devices.
Apache License 2.0
720 stars 669 forks source link

[BUG] Custom quirks not working in HA 2023.03 #2250

Closed javicalle closed 1 year ago

javicalle commented 1 year ago

Recently in HA v2023.03, the ZHA integration has stopped working due to issues with some user-installed custom quirks.

The issue is caused by some changes made to Tuya's MCU device implementations. These changes (necessary and brilliantly implemented) could have been deployed in a different way to avoid the impact on integration and the noise generated. This could be fixed with PR #2242, but it only mitigates the effect of possible changes.

It should be said that the use of custom quirks implies (or should imply) some technical knowledge about what is being done and about the analysis of the incidents that can be generated. On many occasions the problem has been detected and corrected by the users, but not all users are the same.

My intention is to collect here the problem that occurred, the possible solutions and lessons learned.

The first thing is to identify the origin (or origins) of the problem. I have detected mainly 2:

Nothing to reproach the changes. I gave them my approval myself.

But it is clear that we underestimate the impact of change. Although all the code implemented in the library was modified according to the new implementation, the custom quirks hosted in the user installations have not, and these have affected the integration. In order to mitigate this type of problem, it would have been convenient to grant a deprecation time to the implemented changes. It will not always be achievable, but at least we should considerate it.

Let it be clear that this issue is only addressed to myself and that he only seeks to collect my comments on it.

javicalle commented 1 year ago

Error type:

Logger: homeassistant.config_entries
Source: custom_zha_quirks/zhaquirks/ts0601_TZE200_akjefhj5_valve.py:25
First occurred: 00:28:30 (1 occurrences)
Last logged: 00:28:30

Error setting up entry ConBee II, s/n: DE2463913 - dresden elektronik ingenieurtechnik GmbH for zha
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/config_entries.py", line 383, in async_setup
    result = await component.async_setup_entry(hass, self)
  File "/usr/src/homeassistant/homeassistant/components/zha/__init__.py", line 100, in async_setup_entry
    setup_quirks(config)
  File "/usr/local/lib/python3.10/site-packages/zhaquirks/__init__.py", line 409, in setup
    importer.find_module(modname).load_module(modname)
  File "<frozen importlib._bootstrap_external>", line 548, in _check_name_wrapper
  File "<frozen importlib._bootstrap_external>", line 1063, in load_module
  File "<frozen importlib._bootstrap_external>", line 888, in load_module
  File "<frozen importlib._bootstrap>", line 290, in _load_module_shim
  File "<frozen importlib._bootstrap>", line 719, in _load
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/config/custom_zha_quirks/zhaquirks/ts0601_TZE200_akjefhj5_valve.py", line 25, in <module>
    from zhaquirks.tuya.mcu import (
ImportError: cannot import name 'TuyaDPType' from 'zhaquirks.tuya.mcu' (/usr/local/lib/python3.10/site-packages/zhaquirks/tuya/mcu/__init__.py)

Solution:

javicalle commented 1 year ago

Error type:

It seems that there are no logs related but some entities stop working after update. It may be affected if the quirks made use of the DPToAttributeMapping.converter, ie:

    dp_to_attribute: Dict[int, DPToAttributeMapping] = {
        104: DPToAttributeMapping(
            TuyaIlluminanceMeasurement.ep_attribute,
            "measured_value",
            dp_type=TuyaDPType.VALUE,
            converter=lambda x: math.log10(x) * 10000 + 1 if x > 0 else 0,  # <-- like this one
        ),
    }

Solution:

An explicit conversion to the data type must be implemented in order for the code to perform the proper conversion.

For the previous case, the correction would be the conversion to int value:

    dp_to_attribute: Dict[int, DPToAttributeMapping] = {
        104: DPToAttributeMapping(
            TuyaIlluminanceMeasurement.ep_attribute,
            "measured_value",
            # dp_type=TuyaDPType.VALUE,
            converter=lambda x: int(math.log10(x) * 10000 + 1) if x > 0 else int(0),
        ),
    }

Ref: https://github.com/zigpy/zha-device-handlers/pull/1928#issuecomment-1451284964

javicalle commented 1 year ago

Reserved

javicalle commented 1 year ago

Reserved

MattWestb commented 1 year ago

The reason for implanting custom quirk is for easy testing quirk for devs and users. Its not made for making local quirk and not putting on PR for adding it and its no guarantee that ZHA / Zigpy is braking them like with my tuya TRV that is not PRed then no have writing the test for it and its the same for most of this devices and its have being broken.

The most impotent is knowing and informing users then making braking changes also for custom quirk for getting it working well also then the device is not officiously supported then the quirk is not PRed. Also making good instructions so user can updating there quirk ten it happening so they can getting there devices working OK a gen.

Braking thins is always not welcome but is always happening and its importing fixing them like the large reworking of ZCL R7 that was braking very mush bit it wars very needed and we is getting good things from it all the time after its was done and i think its the same with the tuya MCU updates (and converting most tuya devices toEnchantedDevice that also is very needed and can braking some devices).

Understanding and learning then things is not going well it is the definition of one intelligent animal and here is most very intelligent :-))

TheJulianJES commented 1 year ago

Yeah, I'd consider custom quirks as a temporary workaround.

With the amount of users that use custom quirks, it's not ideal that ZHA currently completely breaks, but those quirks are not (and can't be) officially supported unless they're merged into zha-quirks. And the existence of custom quirks that break should not stop larger changes (like the Tuya data type stuff).

Thanks javicalle for the guide on fixing those quirks though!

In order to mitigate this type of problem, it would have been convenient to grant a deprecation time to the implemented changes. It will not always be achievable, but at least we should considerate it.

I understand what you mean, but I'm not sure that would really make a difference. There was a similar change some time ago (I think something in zigpy got deprecated) and there were still reports of custom quirks breaking when it was finally removed in zigpy (even though there was a warning log message before). It's also a bit more effort (and I'd rather put that into merging custom quirks into zha-quirks).

MattWestb commented 1 year ago

The ZHA is loading custom quirks i think it shall throwing one warning for every quirk that is loaded from the custom directory so the user is not forgetting that is one test function and not one permanent solution that is making on PR for betting it main ZHA. One plus experience user can easy see that the quirk is loaded and dont need digging deep in the HA log for see it :-))

As sad custom quirks is user supported and only warning if knowing some is braking is needed then making changes but in the first row is Zigpy / zha for getting the system working well and its one must doing large complex changes that can being only 95% tested before merger them.

By the was great thanks for the work on tuya MCU command part that was very needed as the implanting of quirk device classes for matching functions in ZHA !!!

javicalle commented 1 year ago

I am sure that any measure will not cover all cases, but if the avalanche of "ZHA breaks in version XX" incidents can at least be avoided, it is worth it. (And on a personal level I will know that I have tried to avoid it).

In my head it would be easy if we warn users were it bother more, in the HA notifications panel: image

I think that is a interesting approach for all these 'hit & run' users, but not enough time or enough knowledge to explore that approach and do it myself.

I don't like the idea of quirks failing 'silently' but totally against that a quirk (and even worse, a custom quirk) breaking the ZHA integration.

Thanks for your words. I really appreciate it.

MattWestb commented 1 year ago

In setting > system > Repairs looks being one good place putting custom quirks warnings then the user can muting it to next manor update and its being re triggered like having portainer installed in one container and have suprevised HA installed or skipping updates of HA and addons.

TheJulianJES commented 1 year ago

Hmm, I don't think a PR with those changes would be accepted. Custom quirks are considered to be similar to custom integrations. See: https://github.com/home-assistant/home-assistant.io/pull/23884#discussion_r955050091

MattWestb commented 1 year ago

Wish full thinking from my side ;-)

So ZHA throwing warning in the log is the only realistic option.

puddly commented 1 year ago

There was a similar change some time ago (I think something in zigpy got deprecated) and there were still reports of custom quirks breaking when it was finally removed in zigpy (even though there was a warning log message before).

Yeah, warnings won't help as much as they should. I deprecated zigpy.zcl.foundation.Command in favor of zigpy.zcl.foundation.GeneralCommand in December of 2021. When I finally removed it in December of 2022 (over a year later!), it still broke a ton of installs and generated a lot of issues because many custom quirk users are non-technical and do not read warnings to begin with.

There isn't really much that can be done other than merging the custom quirks into the main repo, or letting things break (either explicitly or silently). I also agree with @TheJulianJES regarding custom quirks integrating with the frontend: they're 100% meant for development and quick testing, not for general use.

I vote for merging #2242 to at least prevent blocking startup and upgrading the warnings to errors.

Nixellion commented 1 year ago

Is there an easy was to see if a custom quirk was PRed? I think if quirk was added to connect a device that previously had no quirk at all - it should even be quite easy for Hass to perform a check and inform user, saying something "Hey, you're using a custom quirk for device X but we now support it officially" or something like that.

If a custom quirk was used to fix a bug for a device that already has a quirk - well, probably not much to do here, but maybe there's some elegant solution to this as well.

Horley900 commented 1 year ago

Im not sure if this is the right place or what i need tosupply but none of my zha quirks are working as of yesterday, this is my 2 gang switch: Device info TS0012 by _TZ3000_18ejxno0 Connected via Zigbee Coordinator Zigbee info IEEE: a4:c1:38:c4:84:e0:00:9f Nwk: 0xc292 Device Type: EndDevice LQI: Unknown RSSI: Unknown Last Seen: 2023-04-16T14:27:37 Power Source: Battery or Unknown Quirk: zhaquirks.tuya.ts001x.TuyaDoubleNoNeutralSwitch_2

and my three gang switch: TS0013 by _TZ3000_qewo8dlz Connected via Zigbee Coordinator Zigbee info IEEE: a4:c1:38:ba:03:4e:ae:bc Nwk: 0x5d57 Device Type: EndDevice LQI: Unknown RSSI: Unknown Last Seen: 2023-04-16T14:28:21 Power Source: Battery or Unknown Quirk: Tuya_3_gang_switch.TuyaTripleNoNeutralSwitchVar02

what is happening with my switchs is that when activate one button is activated via home assistant all the buttons activate on that switch. previously the quirk had resolved this but out of nowhere its stopped working

javicalle commented 1 year ago

Your 3 gang is a local quirk. Have you tried the fix from this post? Maybe even is not needed nowadays if your device is already in the current version.

The 2 gang don't have a local quirk, so if there is a problem you should open a new issue with all the required info, in example, what is not working.