zigpy / zha-device-handlers

ZHA device handlers bridge the functionality gap created when manufacturers deviate from the ZCL specification, handling deviations and exceptions by parsing custom messages to and from Zigbee devices.
Apache License 2.0
683 stars 635 forks source link

[BUG] / Request send default response in wrong direction from quirk. #1448

Open MattWestb opened 2 years ago

MattWestb commented 2 years ago

Describe the bug Our beloved TS004F Dimmer switch is re sending commands also after sending default response to it. Its described in this PR https://github.com/zigpy/zha-device-handlers/pull/1437.

To Reproduce Have sniffing tuya ZBGW and they is doing the same

Expected behavior Default response shall working OK but the device is doing all possible things wrong and this is one firmware bug. Reading little more i was finding that Smart things is sending default response for EP2-4 like this:

Frame Control Field: Profile-wide (0x00)
    .... ..00 = Frame Type: Profile-wide (0x0)
    .... .0.. = Manufacturer Specific: False
    .... 0... = Direction: Client to Server
    ...0 .... = Disable Default Response: False

And Zigbee standard shall being like this and tuya ZBGW and ZHA is doing it so:

Frame Control Field: Profile-wide (0x18)
    .... ..00 = Frame Type: Profile-wide (0x0)
    .... .0.. = Manufacturer Specific: False
    .... 1... = Direction: Server to Client
    ...1 .... = Disable Default Response: True

ST is doing one thing wrong you is not sending one default response for one default response but there is sending it server to client and the device is very happy and is not re sending the command and is sending the next without delay.

Screenshots If applicable, add screenshots to help explain your problem.

Additional context

Z2M look have implanting it and looks working OK but they is sending 4 DR.

I like getting hep implanting sending "revers" DR from quirks (force Client to Server and perhaps in the future the opposite if needed) so we can getting the device responding for multiple commands without delay (is now 3 seconds between commands).

One funny but not unexpected thing is that the device is triggering reporting battery then pressing the button for EP2 and its start reporting battery on EP1 and getting one normal DR and its OK and then sending the "on" command from EP2 without delay but is being repeated then the device like having it in the reversed direction (Client to Server and not the standard Server to Client).

Sniff of attribute reported the pressing button 2 and reporting battery attribute and the command:

EP2 Event:
First its reporting battery from EP1:

ZigBee Network Layer Data, Dst: 0x0000, Src: 0xc29e
ZigBee Application Support Layer Data, Dst Endpt: 1, Src Endpt: 1
ZigBee Cluster Library Frame, Command: Report Attributes, Seq: 73
    Frame Control Field: Profile-wide (0x08)
        .... ..00 = Frame Type: Profile-wide (0x0)
        .... .0.. = Manufacturer Specific: False
        .... 1... = Direction: Server to Client
        ...0 .... = Disable Default Response: False
    Sequence Number: 73
    Command: Report Attributes (0x0a)
    Attribute Field
        Attribute: Battery Percentage Remaining (0x0021)
        Data Type: 8-Bit Unsigned Integer (0x20)
        Remaining Battery Percentage: 94,0 [%]

ZigBee Network Layer Data, Dst: 0x0000, Src: 0xc29e
ZigBee Application Support Layer Data, Dst Endpt: 1, Src Endpt: 1
ZigBee Cluster Library Frame, Command: Report Attributes, Seq: 73
    Frame Control Field: Profile-wide (0x08)
        .... ..00 = Frame Type: Profile-wide (0x0)
        .... .0.. = Manufacturer Specific: False
        .... 1... = Direction: Server to Client
        ...0 .... = Disable Default Response: False
    Sequence Number: 73
    Command: Report Attributes (0x0a)
    Attribute Field
        Attribute: Battery Percentage Remaining (0x0021)
        Data Type: 8-Bit Unsigned Integer (0x20)
        Remaining Battery Percentage: 94,0 [%]

Then is sending the command from EP2:

ZigBee Network Layer Data, Dst: 0x0000, Src: 0xc29e
ZigBee Application Support Layer Data, Dst Endpt: 1, Src Endpt: 2
ZigBee Cluster Library Frame
    Frame Control Field: Cluster-specific (0x01)
        .... ..01 = Frame Type: Cluster-specific (0x1)
        .... .0.. = Manufacturer Specific: False
        .... 0... = Direction: Client to Server
        ...0 .... = Disable Default Response: False
    Sequence Number: 74
    Command: Unknown (0xfd)
    Payload

ZigBee Network Layer Data, Dst: 0xc29e, Src: 0x0000
ZigBee Application Support Layer Data, Dst Endpt: 2, Src Endpt: 2
ZigBee Cluster Library Frame, Command: Default Response, Seq: 74
    Frame Control Field: Profile-wide (0x18)
        .... ..00 = Frame Type: Profile-wide (0x0)
        .... .0.. = Manufacturer Specific: False
        .... 1... = Direction: Server to Client
        ...1 .... = Disable Default Response: True
    Sequence Number: 74
    Command: Default Response (0x0b)
    Response to Command: 0xfd
    Status: Success (0x00)

I think we have finding the 10 bugs / bad / not working faulty things in this device but perhaps we is finding more in the future.

javicalle commented 2 years ago

Without having looked in depth, I think the main handicap will be managing the Direction: Client to Server part. If I'm not mistaken, the implementation is prepared to handle easily the Disable Default Response: True/False part. @MattWestb the question that arises is: do you know if this modification should be done depending on the endpoint/cluster? (you have commented that the answer was for EP2-4) Or would it be the same for all device commands? Obviously, if the management has to be by endpoint or even by command, it would be much more complicated.

Adminiuga commented 2 years ago

For the battery report, should there be a cluster id? What type of cluster is it on the device itself?

MattWestb commented 2 years ago

Our beloved TS004F DMS is having EP1 as one "normal" Zigbee light controller with extra "scene" commands and its working OK also the battery reporting and so on from EP1. And the "new" EPs (2-4) is having the wrong implanted DR direction. I think its only this or only few device that is having this "thing" and i was thinking using the function only on the "new" EPs by one new class for the cluster with tuya "scene" commands and leaving the "normal" for EP1. So getting the function / cluster i think the best, then can being used more free and flexible.

For my is no big problem but users that is using the device for "scenes" / events in automatons is getting no double events then the quirk is filtering them out but is getting 3 seconds delay for events from EP2-4 after the first key pressing and many users like have 12 commands playing with (it cant being easy keeping all in the head but they like it and not my).

I think the other TS004X devices need the same but i dont have the device for testing.

@Adminiuga The power config cluster is on the normal EP1 and its also having one "hybrid" OnOff cluster as in and out but in is deleted in the quirk replace. EP2-4 is also OnOff cluster but is only sending "tuya scene" commands and no ZCL commands (they is hidden EPs if asking the device for active endpoints is saying only EP1 so i cant saying 110%. The current device class in the quirk: https://github.com/zigpy/zha-device-handlers/blob/bbc69c05d25e0858613a9447d695e6f13b8e7a68/zhaquirks/tuya/ts004f.py#L180.

Adminiuga commented 2 years ago

Wireshark does not tell you the cluster?

MattWestb commented 2 years ago

EP1 cluster 0x0001 and attribute 0x0a and is posted in the first posting (i think i have putting in 2 times by mistake).

MattWestb commented 2 years ago

ZHA log:

2022-03-31 22:03:29 DEBUG (MainThread) [zigpy.zcl] [0xc29e:1:0x0001] ZCL deserialize: <ZCLHeader frame_control=<FrameControl frame_type=GLOBAL_COMMAND manufacturer_specific=False is_reply=True disable_default_response=False> manufacturer=None tsn=89 command_id=Command.Report_Attributes>
2022-03-31 22:03:29 DEBUG (MainThread) [zigpy.zcl] [0xc29e:1:0x0001] ZCL request 0x000a: [[Attribute(attrid=33, value=<TypeValue type=uint8_t, value=188>)]]
2022-03-31 22:03:29 DEBUG (MainThread) [zigpy.zcl] [0xc29e:1:0x0001] Attribute report received: battery_percentage_remaining=188

= OK

github-actions[bot] commented 1 year ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

MattWestb commented 1 year ago

Thanks J & J !!

javicalle commented 1 year ago

Another round to check the repeated responses.

If I have understood correctly the device expects the response to the commands to have the direction reversed. I think it would be possible to override it as follows (just added the command function):

TuyaSmartRemoteOnOffCluster ```python class TuyaSmartRemoteOnOffCluster(OnOff, EventableCluster): """TuyaSmartRemoteOnOffCluster: fire events corresponding to press type.""" rotate_type = { 0x00: RIGHT, 0x01: LEFT, 0x02: STOP, } press_type = { 0x00: SHORT_PRESS, 0x01: DOUBLE_PRESS, 0x02: LONG_PRESS, } name = "TS004X_cluster" ep_attribute = "TS004X_cluster" attributes = OnOff.attributes.copy() attributes.update({0x8001: ("backlight_mode", SwitchBackLight)}) attributes.update({0x8002: ("power_on_state", PowerOnState)}) attributes.update({0x8004: ("switch_mode", SwitchMode)}) def __init__(self, *args, **kwargs): """Init.""" self.last_tsn = -1 super().__init__(*args, **kwargs) server_commands = OnOff.server_commands.copy() server_commands.update( { 0xFC: foundation.ZCLCommandDef( "rotate_type", {"rotate_type": t.uint8_t}, False, is_manufacturer_specific=True, ), 0xFD: foundation.ZCLCommandDef( "press_type", {"press_type": t.uint8_t}, False, is_manufacturer_specific=True, ), } ) def handle_cluster_request( self, hdr: foundation.ZCLHeader, args: List[Any], *, dst_addressing: Optional[ Union[t.Addressing.Group, t.Addressing.IEEE, t.Addressing.NWK] ] = None, ): """Handle press_types command.""" # normally if default response sent, TS004x wouldn't send such repeated zclframe (with same sequence number), # but for stability reasons (e. g. the case the response doesn't arrive the device), we can simply ignore it if hdr.tsn == self.last_tsn: _LOGGER.debug("TS004X: ignoring duplicate frame") return # save last sequence number self.last_tsn = hdr.tsn # send default response (as soon as possible), so avoid repeated zclframe from device if not hdr.frame_control.disable_default_response: self.debug("TS004X: send default response") self.send_default_rsp(hdr, status=foundation.Status.SUCCESS) # handle command if hdr.command_id == 0xFC: rotate_type = args[0] self.listener_event( ZHA_SEND_EVENT, self.rotate_type.get(rotate_type, "unknown"), [] ) elif hdr.command_id == 0xFD: press_type = args[0] self.listener_event( ZHA_SEND_EVENT, self.press_type.get(press_type, "unknown"), [] ) async def command( self, command_id: Union[foundation.GeneralCommand, int, t.uint8_t], *args, manufacturer: Optional[Union[int, t.uint16_t]] = None, expect_reply: bool = True, tsn: Optional[Union[int, t.uint8_t]] = None, **kwargs: Any, ): """Override the default Cluster command.""" self.debug( "Inverting response direction. Cluster Command is %x, Arguments are %s - %s", command_id, args, kwargs, ) def_response = await super().command( command_id, *args, manufacturer, expect_reply, tsn, **kwargs ) self.debug( "Command response: %s", def_response, ) def_response.direction=foundation.Direction.Server_to_Client return def_response ``` Totally untested and just trying things. If needed we can also overwrite the `expect_reply` variable.
javicalle commented 1 year ago

I know you are doing a lots of tests and changes and I don't want to generate any additional noise to them. There is no hurry, but when you test if you can get logs from the retries I will take another look. I believe that the current code have a couple more of interesting debug logs.

MattWestb commented 1 year ago

I still is getting the Direction: Server to Client = 1 in wirwsharl and in the log :-(( home-assistant_2023-03-26T08-50-38.989Z.log

Is it possible doing it the the opposite direction with Client_to_Serve the the flag is saying true and we need it being set to false or is it one flag with false ?

Thanks J !!

javicalle commented 1 year ago

Ummm, so the problem is not in the command part. (I'm having a really hard time understanding all the messaging part)

Let's attack the reporting part. We can try to invert the direction before sending to ZHA, expecting that the DR will also be inverted. We can remove the command function and update the send_default_rsp this way:

class TuyaSmartRemoteOnOffCluster(OnOff, EventableCluster):
    """TuyaSmartRemoteOnOffCluster: fire events corresponding to press type."""

    rotate_type = {
        0x00: RIGHT,
        0x01: LEFT,
        0x02: STOP,
    }
    press_type = {
        0x00: SHORT_PRESS,
        0x01: DOUBLE_PRESS,
        0x02: LONG_PRESS,
    }
    name = "TS004X_cluster"
    ep_attribute = "TS004X_cluster"
    attributes = OnOff.attributes.copy()
    attributes.update({0x8001: ("backlight_mode", SwitchBackLight)})
    attributes.update({0x8002: ("power_on_state", PowerOnState)})
    attributes.update({0x8004: ("switch_mode", SwitchMode)})

    def __init__(self, *args, **kwargs):
        """Init."""
        self.last_tsn = -1
        super().__init__(*args, **kwargs)

    server_commands = OnOff.server_commands.copy()
    server_commands.update(
        {
            0xFC: foundation.ZCLCommandDef(
                "rotate_type",
                {"rotate_type": t.uint8_t},
                False,
                is_manufacturer_specific=True,
            ),
            0xFD: foundation.ZCLCommandDef(
                "press_type",
                {"press_type": t.uint8_t},
                False,
                is_manufacturer_specific=True,
            ),
        }
    )

    def handle_cluster_request(
        self,
        hdr: foundation.ZCLHeader,
        args: List[Any],
        *,
        dst_addressing: Optional[
            Union[t.Addressing.Group, t.Addressing.IEEE, t.Addressing.NWK]
        ] = None,
    ):
        """Handle press_types command."""
        # normally if default response sent, TS004x wouldn't send such repeated zclframe (with same sequence number),
        # but for stability reasons (e. g. the case the response doesn't arrive the device), we can simply ignore it
        if hdr.tsn == self.last_tsn:
            _LOGGER.debug("TS004X: ignoring duplicate frame")
            return
        # save last sequence number
        self.last_tsn = hdr.tsn

        # send default response (as soon as possible), so avoid repeated zclframe from device
        if not hdr.frame_control.disable_default_response:
            self.debug("TS004X: send default response")
            hdr.frame_control.direction=foundation.Direction.Client_to_Server
            self.send_default_rsp(hdr, status=foundation.Status.SUCCESS)
        # handle command
        if hdr.command_id == 0xFC:
            rotate_type = args[0]
            self.listener_event(
                ZHA_SEND_EVENT, self.rotate_type.get(rotate_type, "unknown"), []
            )
        elif hdr.command_id == 0xFD:
            press_type = args[0]
            self.listener_event(
                ZHA_SEND_EVENT, self.press_type.get(press_type, "unknown"), []
            )

If I have understood correctly, device is sending reports in the wrong direction (but ZHA is accepting). The side effect is that the DR from ZHA is inverted too and device is not liking it. Is it correct? That implementation will send all the reports as foundation.Direction.Client_to_Server. Please tell me if that must be handled by command or similar.

MattWestb commented 1 year ago

Sorry i was decalking my kaffe machines and was getting water in the power connection and the apartment was shutting down the power so have running around and doing all clocks one more time and helping some Zigbee devices that was having problems (only 3 of round 100). This 2 lines is the interesting:

2023-03-26 10:49:51.877 DEBUG (MainThread) [zigpy.zcl] [0x6E40:3:0x0006] Sending reply header: ZCLHeader(frame_control=FrameControl(frame_type=<FrameType.GLOBAL_COMMAND: 0>, is_manufacturer_specific=False, direction=<Direction.Client_to_Server: 1>, disable_default_response=1, reserved=0, *is_cluster=False, *is_general=True, *is_reply=True), tsn=118, command_id=<GeneralCommand.Default_Response: 11>, *direction=<Direction.Client_to_Server: 1>, *is_reply=True)
2023-03-26 10:49:51.880 DEBUG (MainThread) [zigpy.zcl] [0x6E40:3:0x0006] Sending reply: Default_Response(command_id=253, status=<Status.SUCCESS: 0>)

And this is that parameter that need being 0 and not 1 *direction=<Direction.Client_to_Server: 1>.

I have not testing your last code then i need doing changes in the HA container and its taking little time doing right and not braking HA (if copy the INIT to the wrong folder).

I doing it later or tomorrow and reporting back :-)))

javicalle commented 1 year ago

No rush at all. (Too many open fronts...)

No easy way to invert the DR direction, so my hope is that inverting the direction in the device report will also invert the DR from ZHA. That makes sense to you?

MattWestb commented 1 year ago

In the network layer yes but i dont understanding the coding part but i testing it and reporting. I shall also taking one closer look then doing it in the sniff its being right in both ends so i doing saying wrong things.

github-actions[bot] commented 9 months ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

MattWestb commented 9 months ago

tuya TS004X devices is needing it for not repeating sending commands and also blocking fast sending presses then the Zigbee stack is waiting for one default response but in the wrong direction. Some issues is depending on this function for getting the device workiing OK.

github-actions[bot] commented 3 months ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

MattWestb commented 3 months ago

I think some work is being done but its not tested on tuya TS004X devices so still wonted for this devises.