home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
71.59k stars · 29.91k forks

Message send failure with YRD226 TSDB as a child of a router #44935

Closed mguaylam closed 3 years ago

mguaylam commented 3 years ago

The problem

When my Yale YRD226 TSDB connects through a router (Philips Hue or IKEA Tradfri bulbs) instead of directly to the coordinator, some lock / unlock commands fail. This issue does not happen when the door lock is directly connected to the coordinator.

The issue has been replicated with both IKEA Tradfri White Spectrum and Philips Hue White ambiance bulbs (I tried the Philips Hue in the hope it would solve the problem).

The problem is intermittent. It works for a while, and at other times it gets stuck for a short while.

Thank you in advance for your great work and help!

Environment

Problem-relevant device signature

{
  "node_descriptor": "NodeDescriptor(byte1=2, byte2=64, mac_capability_flags=128, manufacturer_code=4125, maximum_buffer_size=82, maximum_incoming_transfer_size=255, server_mask=0, maximum_outgoing_transfer_size=255, descriptor_capability_field=0)",
  "endpoints": {
    "1": {
      "profile_id": 260,
      "device_type": "0x000a",
      "in_clusters": [
        "0x0000",
        "0x0001",
        "0x0003",
        "0x0009",
        "0x000a",
        "0x0020",
        "0x0101",
        "0x0b05"
      ],
      "out_clusters": [
        "0x000a",
        "0x0019"
      ]
    }
  },
  "manufacturer": "Yale",
  "model": "YRD226 TSDB",
  "class": "zigpy.device.Device"
}
{
  "node_descriptor": "NodeDescriptor(byte1=1, byte2=64, mac_capability_flags=142, manufacturer_code=4107, maximum_buffer_size=82, maximum_incoming_transfer_size=128, server_mask=11264, maximum_outgoing_transfer_size=128, descriptor_capability_field=0)",
  "endpoints": {
    "11": {
      "profile_id": 260,
      "device_type": "0x010c",
      "in_clusters": [
        "0x0000",
        "0x0003",
        "0x0004",
        "0x0005",
        "0x0006",
        "0x0008",
        "0x0300",
        "0x1000",
        "0xfc02"
      ],
      "out_clusters": [
        "0x0019"
      ]
    },
    "242": {
      "profile_id": 41440,
      "device_type": "0x0061",
      "in_clusters": [],
      "out_clusters": [
        "0x0021"
      ]
    }
  },
  "manufacturer": "Philips",
  "model": "LTA003",
  "class": "zigpy.device.Device"
}

[image attached]

Problem-relevant configuration.yaml

# Configuration panel
config:
# Logger
logger:
  default: info
  logs:
    homeassistant.core: debug
    homeassistant.components.zha: debug
    bellows.zigbee.application: debug
    bellows.ezsp: debug
    zigpy: debug
    zigpy_cc: debug
    zigpy_deconz.zigbee.application: debug
    zigpy_deconz.api: debug
    zigpy_xbee.zigbee.application: debug
    zigpy_xbee.api: debug
    zigpy_zigate: debug
    zigpy_znp: debug
    zhaquirks: debug
# Frontend
frontend:
  themes: !include_dir_merge_named themes
# History
history:
  use_include_order: true
  include:
    entities:
      - lock.porte_appartement
      - binary_sensor.porte_appartement
      - binary_sensor.porte_patio
      - binary_sensor.mouvement_salle_a_manger
      - binary_sensor.mouvement_entree
    domains:
      - climate
# Input Boolean
input_boolean:
# Input Datetime
input_datetime:
# Input Number
input_number:
# Input Select
input_select:
# Input Text
input_text:
# Logbook
logbook:
  include:
    entities:
      - lock.porte_appartement
      - binary_sensor.porte_appartement
      - binary_sensor.porte_patio
    domains:
      - climate
# Map
map:
# Mobile App
mobile_app:
# Person
person:
# Simple Service Discovery Protocol (SSDP)
ssdp:
# Sun
sun:
# System Health
system_health:
# Zero-configuration networking (zeroconf) 
zeroconf:
# Zone
zone:
# Text to speech
tts:
  - platform: google_translate
    language: 'fr'
# Configuration files
group: !include groups.yaml
automation: !include automations.yaml
script: !include scripts.yaml
scene: !include scenes.yaml
sensor: !include sensors.yaml
homeassistant:
  customize: !include customize.yaml
  packages: !include_dir_named packages
# Home Assistant SSL
http:
  ssl_certificate: ---
  ssl_key: ---
# Environment Canada weather
weather:
  - platform: environment_canada
    station: QC/s0000635
# Locks
lock:
  # Lock template building
  - platform: template
    name: Porte Édifice
    value_template: "{{ is_state('switch.verrou_porte_edifice', 'off') }}"
    lock:
      service: switch.turn_off
      data:
        entity_id: switch.verrou_porte_edifice
    unlock:
      service: switch.turn_on
      data:
        entity_id: switch.verrou_porte_edifice
# Plant configuration file
plant: !include plant.yaml
# Nest integration
nest:
  client_id: ---
  client_secret: ---
# WOL integration
wake_on_lan:
switch:
  # Flux integration
  # Bedroom lights Flux
  - platform: flux
    name: Autopilote lumières chambre
    lights:
      - light.lampe_chevet
      - light.lampe_plancher
    mode: mired
    start_time: '7:00'
    stop_time: '22:00'
    start_colortemp: 4000
    sunset_colortemp: 3600
    stop_colortemp: 2200
    disable_brightness_adjust: true
  # Secret room light Flux
  - platform: flux
    name: Autopilote lumière pièce secrète
    lights:
      - light.lampe_secrete
    mode: mired
    start_time: '7:00'
    stop_time: '22:00'
    start_colortemp: 2700
    sunset_colortemp: 2600
    stop_colortemp: 2200
    disable_brightness_adjust: true
  # Salon light Flux
  - platform: flux
    name: Autopilote lumières salon
    lights:
      - light.lampe_rouge
      - light.lampe_jaune
    mode: mired
    start_time: '7:00'
    stop_time: '22:00'
    start_colortemp: 4000
    sunset_colortemp: 3600
    stop_colortemp: 2200
    disable_brightness_adjust: true
  # Salle à manger light Flux
  - platform: flux
    name: Autopilote lumières salle à manger
    lights:
      - light.luminaire_salle_a_manger
    mode: mired
    start_time: '7:00'
    stop_time: '22:00'
    start_colortemp: 4000
    sunset_colortemp: 3000
    stop_colortemp: 2200
    disable_brightness_adjust: true
# Light groups
light:
  - platform: group
    name: Luminaire salle à manger
    entities:
      - light.luminaire_salle_manger_1
      - light.luminaire_salle_manger_2
      - light.luminaire_salle_manger_3
# Climate
climate:
  # Office thermostat
  - platform: generic_thermostat
    name: Climatiseur bureau
    heater: switch.climatiseur_bureau
    target_sensor: sensor.xiaomi_cgg1_temperature
    ac_mode: true
  # Living room thermostat
  - platform: generic_thermostat
    name: Climatiseur salon
    heater: switch.climatiseur_salon
    target_sensor: sensor.temperature_xiaomi_cgg1_corridor
    ac_mode: true
# ZHA
#zha:
#  zigpy_config:
#    ota:
#      ikea_provider: true
# Alarm
alarm_control_panel:
  - platform: manual
    name: Alarme --- Rue ------
    arming_time: 30
    delay_time: 0
    trigger_time: 60
    disarmed:
      trigger_time: 0
    armed_home:
      arming_time: 0
      delay_time: 0
    armed_night:
      arming_time: 0
      delay_time: 0
# APCUPSD
apcupsd:
# Alerts
alert:
  lave_vaiselle:
    name: Humidité lave-vaiselle
    message: De l’humidité à été détectée sous le lave-vaiselle.
    done_message: L’humidité sous le lave-vaiselle s’est dissipée.
    entity_id: binary_sensor.humidite_lave_vaiselle
    state: 'on'
    repeat:
      - 5
      - 15
      - 15
      - 15
      - 30
    notifiers:
      - notify
    data:
      push:
        sound:
          critical: 1
          name: default
          volume: 1
    title: Dégât d’eau
  laveuse_a_linge:
    name: Humidité machine à laver
    message: De l’humidité à été détectée sous la machine à laver.
    done_message: L’humidité sous la machine à laver s’est dissipée.
    entity_id: binary_sensor.humidite_machine_a_laver
    state: 'on'
    repeat:
      - 5
      - 15
      - 15
      - 15
      - 30
    notifiers:
      - notify
    data:
      push:
        sound:
          critical: 1
          name: default
          volume: 1
    title: Dégât d’eau
# Media players
media_player:
  # BOSE
  - platform: soundtouch
    host: ---
    port: 8090
    name: Soundtouch bureau

Traceback/Error logs

Full log: HALog.zip

2021-01-07 18:15:07 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event service_registered[L]: domain=logger, service=set_default_level>
2021-01-07 18:15:07 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event service_registered[L]: domain=logger, service=set_level>
2021-01-07 18:15:07 INFO (MainThread) [homeassistant.setup] Setup of domain logger took 0.0 seconds
2021-01-07 18:15:08 INFO (MainThread) [homeassistant.setup] Setting up recorder
2021-01-07 18:15:08 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event service_registered[L]: domain=recorder, service=purge>
2021-01-07 18:15:08 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event component_loaded[L]: component=logger>
2021-01-07 18:15:08 INFO (MainThread) [homeassistant.setup] Setting up http
2021-01-07 18:15:08 INFO (MainThread) [homeassistant.setup] Setup of domain http took 0.0 seconds
2021-01-07 18:15:08 INFO (MainThread) [homeassistant.setup] Setup of domain recorder took 0.2 seconds
2021-01-07 18:15:08 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event component_loaded[L]: component=http>
2021-01-07 18:15:08 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event component_loaded[L]: component=recorder>
2021-01-07 18:15:08 INFO (MainThread) [homeassistant.setup] Setting up system_log
2021-01-07 18:15:08 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event service_registered[L]: domain=system_log, service=clear>

<---------------------------Truncated, see full log here : https://github.com/home-assistant/core/files/5784962/HALog.zip--------------------------->

2021-01-07 20:33:34 DEBUG (MainThread) [zigpy.zcl] [0xb9bb:11:0x0008] Attribute report received: current_level=153
2021-01-07 20:33:34 DEBUG (MainThread) [homeassistant.components.zha.core.channels.base] [0xB9BB:11:0x0008]: received attribute: 0 update with value: 153
2021-01-07 20:33:34 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame 89 (incomingRouteRecordHandler) received: b'9bdd8f97520901881700ffc700'
2021-01-07 20:33:34 DEBUG (MainThread) [bellows.zigbee.application] Received incomingRouteRecordHandler frame with [0xdd9b, 00:17:88:01:09:52:97:8f, 255, -57, []]
2021-01-07 20:33:34 DEBUG (MainThread) [bellows.zigbee.application] Processing route record request: (0xdd9b, 00:17:88:01:09:52:97:8f, 255, -57, [])
2021-01-07 20:33:34 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame 69 (incomingMessageHandler) received: b'00040108000b010001000075ffc79bddffff0718f10a0000209902'
2021-01-07 20:33:34 DEBUG (MainThread) [bellows.zigbee.application] Received incomingMessageHandler frame with [<EmberIncomingMessageType.INCOMING_UNICAST: 0>, EmberApsFrame(profileId=260, clusterId=8, sourceEndpoint=11, destinationEndpoint=1, options=<EmberApsOption.APS_OPTION_ENABLE_ROUTE_DISCOVERY: 256>, groupId=0, sequence=117), 255, -57, 0xdd9b, 255, 255, b'\x18\xf1\n\x00\x00 \x99']
2021-01-07 20:33:34 DEBUG (MainThread) [zigpy.zcl] [0xdd9b:11:0x0008] ZCL deserialize: <ZCLHeader frame_control=<FrameControl frame_type=GLOBAL_COMMAND manufacturer_specific=False is_reply=False disable_default_response=True> manufacturer=None tsn=241 command_id=Command.Report_Attributes>
2021-01-07 20:33:34 DEBUG (MainThread) [zigpy.zcl] [0xdd9b:11:0x0008] ZCL request 0x000a: [[Attribute(attrid=0, value=<TypeValue type=uint8_t, value=153>)]]
2021-01-07 20:33:34 DEBUG (MainThread) [zigpy.zcl] [0xdd9b:11:0x0008] Attribute report received: current_level=153
2021-01-07 20:33:34 DEBUG (MainThread) [homeassistant.components.zha.core.channels.base] [0xDD9B:11:0x0008]: received attribute: 0 update with value: 153
2021-01-07 20:33:34 DEBUG (MainThread) [bellows.ezsp.protocol] Send command nop: ()
2021-01-07 20:33:34 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame 5 (nop) received: b''
2021-01-07 20:33:37 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: entity_id=sensor.lumiere_phoenix_roebelenii, old_state=<state sensor.lumiere_phoenix_roebelenii=882; unit_of_measurement=lx, friendly_name=Lumière Phoenix roebelenii, icon=mdi:brightness-5 @ 2021-01-07T20:33:04.149597-05:00>, new_state=<state sensor.lumiere_phoenix_roebelenii=886; unit_of_measurement=lx, friendly_name=Lumière Phoenix roebelenii, icon=mdi:brightness-5 @ 2021-01-07T20:33:37.138999-05:00>>
2021-01-07 20:33:37 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: entity_id=plant.phoenix_roebelenii, old_state=<state plant.phoenix_roebelenii=problem; problem=moisture low, conductivity low, sensors=moisture=sensor.humidite_phoenix_roebelenii, temperature=sensor.temperature_phoenix_roebelenii, conductivity=sensor.conductivite_phoenix_roebelenii, brightness=sensor.lumiere_phoenix_roebelenii, unit_of_measurement_dict=temperature=°C, moisture=%, brightness=lx, conductivity=µS/cm, moisture=0, temperature=21.3, conductivity=0, brightness=882, max_brightness=935, friendly_name=Phoenix roebelenii @ 2021-01-07T18:15:31.816012-05:00>, new_state=<state plant.phoenix_roebelenii=problem; problem=moisture low, conductivity low, sensors=moisture=sensor.humidite_phoenix_roebelenii, temperature=sensor.temperature_phoenix_roebelenii, conductivity=sensor.conductivite_phoenix_roebelenii, brightness=sensor.lumiere_phoenix_roebelenii, unit_of_measurement_dict=temperature=°C, moisture=%, brightness=lx, conductivity=µS/cm, moisture=0, temperature=21.3, conductivity=0, brightness=886, max_brightness=935, friendly_name=Phoenix roebelenii @ 2021-01-07T18:15:31.816012-05:00>>
2021-01-07 20:33:44 DEBUG (MainThread) [bellows.ezsp.protocol] Send command nop: ()
2021-01-07 20:33:44 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame 5 (nop) received: b''

Additional information

code-in-progress commented 3 years ago

@dmulcahey I have also seen the same issue, although only when routing through various repeaters. When the lock connects directly to the coordinator (Nortek stick), it works perfectly, but when going through a repeater (in my case, SmartThings smart plugs and/or IKEA signal repeaters), the message fails to reach the lock.

mguaylam commented 3 years ago

I can provide a Wireshark dump as well if needed (in private). 🙂

Adminiuga commented 3 years ago

In the log there was only one delivery failure for the lock out of 16 attempts. And judging by the time it took - 5 seconds - the failure apparently came from the network. I don't really know what could be done here: ZHA sends the request, the stick sends out the request, and if it fails, it fails.

2021-01-07 20:16:43 DEBUG (MainThread) [bellows.ezsp.protocol] Send command sendUnicast: (<EmberOutgoingMessageType.OUTGOING_DIRECT: 0>, 0xBB33, EmberApsFrame(profileId=260, clusterId=257, sourceEndpoint=1, destinationEndpoint=1, options=<EmberApsOption.APS_OPTION_ENABLE_ROUTE_DISCOVERY|APS_OPTION_RETRY: 320>, groupId=0, sequence=244), 245, b'\x01\xf4\x01')
2021-01-07 20:16:43 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame 52 (sendUnicast) received: b'00a4'
2021-01-07 20:16:48 DEBUG (MainThread) [bellows.ezsp.protocol] Application frame 63 (messageSentHandler) received: b'0033bb04010101010140010000a4f56600'
2021-01-07 20:16:48 DEBUG (MainThread) [bellows.zigbee.application] Received messageSentHandler frame with [<EmberOutgoingMessageType.OUTGOING_DIRECT: 0>, 47923, EmberApsFrame(profileId=260, clusterId=257, sourceEndpoint=1, destinationEndpoint=1, options=<EmberApsOption.APS_OPTION_ENABLE_ROUTE_DISCOVERY|APS_OPTION_RETRY: 320>, groupId=0, sequence=164), 245, <EmberStatus.DELIVERY_FAILED: 102>, b'']
2021-01-07 20:16:48 DEBUG (MainThread) [zigpy.device] [0xbb33] Delivery error for seq # 0xf4, on endpoint id 1 cluster 0x0101: message send failure

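
The one-failure-out-of-16 count above can be reproduced mechanically from the full debug log rather than by eye. A minimal sketch, assuming the log lines follow the patterns quoted in this thread:

```python
import re

# Patterns taken from the bellows debug lines quoted in this thread;
# real logs may wrap or vary slightly, so treat these as approximations.
SEND_RE = re.compile(r"Send command sendUnicast")
FAIL_RE = re.compile(r"EmberStatus\.DELIVERY_FAILED")

def count_sends_and_failures(lines):
    """Count unicast send attempts and delivery failures in a ZHA debug log."""
    sends = sum(1 for line in lines if SEND_RE.search(line))
    failures = sum(1 for line in lines if FAIL_RE.search(line))
    return sends, failures

# Tiny example using shortened versions of the lines above:
sample = [
    "20:16:43 DEBUG [bellows.ezsp.protocol] Send command sendUnicast: (...)",
    "20:16:48 DEBUG [bellows.zigbee.application] ... <EmberStatus.DELIVERY_FAILED: 102>, b''",
]
print(count_sends_and_failures(sample))  # (1, 1)
```

Run over the attached HALog, this would give the send/failure ratio for the lock's cluster directly.
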
Adminiuga commented 3 years ago

I can provide a Wireshark dump as well if needed (in private). 🙂

@mguaylambert can you compare the difference between the failed requests and the ones which worked? Where do they fail? Do they reach the parent device? Does the lock poll for that message?

I feel like ZHA does everything possible: it does extend the timeout and waits for the reply for 30s, but it fails much faster than that, meaning the network knows there was a failure.

BTW, for the Nortek, try enabling source routing and check if it makes a difference. To enable it, add the following to configuration.yaml:

zha:
  zigpy_config:
    source_routing: true
    ezsp_config:
      CONFIG_SOURCE_ROUTE_TABLE_SIZE: 32
mguaylam commented 3 years ago

Thank you so much for looking into this @Adminiuga !

I added source_routing to my configuration prior to analyzing the network. After a couple of days of testing, it doesn't seem to change the success / failure rate.

After observing the behaviour of the network, I have gathered 3 examples: two of them failing, one of them working. The difference between them seems to be this: when the stick sends two unlock commands back to back, the lock seems to suffer from an internal failure (it says the unlock command was successful but didn't unlock). When ZHA sends only one, everything works.

Addresses in question

  1. 0x0000 : Nortek - HUSBZB-1 (coordinator)
  2. 0x88cc : Philips Hue - LTA003 (parent)
  3. 0xbb33 : Yale lock - YRD226 TSDB (child)

Failure 1

You can see from this file, Failure-1.txt, that ZCL Door Lock: Unlock Door, Seq: 102 got sent twice on the network and the lock ended up receiving it twice as a consequence. The lock correctly polls the data. Also, the lock finishes by sending ZCL Door Lock: Unlock Door Response, Seq: 102 twice but fails to actually unlock. This is why it does not report any unlock attribute: it physically didn't unlock. Here is the log from Home Assistant for the same time frame: Failure-1.log

Failure 2

Here we can observe the same pattern as Failure-1: Failure-2.txt, with its associated Home Assistant log: Failure-2.log

Success 1

Here things are different (and the unlock succeeded). ZCL Door Lock: Unlock Door, Seq: 203 is sent only once over the network, the lock polls the data and physically unlocks. We can also see that it reports its new attributes to HA: ZCL: Report Attributes, Seq: 132 (unlocked). One frame is missing from my capture (ZCL: Report Attributes, Seq: 132) (Success-1.txt) but everything suggests that it did reach the coordinator. And here's the associated log file: Success-1.log

Conclusion

So, I am wondering if the lock (or its module) isn't faulty here, since I would guess that receiving the same command twice should not change much in the process (should the stick even send the same command twice?). Also, unfortunately, it does not yet explain why I sometimes see a message send failure (I've yet to catch one of those, which is surprising tbh).
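
The duplicated-command pattern described above (the same Unlock Door command appearing twice with an identical ZCL sequence number) can be flagged mechanically from a capture export. A sketch, assuming the frames have already been extracted into (command, seq) tuples (a hypothetical input format, not a Wireshark API):

```python
from collections import Counter

def find_duplicate_commands(frames):
    """Return the (command, seq) pairs that appear more than once.

    frames: list of (command_name, zcl_seq) tuples, e.g. pulled out of a
    Wireshark CSV export by hand (hypothetical input format).
    """
    counts = Counter(frames)
    return [pair for pair, n in counts.items() if n > 1]

# The Failure-1 pattern from above, reduced to tuples:
frames = [
    ("Unlock Door", 102),
    ("Unlock Door", 102),            # retransmission with the same ZCL seq
    ("Unlock Door Response", 102),
    ("Unlock Door Response", 102),
    ("Report Attributes", 132),
]
print(find_duplicate_commands(frames))
```

A normal exchange (the Success-1 capture) would produce an empty list, since each command appears once per ZCL sequence number.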

Thank you for any advice and your valuable time!

MattWestb commented 3 years ago

@Adminiuga What about the application-layer ACK? If no ACK is received, the application should resend the same message when it is sent as unicast, but I can't see where that is being done in the logs :-(

Adminiuga commented 3 years ago

Yeah, there's definitely some strangeness going on in the capture, but I can't quite make it out without expanding the network layer, and frankly I don't dig that deep often, so I'm a bit fuzzy on this. It does seem like there are way too many commands sent. Some duplication is normal when the sniffer hears all the retransmissions of relayed messages, but then, when the Unlock Door Response is being sent, there are way too many APS acks bouncing around, aren't there?

What firmware is running on the IKEA devices? Is it updated to the latest ZB3 firmware?

mguaylam commented 3 years ago

@Adminiuga : Yeah, actually in all 3 cases (both failures and the success), we get DEBUG (MainThread) [homeassistant.components.zha.core.channels.base] [0xBB33:1:0x0101]: executed 'unlock_door' command with args: '()' kwargs: '{}' result: [<Status.SUCCESS: 0>], and that's because the lock seems to lie about it, making me wonder if the problem isn't partially with the lock itself. I am thinking about reaching out to Yale and showing them that the lock reports unlock_door = success without actually unlocking when it receives 2 unlock commands back to back. One thing I wonder is why the coordinator sends ZCL Door Lock: Unlock Door twice in the failed examples. Is that normal? And who is responsible for doing that, zigpy or the coordinator? Both copies carry the same sequence number. It's almost like the coordinator did not catch the network ACK (but my sniffer did, which is weird because it is very far from them) and decided to retransmit. And yet, shouldn't the Philips Hue bulb discard the second copy since it received the same message twice?

I am no expert in ZigBee, pardon me for that; I am on a steep learning curve with how it all works. But I wonder: would preventing the [stick or zigpy?] from transmitting ZCL Door Lock: Unlock Door, Seq: 252 twice solve the issue? Or should I ask Yale why their lock is behaving that way?

As for the APS: Ack, Dst Endpt: 1, Src Endpt: 1 frames, they look fine to me considering that there is a bulb relaying, and since the stick sent ZCL Door Lock: Unlock Door, Seq: 252 twice, the lock is just replying to both commands. You can tell who is sending or relaying by looking at IEEE 802.15.4 Data, Dst: 0x88cc, Src: 0x0000 (coordinator to bulb) and IEEE 802.15.4 Data, Dst: 0xbb33, Src: 0x88cc (bulb to lock).
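
The src/dst reading described above can be made explicit with a tiny lookup. A sketch, using the three short addresses from the "Addresses in question" list earlier in the thread:

```python
# Short-address roles, taken from the "Addresses in question" list above.
ROLES = {0x0000: "coordinator", 0x88cc: "parent bulb", 0xbb33: "lock"}

def describe_hop(src, dst):
    """Label a single IEEE 802.15.4 hop using the known device roles."""
    return f"{ROLES.get(src, hex(src))} -> {ROLES.get(dst, hex(dst))}"

print(describe_hop(0x0000, 0x88cc))  # coordinator -> parent bulb
print(describe_hop(0x88cc, 0xbb33))  # parent bulb -> lock
```

Applied to each frame in the capture, this makes the relay path (coordinator to bulb to lock, and back) easy to follow.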

As for firmware, all my devices look up to date as of today. The IKEA bulbs are on 2.0.029, which I believe is ZigBee 3.0, and the Philips Hue bulb the lock is a child of is on 1.65.11_hB798F2B, which I believe is the latest firmware (I update them via Bluetooth).

I can always send you the Wireshark captures if you want (filtered), but that would have to be in private since I would need to give you my security key too.

Adminiuga commented 3 years ago

@Adminiuga : Yeah actually in all 3 cases (both failures and successes), we get DEBUG (MainThread) [homeassistant.components.zha.core.channels.base] [0xBB33:1:0x0101]: executed 'unlock_door' command with args: '()' kwargs: '{}' result: [<Status.SUCCESS: 0>] and that’s because the lock seems to lie about it. Making me wonder if the problem is not partially with the lock itself. I am thinking about reaching Yale about it and show them that the lock says unlock_door = success

You do get:

ZigBee Cluster Library Frame
    Frame Control Field: Cluster-specific (0x19)
    Sequence Number: 102
    Command: Unlock Door Response (0x01)
    Payload
        Lock Status: Success (0x00)

It is an Unlock Door response from the lock, and it reports success. If the lock reports success but doesn't do anything, then it could be a problem with the lock.

Have you tried putting in fresh batteries and checking that the lock is not "binding" physically, i.e. that it tries to lock/unlock but can't actually move?

MattWestb commented 3 years ago

If you don't find anything else in the logs, try changing the lock's parent so it is not using the Philips Hue bulb: put an IKEA bulb or plug nearby, use "add devices via this device" from that router's card, and reset the lock so it rejoins the network. It may be that the Philips bulb is not releasing the lock as a child when it is powered down.

Sniff logs are great since all the information is there, but it's not nice to share the network key on open forums :-((

mguaylam commented 3 years ago

@Adminiuga : Exactly. The lock sends Lock Status: Success (0x00) twice while actually doing nothing physically; it doesn't even try to do anything (no sound, no movement). I've inspected it, and the lock can move freely without any friction. Also, I would have expected the lock to report an attribute indicating it was physically blocked if that were the case, but I did not see anything about it in the capture. I will try a fresh set of batteries, but the ones in it are the ones that came with it, the lock is fairly new, and it still reports 80% battery.

@MattWestb : that's the strangest part. When the lock is a child of the coordinator, it works perfectly all the time, whereas when it is a child of a router (I tried Philips Hue bulbs, IKEA bulbs and an IKEA smart outlet) it fails fairly often. I used to force the lock onto the coordinator, but that is not really realistic as it changes its parent from time to time, since there are closer devices to it.

And yeah, sharing the network key is not something I should do publicly indeed. One thing I could do is either output all the layers into a text file (heavy reading) or put them in a JSON.

MattWestb commented 3 years ago

I have read about locks that don't like being on a router and only work when directly connected to the coordinator, but I did not read the details of how they were or weren't working. My Philips Hue bulb does not delete children when they change routers or leave the network, but that does not seem to be happening in your case, since you can see both received and transmitted messages from the lock.

It would be best if Adminiuga could take a deep dive into the log; perhaps he can see something. But it sounds a little like the lock takes some "spontaneous days off" whenever it likes, whether you like it or not.

MattWestb commented 3 years ago

Looking at the map, the lock has a very good signal to the Philips bulb.

@Adminiuga could it be a missing ack in the application layer, like the switch fix you merged earlier today? The lock doesn't have any quirk, assuming it is "well behaved".

Adminiuga commented 3 years ago

can it being missing ack

No. You don't send a response to a response.

MattWestb commented 3 years ago

True, and the lock has replied with Lock Status: Success (0x00) :-((

mguaylam commented 3 years ago

@Adminiuga I was able to recreate a message send failure as well, to round out my findings, but here's the catch: I see [0xbb33] Extending timeout for 0x57 request in the Home Assistant logs, but looking at the frames, while it took a little more time than usual, it was still only 6 seconds before the coordinator got confirmation from the lock that it had received the command (the bulb was waiting for a data request from the lock, and I guess the lock was waiting for the channel to be free). The network was able to deliver every frame correctly and the lock unlocked just fine.

I can see the coordinator sent 3 unlock commands initially (in less than 3 seconds) and all of them reached the bulb correctly, meaning the two extra commands were really unnecessary here. Why is the coordinator doing this? And why does it tell zigpy that a failure happened? Most of the time, even when it says there was a network failure, everything is fine.

In summary

It is clear to me that the lock has a software issue. Given this, I will reach out to Yale about it. Meanwhile, along with the previous questions: can we somehow reduce the number of commands the coordinator puts on the network, and why does the coordinator report a message send failure when the frames all look good? Does it mean it was not able to place the frame on the network?

Thank you for your time and help, it is really appreciated.