Koenkk / zigbee2mqtt

Zigbee 🐝 to MQTT bridge 🌉, get rid of your proprietary Zigbee bridges 🔨
https://www.zigbee2mqtt.io
GNU General Public License v3.0

Iris (Centralite) 3320-L Dropping Off #16183

Closed shackrat closed 1 year ago

shackrat commented 1 year ago

What happened?

In the past few days, several of the Iris 3320-L contact sensors have dropped off the network and no longer report. Pulling the battery to reset the device does not reconnect it, even while inclusion is active.

When the device is reset while inclusion is active, it reconnects but does not report open/close (IAS) status. Other reports (temperature and battery) work as expected.

If I force-remove the device and pair as a new device it will reconnect and function properly.

If I shut down the container, edit the database and set interviewComplete = false, the device will re-interview after restarting Z2M and will function properly.

  1. The only way to re-enroll IAS is to re-interview the device. In the short term it would be nice to have a checkbox in the UI to force a re-interview.
  2. I see the check-in interval is set to 60 minutes. For whatever reason ZHA uses 55 minutes. These devices worked for years with ZHA without any issues.
  3. Is it possible that Device Announcement messages are not being handled properly? If the device changes its network address and the announcement is missed, it will be marked offline and unable to communicate. My experience with older Centralite devices shows they can generate frequent ZDO Device Announcements when they change addresses. This was a huge issue in the early days of ZHA, but it has since been fixed.

What did you expect to happen?

I expect the device to remain on the network. These have been installed and in service since 2015.

How to reproduce it (minimal and precise)

Pair the device and wait a few days.

Zigbee2MQTT version

1.29.1

Adapter firmware version

20220219

Adapter

Sonoff ZBDongle-P

Debug log

Nothing for the device was reported in the herdsman logs in the timespan I was able to capture. There are more than 100 devices, so the logs are huge.

shackrat commented 1 year ago

One of the same 3320-L's that dropped off 3 days ago has dropped off again. This time I was able to capture some logs.

zigbee2mqtt_ch12 | 2023-01-16T03:32:04.425Z zigbee-herdsman:adapter:zStack:znp:AREQ <-- AF - incomingMsg - {"groupid":0,"clusterid":32,"srcaddr":61138,"srcendpoint":1,"dstendpoint":1,"wasbroadcast":0,"linkquality":116,"securityuse":0,"time>
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.426Z zigbee-herdsman:controller:log Received 'zcl' data '{"frame":{"Header":{"frameControl":{"frameType":0,"manufacturerSpecific":false,"direction":1,"disableDefaultResponse":false,"reservedBits":0},">
zigbee2mqtt_ch12 | Zigbee2MQTT:info 2023-01-15 22:32:04: MQTT publish: topic 'zigbee2mqtt1/Office Window', payload '{"battery":27,"battery_low":false,"contact":true,"last_seen":"2023-01-15T22:32:04-05:00","linkquality":116,"tamper":true,">
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.429Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext []
zigbee2mqtt_ch12 | Zigbee2MQTT:info 2023-01-15 22:32:04: MQTT publish: topic 'zigbee2mqtt1/Office Christmas Tree', payload '{"current":2.17,"last_seen":"2023-01-15T22:32:04-05:00","linkquality":58,"power":252.8,"state":"ON","voltage":116.>
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.432Z zigbee-herdsman:adapter:zStack:adapter sendZclFrameToEndpointInternal 0x000d6f000b6674bd:61138/1 (0,0,2)
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.435Z zigbee-herdsman:adapter:zStack:unpi:parser <-- [254,1,100,1,0,100]
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.435Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext [254,1,100,1,0,100]
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.435Z zigbee-herdsman:adapter:zStack:unpi:parser --> parsed 1 - 3 - 4 - 1 - [0] - 100
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.435Z zigbee-herdsman:adapter:zStack:znp:SRSP <-- AF - dataRequest - {"status":0}
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.435Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext []
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.435Z zigbee-herdsman:adapter:zStack:znp:SREQ --> AF - dataRequest - {"dstaddr":61138,"destendpoint":1,"srcendpoint":1,"clusterid":32,"transid":42,"options":0,"radius":30,"len":5,"data":{"type":"Buffer>
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.435Z zigbee-herdsman:adapter:zStack:unpi:writer --> frame [254,15,36,1,210,238,1,1,32,0,42,0,30,5,24,17,11,0,0,5]
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.446Z zigbee-herdsman:adapter:zStack:unpi:parser <-- [254,1,100,1,0,100]
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.446Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext [254,1,100,1,0,100]
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.447Z zigbee-herdsman:adapter:zStack:unpi:parser --> parsed 1 - 3 - 4 - 1 - [0] - 100
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.447Z zigbee-herdsman:adapter:zStack:znp:SRSP <-- AF - dataRequest - {"status":0}
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.447Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext []
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.455Z zigbee-herdsman:adapter:zStack:unpi:parser <-- [254,3,69,196,23,184,0,45]
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.455Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext [254,3,69,196,23,184,0,45]
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.455Z zigbee-herdsman:adapter:zStack:unpi:parser --> parsed 3 - 2 - 5 - 196 - [23,184,0] - 45
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.455Z zigbee-herdsman:adapter:zStack:znp:AREQ <-- ZDO - srcRtgInd - {"dstaddr":47127,"relaycount":0,"relaylist":[]}
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.455Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext []
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.468Z zigbee-herdsman:adapter:zStack:unpi:parser <-- [254,5,69,196,218,242,1,91,113,135,254,3,68,128,0,1,41,239]
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.468Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext [254,5,69,196,218,242,1,91,113,135,254,3,68,128,0,1,41,239]
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.468Z zigbee-herdsman:adapter:zStack:unpi:parser --> parsed 5 - 2 - 5 - 196 - [218,242,1,91,113] - 135
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.468Z zigbee-herdsman:adapter:zStack:znp:AREQ <-- ZDO - srcRtgInd - {"dstaddr":62170,"relaycount":1,"relaylist":[29019]}
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.468Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext [254,3,68,128,0,1,41,239]
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.468Z zigbee-herdsman:adapter:zStack:unpi:parser --> parsed 3 - 2 - 4 - 128 - [0,1,41] - 239
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.468Z zigbee-herdsman:adapter:zStack:znp:AREQ <-- AF - dataConfirm - {"status":0,"endpoint":1,"transid":41}
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.469Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext []
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.485Z zigbee-herdsman:adapter:zStack:unpi:parser <-- [254,28,68,129,0,0,4,11,218,242,1,242,0,69,0,32,74,64,0,0,8,8,59,10,8,5,33,119,2,91,113,28,60]
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.485Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext [254,28,68,129,0,0,4,11,218,242,1,242,0,69,0,32,74,64,0,0,8,8,59,10,8,5,33,119,2,91,113,28,60]
zigbee2mqtt_ch12 | 2023-01-16T03:32:04.486Z zigbee-herdsman:adapter:zStack:unpi:parser --> parsed 28 - 2 - 4 - 129 - [0,0,4,11,218,242,1,242,0,69,0,32,74,64,0,0,8,8,59,10,8,5,33,119,2,91,113,28] - 60

Resetting the device and trying to re-pair works for battery and temperature reporting, but the device will not report open/close status. I suspect this is because IAS enrollment does not happen unless the device is interviewed.

I also do not see where herdsman logs device announcements. I am still convinced that a device announcement is being missed when the device changes addresses.

shackrat commented 1 year ago

Just had a non-Iris Samsung Gen 4 outlet kind-of drop off the network. The device is responding to group messages and sending reports, but it is marked offline and not controllable.

It's a very strong symptom of a missed Device Announce message with a new address. Resetting and re-pairing is the only way to get this plug reconnected.

zigbee2mqtt_ch12  | 2023-01-15T16:51:48.285Z zigbee-herdsman:controller:log Received 'zcl' data '{"frame":{"Header":{"frameControl":{"frameType":0,"manufacturerSpecific":false,"direction":1,"disableDefaultResponse":false,"reservedBits":0},"transactionSequenceNumber":10,"manufacturerCode":null,"commandIdentifier":10},"Payload":[{"attrId":1285,"dataType":33,"attrData":1207}],"Command":{"ID":10,"name":"report","parameters":[{"name":"attrId","type":33},{"name":"dataType","type":32},{"name":"attrData","type":1000}]}},"address":47127,"endpoint":1,"linkquality":98,"groupID":0,"wasBroadcast":false,"destinationEndpoint":242}'
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.298Z zigbee-herdsman:controller:endpoint DefaultResponse 0x286d9700010422da/1 2820(10, {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":true,"direction":1,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false})
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.299Z zigbee-herdsman:adapter:zStack:adapter sendZclFrameToEndpointInternal 0x286d9700010422da:47127/1 (0,0,2)
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.299Z zigbee-herdsman:adapter:zStack:znp:SREQ --> AF - dataRequest - {"dstaddr":47127,"destendpoint":1,"srcendpoint":1,"clusterid":2820,"transid":182,"options":0,"radius":30,"len":5,"data":{"type":"Buffer","data":[24,10,11,10,0]}}
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.299Z zigbee-herdsman:adapter:zStack:unpi:writer --> frame [254,15,36,1,23,184,1,1,4,11,182,0,30,5,24,10,11,10,0,52]
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.300Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext []
zigbee2mqtt_ch12  | Zigbee2MQTT:info  2023-01-15 11:51:48: MQTT publish: topic 'zigbee2mqtt1/Office Christmas Tree', payload '{"current":0.01,"last_seen":"2023-01-15T11:51:48-05:00","linkquality":98,"power":0,"state":"OFF","voltage":120.7}'
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.310Z zigbee-herdsman:adapter:zStack:unpi:parser <-- [254,1,100,1,0,100]
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.310Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext [254,1,100,1,0,100]
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.310Z zigbee-herdsman:adapter:zStack:unpi:parser --> parsed 1 - 3 - 4 - 1 - [0] - 100
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.310Z zigbee-herdsman:adapter:zStack:znp:SRSP <-- AF - dataRequest - {"status":0}
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.311Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext []
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.323Z zigbee-herdsman:adapter:zStack:unpi:parser <-- [254,3,68,128,0,1,182,112]
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.324Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext [254,3,68,128,0,1,182,112]
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.324Z zigbee-herdsman:adapter:zStack:unpi:parser --> parsed 3 - 2 - 4 - 128 - [0,1,182] - 112
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.324Z zigbee-herdsman:adapter:zStack:znp:AREQ <-- AF - dataConfirm - {"status":0,"endpoint":1,"transid":182}
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.324Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext []
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.532Z zigbee-herdsman:adapter:zStack:unpi:parser <-- [254,3,69,196,176,46,0,28]
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.532Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext [254,3,69,196,176,46,0,28]
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.532Z zigbee-herdsman:adapter:zStack:unpi:parser --> parsed 3 - 2 - 5 - 196 - [176,46,0] - 28
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.533Z zigbee-herdsman:adapter:zStack:znp:AREQ <-- ZDO - srcRtgInd - {"dstaddr":11952,"relaycount":0,"relaylist":[]}
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.533Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext []
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.546Z zigbee-herdsman:adapter:zStack:unpi:parser <-- [254,5,69,196,149,194,1,176,46,76]
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.546Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext [254,5,69,196,149,194,1,176,46,76]
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.547Z zigbee-herdsman:adapter:zStack:unpi:parser --> parsed 5 - 2 - 5 - 196 - [149,194,1,176,46] - 76
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.547Z zigbee-herdsman:adapter:zStack:znp:AREQ <-- ZDO - srcRtgInd - {"dstaddr":49813,"relaycount":1,"relaylist":[11952]}
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.547Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext []
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.582Z zigbee-herdsman:adapter:zStack:unpi:parser <-- [254,28,68,129,0,0,4,11,176,46,1,242,0,211,0,115,54,148,0,0,8,8,123,10,8,5,33,14,0,176,46,29,105]
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.583Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext [254,28,68,129,0,0,4,11,176,46,1,242,0,211,0,115,54,148,0,0,8,8,123,10,8,5,33,14,0,176,46,29,105]
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.583Z zigbee-herdsman:adapter:zStack:unpi:parser --> parsed 28 - 2 - 4 - 129 - [0,0,4,11,176,46,1,242,0,211,0,115,54,148,0,0,8,8,123,10,8,5,33,14,0,176,46,29] - 105
zigbee2mqtt_ch12  | 2023-01-15T16:51:48.584Z zigbee-herdsman:adapter:zStack:znp:AREQ <-- AF - incomingMsg - {"groupid":0,"clusterid":2820,"srcaddr":11952,"srcendpoint":1,"dstendpoint":242,"wasbroadcast":0,"linkquality":211,"securityuse":0,"timestamp":9713267,"transseqnumber":0,"len":8,"data":{"type":"Buffer","data":[8,123,10,8,5,33,14,0]}}

shackrat commented 1 year ago

This still continues to happen daily across both instances of Z2M.

Resetting and re-pairing the device does not work. The device never fully connects because IAS enrollment has not completed. If the battery is pulled, the sensor thinks it is not connected to a coordinator and goes into pairing mode. The only way to stop this behavior is to stop the Z2M instance, open the database.db file and set interviewComplete: false, restart Z2M, and pull the battery from the device; it will then reconnect, complete the interview, and stay online.

shackrat commented 1 year ago

I have discovered another way to bring these devices online without re-pairing that works even for a device offline 24 hours.

In Dev Console, execute a read for the genPollControl cluster reading the checkInInterval value. After that the device goes back online and communicates with the coordinator. I would think the coordinator should be responsible for attempting to re-establish routes but that does not appear to be happening.

Koenkk commented 1 year ago

@shackrat would a fix be to read the checkinInterval every 24 hours?

shackrat commented 1 year ago

That would work as a fix, but it feels like a band-aid. The problem is, I just cannot figure out why this is happening. The herdsman logs only show the device sending its final check-in, and then nothing after. I haven't spotted anything in the captures that points to a cause.

Here's a current one that's gone offline. I have the checkIn interval set to 13,200 (quarter-seconds), which is 55 minutes. This means at least 6 check-in events have been missed.
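For reference, Poll Control's checkInInterval is specified in quarter-seconds, so the numbers work out as follows (the 6-hour outage below is only an illustrative figure, not from this report):

```javascript
// ZCL Poll Control: checkInInterval is expressed in quarter-seconds.
const checkInInterval = 13200;                    // quarter-seconds, as configured
const intervalMinutes = checkInInterval / 4 / 60; // 3300 s -> 55 minutes
// A device silent for an illustrative 6 hours would have missed:
const missedCheckIns = Math.floor((6 * 60) / intervalMinutes); // 6 check-ins
console.log(intervalMinutes, missedCheckIns);
```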

image

Then I read the checkIn interval, and after about 7-8 seconds it's back online and will stay for a day or two.

image

I can also read other attributes to "wake" the device. However, attempting to configure the device beforehand fails with an unreachable error; after reading an attribute, the configure command works.

The way this issue manifests, it is as if the coordinator has lost the route to the device, and reading an attribute triggers a route discovery. I was not able to verify this in packet captures, however. Could the coordinator be expiring routes too soon?

Koenkk commented 1 year ago

When "dropped off", does the sensor still send status updates? If not, this problem is not on the coordinator side (especially since it can reach the device when doing the read)

shackrat commented 1 year ago

The sensors (all Centralite 3320-L) stop sending updates. The herdsman logs show the checkIn event being received, but the acknowledgement back to the device is what times out. Once a checkIn event times out, the device no longer attempts to send any updates or reports to the coordinator. I believe this is something the device does to conserve battery power: it may interpret the lack of acknowledgement as the coordinator having gone away.

Reading an attribute does trigger a route discovery and then the device will continue to work.

Just to be clear, I'm not talking about one or two devices; there are 35 3320-L devices on the mesh, and most have done this at least once in the past month. Here's the breakdown of the mesh:

Total: 101

By device type:
  End devices: 67
  Routers: 34

By power source:
  Battery: 62
  Mains (single phase): 36
  Unknown: 3

By vendor:
  CentraLite: 68
  Samjin: 16
  iMagic by GreatStar: 9
  sengled: 5
  Keen Home Inc: 1
  SmartThings: 1
  WAXMAN: 1

By model:
  3320-L: 33
  3210-L: 23
  outlet: 6
  1116-S: 6
  button: 6
  3326-L: 5
  E1C-NB7: 4
  water: 3
  1117-S: 3
  3320: 2
  3315: 1
  4257050-RZHAC: 1
  SV01-410-MP-1.0: 1
  motion: 1
  E11-G13: 1
  3321-S: 1
  motionv4: 1
  3325-S: 1
  leakSMART Water Valve v2.10: 1
  3315-L: 1

Koenkk commented 1 year ago

Could you check if the issue is fixed with the following external converter? It disables the whole genPollCtrl cluster:

// Only the modules actually used below are required.
const fz = require('zigbee-herdsman-converters/converters/fromZigbee');
const exposes = require('zigbee-herdsman-converters/lib/exposes');
const reporting = require('zigbee-herdsman-converters/lib/reporting');
const e = exposes.presets;

const definition = {
    zigbeeModel: ['3320-L'],
    model: '3320-L',
    vendor: 'Iris',
    description: 'Contact and temperature sensor',
    fromZigbee: [fz.ias_contact_alarm_1, fz.temperature, fz.battery],
    toZigbee: [],
    meta: {battery: {voltageToPercentage: '3V_2100'}},
    configure: async (device, coordinatorEndpoint, logger) => {
        const endpoint = device.getEndpoint(1);
        // Unbind genPollCtrl to prevent disconnects
        // https://github.com/Koenkk/zigbee2mqtt/issues/16183
        await endpoint.unbind('genPollCtrl', coordinatorEndpoint);
        await reporting.bind(endpoint, coordinatorEndpoint, ['msTemperatureMeasurement', 'genPowerCfg']);
        await reporting.temperature(endpoint);
        await reporting.batteryVoltage(endpoint);
    },
    exposes: [e.contact(), e.battery_low(), e.temperature(), e.battery()],
};

module.exports = definition;

shackrat commented 1 year ago

I'll give that a try. However, I looked at the Arcus (formerly Iris) repository, and they used a checkIn interval of 2 minutes. Not sure if that may be a factor. They also bound the devices to that cluster.

This is from the Lowe's Iris repo.

Zigbee {
    offlineTimeout 10, MINUTES

    /////////////////////////////////////////////////////////////////////////////
    // Hub Local Lifecycle
    /////////////////////////////////////////////////////////////////////////////

    poll reflex {
        on added

        bind endpoint: 1, profile: 0x0104, cluster: Zcl.Power.CLUSTER_ID, server: true
        bind endpoint: 1, profile: 0x0104, cluster: Zcl.IasZone.CLUSTER_ID, server: true
        bind endpoint: 1, profile: 0x0104, cluster: Zcl.PollControl.CLUSTER_ID, server: true
        bind endpoint: 1, profile: 0x0104, cluster: Zcl.TemperatureMeasurement.CLUSTER_ID, server: true
        bind endpoint: 1, profile: 0x0104, cluster: Zcl.Diagnostics.CLUSTER_ID, server: true

        iaszone enroll
    }

    poll reflex {
        on connected
        ordered {
            read endpoint: 1, cluster: Zcl.IasZone.CLUSTER_ID, attr: Zcl.IasZone.ATTR_ZONE_STATUS
            read endpoint: 1, cluster: Zcl.Power.CLUSTER_ID, attr: Zcl.Power.ATTR_BATTERY_VOLTAGE
            read endpoint: 1, cluster: Zcl.TemperatureMeasurement.CLUSTER_ID, attr: Zcl.TemperatureMeasurement.ATTR_MEASURED_VALUE
            read endpoint: 1, cluster: Zcl.Diagnostics.CLUSTER_ID, attr: Zcl.Diagnostics.ATTR_LAST_MESSAGE_LQI

            // configure battery level reporting at most once an hour, at least once every 12 hours
            report endpoint: 1, cluster: Zcl.Power.CLUSTER_ID, attr: pwrCluster.ATTR_BATTERY_VOLTAGE, type: Data.TYPE_UNSIGNED_8BIT, min: 3600, max: 43200

            // configure temperature reporting at most once every 5 minutes, at least once every 30 minutes
            report endpoint: 1, cluster: Zcl.TemperatureMeasurement.CLUSTER_ID, attr: tempCluster.ATTR_MEASURED_VALUE, type: Data.TYPE_SIGNED_16BIT, min: 300, max: 1800

            send zcl.pollcontrol.setLongPollInterval, newLongPollInterval: 24
            send zcl.pollcontrol.setShortPollInterval, newShortPollInterval: 4

            // Set Poll Control Check-In interval to 2 minutes (480 quarter-seconds)
            write endpoint: 1, cluster: Zcl.PollControl.CLUSTER_ID, attr:Zcl.PollControl.ATTR_CHECKIN_INTERVAL, value: Data.encode32BitUnsigned(480)
        }
        delay {
            after 15, SECONDS
            read endpoint: 1, cluster: Zcl.Power.CLUSTER_ID, attr: Zcl.Power.ATTR_BATTERY_VOLTAGE
        }
    }

I modeled a converter on those specs to see if that would help. It did not.

// Based on the Arcus (Iris) Smart Home Driver
// https://github.com/arcus-smart-home/arcusplatform/blob/a02ad0e9274896806b7d0108ee3644396f3780ad/platform/arcus-containers/driver-services/src/main/resources/ZB_CentraLite_ContactSensor_2_3.driver
// Only the modules actually used below are required.
const exposes = require('zigbee-herdsman-converters/lib/exposes');
const fz = require('zigbee-herdsman-converters/converters/fromZigbee');
const reporting = require('zigbee-herdsman-converters/lib/reporting');
const e = exposes.presets;

//fz.ias_contact_alarm_1_report
module.exports = [
  {
      zigbeeModel: ['3320-L'],
      model: '3320-L',
      vendor: 'Iris v2',
      description: 'Contact and temperature sensor',
      fromZigbee: [fz.identify, fz.ias_contact_alarm_1, fz.ias_enroll, fz.temperature, fz.battery],
      toZigbee: [],
      meta: {battery: {voltageToPercentage: '3V_2100'}},
      configure: async (device, coordinatorEndpoint, logger) => {
          const endpoint = device.getEndpoint(1);
          await reporting.bind(endpoint, coordinatorEndpoint, ['genIdentify', 'genPowerCfg', 'genPollCtrl', 'msTemperatureMeasurement', 'ssIasZone', 'haDiagnostic']);
          await reporting.temperature(endpoint);
          await reporting.batteryVoltage(endpoint);

          // The Iris reflex driver used a 2-minute check-in (480 quarter-seconds);
          // here we match ZHA's 13,200 quarter-seconds (55 minutes) instead.
          const interval = 55 * 60; // seconds
          await endpoint.write('genPollCtrl', {'checkinInterval': (interval * 4)});
      },
      exposes: [e.contact(), e.battery_low(), e.temperature(), e.battery()],
  },
];

I don't think unbinding the cluster is a good idea.

I found a log snippet from a couple days ago when the last 3320 went unresponsive. My logger crashed, so I don't have anything after the 7th. It shows that multiple check-in messages came in (not surprising given the number of routers I have), and multiple responses were sent over the course of a few seconds.

It's almost as if the return routes are lost; however, it's only affecting this class of device.

zigbee2mqtt_ch12  | 2023-02-07T20:30:58.941Z zigbee-herdsman:controller:device:log check-in from 0x000d6f000b1154f4: declining fast-poll
zigbee2mqtt_ch12  | 2023-02-07T20:30:58.942Z zigbee-herdsman:controller:endpoint Command 0x000d6f000b1154f4/1 genPollCtrl.checkinRsp({"startFastPolling":false,"fastPollTimeout":0}, {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false})
zigbee2mqtt_ch12  | 2023-02-07T20:30:58.942Z zigbee-herdsman:adapter:zStack:adapter sendZclFrameToEndpointInternal 0x000d6f000b1154f4:3330/1 (0,0,1)
zigbee2mqtt_ch12  | 2023-02-07T20:31:03.962Z zigbee-herdsman:controller:device:log check-in from 0x000d6f000b1154f4: declining fast-poll
zigbee2mqtt_ch12  | 2023-02-07T20:31:03.962Z zigbee-herdsman:controller:endpoint Command 0x000d6f000b1154f4/1 genPollCtrl.checkinRsp({"startFastPolling":false,"fastPollTimeout":0}, {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false})
zigbee2mqtt_ch12  | 2023-02-07T20:31:06.985Z zigbee-herdsman:controller:device:log check-in from 0x000d6f000b1154f4: declining fast-poll
zigbee2mqtt_ch12  | 2023-02-07T20:31:06.985Z zigbee-herdsman:controller:endpoint Command 0x000d6f000b1154f4/1 genPollCtrl.checkinRsp({"startFastPolling":false,"fastPollTimeout":0}, {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false})
zigbee2mqtt_ch12  | 2023-02-07T20:31:08.958Z zigbee-herdsman:adapter:zStack:adapter Response timeout (0x000d6f000b1154f4:3330,0)
zigbee2mqtt_ch12  | 2023-02-07T20:31:11.969Z zigbee-herdsman:adapter:zStack:adapter sendZclFrameToEndpointInternal 0x000d6f000b1154f4:3330/1 (1,0,3)
zigbee2mqtt_ch12  | 2023-02-07T20:31:21.988Z zigbee-herdsman:adapter:zStack:adapter Response timeout (0x000d6f000b1154f4:3330,1)
zigbee2mqtt_ch12  | 2023-02-07T20:31:21.989Z zigbee-herdsman:adapter:zStack:adapter sendZclFrameToEndpointInternal 0x000d6f000b1154f4:3330/1 (0,0,2)
zigbee2mqtt_ch12  | 2023-02-07T20:31:21.994Z zigbee-herdsman:controller:endpoint Command 0x000d6f000b1154f4/1 genPollCtrl.checkinRsp({"startFastPolling":false,"fastPollTimeout":0}, {"sendWhen":"immediate","timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Timeout - 3330 - 1 - 14 - 32 - 11 after 10000ms)
zigbee2mqtt_ch12  | 2023-02-07T20:31:21.995Z zigbee-herdsman:controller:device:error Handling of poll check-in form 0x000d6f000b1154f4 failed
github-actions[bot] commented 1 year ago

This issue is stale because it has been open 30 days with no activity. Remove the stale label or comment, or this will be closed in 7 days.

maxill1 commented 1 year ago

> I have discovered another way to bring these devices online without re-pairing that works even for a device offline 24 hours.
>
> In Dev Console, execute a read for the genPollControl cluster reading the checkInInterval value. After that the device goes back online and communicates with the coordinator. I would think the coordinator should be responsible for attempting to re-establish routes but that does not appear to be happening.

I have similar disconnection issues with 6-7 Adeo LDSENK08 door sensors. In the web interface I executed a read on the genPollCtrl cluster for the checkinInterval attribute on every disconnected device, and the sensors came back online.

Is there a way to trigger this command via MQTT as a temporary fix? I can't find the right topic in the docs.

edit: found it: publish to topic zigbee2mqtt/device name/1/set with payload {"read":{"attributes":["checkinInterval"],"cluster":"genPollCtrl","options":{}}}
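For scripting this workaround, the same topic and payload can be built and then published with any MQTT client. A minimal sketch; the friendly name is a placeholder, and the actual publish (e.g. via the mqtt npm package) is left out:

```javascript
// Build the Z2M topic/payload that triggers a genPollCtrl attribute read,
// which wakes a stuck device per the workaround above.
function buildCheckinRead(friendlyName) {
  return {
    topic: `zigbee2mqtt/${friendlyName}/1/set`,
    payload: JSON.stringify({
      read: {attributes: ['checkinInterval'], cluster: 'genPollCtrl', options: {}},
    }),
  };
}

// Placeholder device friendly name; publish with e.g. mqtt.js:
// client.publish(msg.topic, msg.payload)
const msg = buildCheckinRead('Office Window');
console.log(msg.topic, msg.payload);
```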