fairecasoimeme / ZiGate

Zigate is an Universal Zigbee Gateway
http://zigate.fr
Losing devices #38

Closed handfreezer closed 3 years ago

handfreezer commented 6 years ago

Hello,

Since I added more than about 25 Xiaomi devices (a mix of temperature, motion, button and water sensors), the pairing between the ZiGate and the devices has been lost twice for most devices. The common point on both occasions is that I removed the ZiGate USB key without shutting down my home-automation daemon and without stopping my PC first.

Is this strange effect already known?

Best regards

handfreezer commented 6 years ago

https://github.com/KiwiHC16/Abeille/issues/240

handfreezer commented 6 years ago

Based on ticket KiwiHC16/Abeille#240, it seems that a node not seen for a long time receives a "Leave" from the ZiGate on its next message. If the node ignores the "Leave" command and keeps sending data, the ZiGate then accepts the following messages. Is this "Leave" order normal ZigBee behaviour?

max5962 commented 6 years ago

I keep losing modules along the way, like you. I suspect the same root cause too. Is it possible to avoid this behaviour? Thanks :)

handfreezer commented 6 years ago

Well... not for now as far as I know, and I have had no answer from "fairecasoimeme". I'm quite surprised, as there aren't many people losing nodes, but there are more and more of them.

I hope to get some attention, but when?

intra-au commented 6 years ago

I don't have a ZiGate, as I only have different chips, but I have observed this behaviour with Xiaomi devices before. When the coordinator loses power and the information about currently joined devices is lost, the coordinator issues a leave command. In theory the device should attempt to rejoin the network using the known network key; however, it looks like Xiaomi works slightly differently. I'll try to get back after more experiments with this.

max5962 commented 6 years ago

@fairecasoimeme If @intra-au's theory (for example, after a system reboot) turns out to be true (different behaviour for some Xiaomi modules), can this be fixed with a firmware update?

fairecasoimeme commented 6 years ago

OK, I'll look into it and try to reproduce the problem. I'm getting back shortly to the development of firmware 3.0e, which I had to put on hold. I promised to release the firmware during June. I'm going through all the problems one by one and trying to fix them.

handfreezer commented 6 years ago

Hi, I have lost devices at least three times since the beginning. For one of them, I'm sure I lost the devices after a reboot of the PC (and so, I suppose, of the ZiGate too; I'm using a Z8350 small box). For the other two, I did not check whether I lost power.

fairecasoimeme commented 6 years ago

OK... and this kind of problem occurs only with Xiaomi sensors. Has anyone tried other devices, such as Philips or Osram, and noticed the bug?

Num34 commented 6 years ago

Hi, in my case I was able to switch on my OSRAM smart plug, then could not switch it off... the 8015 frame came back with no devices. So I lost the devices after sending the OSRAM ON or OFF frame. I have also already lost Xiaomi devices. (My impression is that if a device does not transmit to the ZiGate for a while, it disappears...)

Neonox31 commented 6 years ago

Hi @fairecasoimeme, first, thanks for your awesome job on ZiGate! I have exactly the same behaviour: my ZiGate loses my devices after a period of inactivity. For the moment, I only have two Xiaomi Aqara door sensors.

I hope the 3.0e firmware will be out soon and maybe fix the problem :crossed_fingers:

fairecasoimeme commented 6 years ago

Hi all, I have found some information about this issue. I confirm that this issue concerns Xiaomi devices. Indeed, Xiaomi devices speak rarely... less often than other Zigbee devices.

The Zigbee specification includes a section called End Device Aging. It states that a parent must forget about a child that doesn't check in for a certain amount of time. In other words, if the parent (like the ZiGate or another router on the network) hasn't heard from a device within a certain amount of time, it must tell the device to leave and rejoin the network. This isn't an uncommon thing to happen on a Zigbee network, and most devices handle it just fine. When the device does it right, you never even notice it was gone for a brief period.

By using a zigbee sniffer I can see all the communication between the sensor and the ZiGate. The sensor is quiet for about an hour and then sends a checkin message to the ZiGate. The timeout for end devices has elapsed so the hub’s zigbee radio doesn’t recognize the sensor anymore. Therefore per the specification it sends the sensor a leave and rejoin request. The sensor replies with a message saying it is going to leave and rejoin. Then it does leave but it does not attempt to rejoin. Instead it appears to factory reset itself. This is why it drops offline and I believe is non-compliant behavior.
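The sequence seen in the sniffer trace can be sketched as a toy model of a parent's child table with an aging timeout. This is not ZiGate firmware code: the class, the function names and the timeout values are assumptions chosen only to illustrate why a device that checks in less often than the timeout gets a leave-and-rejoin request.

```python
from dataclasses import dataclass, field


@dataclass
class ParentChildTable:
    """Toy model of a parent that ages out silent end devices."""
    timeout_s: int  # assumed aging timeout, NOT the real ZiGate value
    last_seen: dict = field(default_factory=dict)  # short address -> last check-in (s)

    def join(self, addr: int, now: int) -> None:
        self.last_seen[addr] = now

    def expire(self, now: int) -> None:
        # Forget every child that has been silent for longer than the timeout.
        self.last_seen = {a: t for a, t in self.last_seen.items()
                          if now - t <= self.timeout_s}

    def on_checkin(self, addr: int, now: int) -> str:
        self.expire(now)
        if addr not in self.last_seen:
            # The child was aged out, so the parent asks it to leave and rejoin.
            return "LEAVE_REJOIN"
        self.last_seen[addr] = now
        return "ACK"


def simulate(timeout_s: int, checkin_interval_s: int) -> str:
    """Join one device at t=0 and report the answer to its first check-in."""
    table = ParentChildTable(timeout_s)
    table.join(0x4D6C, now=0)
    return table.on_checkin(0x4D6C, now=checkin_interval_s)
```

With a check-in every 60 minutes, `simulate(30 * 60, 60 * 60)` returns `"LEAVE_REJOIN"`, while a timeout longer than the check-in interval, for example `simulate(2 * 60 * 60, 60 * 60)`, returns `"ACK"`.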

Logically, I can't change the Xiaomi device firmware, but I'll try to increase the ZiGate timeout so that Xiaomi devices do not receive the "LeaveRejoin" command, which should in turn avoid the factory reset.

I can't quickly test the benefit because the issue occurs randomly, but I'm counting on you to give feedback when version 3.0e is available ;)

Fred

KiwiHC16 commented 6 years ago

Hi Fred,

I saw exactly what you are explaining but I have a "but".

We need to check the timer before any modification. My observations are:

So extending the timer of the ZiGate will not solve the issue (or only partially). I haven't had enough time to dig into the ZiGate software to find the timer value.

Xiaomi devices send a "data request" every 50 min. When I have time, I'll check whether they all behave in the same way. Let's compare...

KiwiHC16

PS: I know that some people experience a loss even when the equipment is very close to the ZiGate, which looks like a different topic. Or they have Xiaomi equipment with very long poll timers, or ...

max5962 commented 6 years ago

Are you saying that the Xiaomi software polls its devices in order to prevent that behaviour?

KiwiHC16 commented 6 years ago

That's my feeling, but it needs to be verified. The Xiaomi gateway never "forgets" a device when it should... I'll switch on my Xiaomi gateway to see if I can find something.

max5962 commented 6 years ago

By the way, some software (Jeedom, I think) already polls Zigbee devices every 50 minutes to get battery information.

KiwiHC16 commented 6 years ago

In Jeedom you have 2 plugins: Abeille and Zigate. I don't know the Zigate plugin, but I develop the Abeille plugin. Abeille doesn't poll the equipment. It's the equipment that wakes up every 50 min and requests info (you can't see it from the ZiGate). For information such as temperature/humidity/..., I don't understand the logic that triggers the equipment to send it. I currently have 10 Xiaomi devices on a test system and I'm logging to see the behaviour of each type. Let's see the result.

KiwiHC16 commented 6 years ago

After 12 hours of monitoring, I can confirm that Xiaomi equipment sends a data request at least every hour, except one: the door sensor V1.
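For anyone who wants to repeat this kind of monitoring, here is a minimal sketch of how a check-in log could be summarised. The log format, the device names and the timestamps in `SAMPLE` are made up for illustration; they are not the data from this test.

```python
from collections import defaultdict


def max_gaps(events):
    """Return the longest silence (in minutes) per device.

    events: iterable of (device_name, timestamp_in_minutes) pairs.
    """
    times = defaultdict(list)
    for device, minute in events:
        times[device].append(minute)
    gaps = {}
    for device, stamps in times.items():
        stamps.sort()
        # Largest difference between two consecutive check-ins.
        gaps[device] = max((b - a for a, b in zip(stamps, stamps[1:])), default=0)
    return gaps


def slow_devices(events, limit_min=60):
    """Devices whose longest silence exceeds limit_min minutes."""
    return sorted(d for d, gap in max_gaps(events).items() if gap > limit_min)


# Made-up log: one sensor checking in every 50 min, one silent for 130 min.
SAMPLE = [("temp_square", 0), ("temp_square", 50), ("temp_square", 100),
          ("door_v1", 0), ("door_v1", 130)]
```

On the made-up sample, `slow_devices(SAMPLE)` returns `["door_v1"]`.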

handfreezer commented 6 years ago

Hello, I'm back: I re-registered all my lost devices on June 30th, and I have lost 4 more devices during the last 7 days. Below are a screenshot of "Sante" from the Jeedom/Abeille plugin and the result of the get device list command (devices 13 to 15 are missing).

I will register them again and take a new device list. Is it possible that, on a leave/rejoin order from the ZiGate, the ZiGate fully removes the device from its device list?

ID: 0 Short Address: 0x4D6C
ID: 1 Short Address: 0x80A4
ID: 2 Short Address: 0x4D35
ID: 3 Short Address: 0x0B69
ID: 4 Short Address: 0x12D8
ID: 5 Short Address: 0x2CFD
ID: 6 Short Address: 0x1C3A
ID: 7 Short Address: 0x49B8
ID: 8 Short Address: 0xE888
ID: 9 Short Address: 0xCE8B
ID: 10 Short Address: 0x395B
ID: 11 Short Address: 0xC0B3
ID: 12 Short Address: 0x2498
ID: 16 Short Address: 0x23B8
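The gap in the list can be checked mechanically. The following is a small sketch that parses the output pasted above and reports the missing IDs; the helper names are hypothetical, and it assumes the `ID: n Short Address: 0xXXXX` text format shown here rather than any official ZiGate output format.

```python
import re

# Device list copied from the comment above.
DEVICE_LIST = """
ID: 0 Short Address: 0x4D6C ID: 1 Short Address: 0x80A4
ID: 2 Short Address: 0x4D35 ID: 3 Short Address: 0x0B69
ID: 4 Short Address: 0x12D8 ID: 5 Short Address: 0x2CFD
ID: 6 Short Address: 0x1C3A ID: 7 Short Address: 0x49B8
ID: 8 Short Address: 0xE888 ID: 9 Short Address: 0xCE8B
ID: 10 Short Address: 0x395B ID: 11 Short Address: 0xC0B3
ID: 12 Short Address: 0x2498 ID: 16 Short Address: 0x23B8
"""

ENTRY = re.compile(r"ID:\s*(\d+)\s+Short Address:\s*(0x[0-9A-Fa-f]{4})")


def parse_device_list(text):
    """Map each device ID to its short address (as an int)."""
    return {int(dev_id): int(addr, 16) for dev_id, addr in ENTRY.findall(text)}


def missing_ids(devices):
    """IDs absent between 0 and the highest ID present."""
    if not devices:
        return []
    return [i for i in range(max(devices) + 1) if i not in devices]


devices = parse_device_list(DEVICE_LIST)
print(len(devices), missing_ids(devices))  # 14 [13, 14, 15]
```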

image

handfreezer commented 6 years ago

I'm back for the second time today... strange, but I have found a new point: I registered a lost device, and it registered as 0xcfb8. I unplugged my ZiGate from the Jeedom box, plugged it into my desktop and got the device list: 0xcfb8 was not in the list. I plugged the ZiGate back into my Jeedom box and pressed the button on 0xcfb8 (Xiaomi temperature sensor 2, the square one); the blue LED blinked. I did it a few times. I unplugged the ZiGate again, plugged it back into my desktop, and now I have a 13th device in getdevicelist with short address 0xcfb8.

Sounds normal?

handfreezer commented 6 years ago

Note: on my test environment (including more than 20 devices) I have had no losses so far. As I have two ZiGates, I'm wondering whether the one in real use has a faulty memory. That could explain why I have problems getting the device list from memory, no?

Neonox31 commented 6 years ago

Hello,

Any news on this issue? I am still losing my Xiaomi door sensors when they have no activity for approximately two days, which is really annoying for me. Do you have an idea when the ZiGate code will be open sourced, so that others can help with a patch?

Thanks a lot.

handfreezer commented 6 years ago

It should have been opened in June 2018 with version 3.0e, but there has been no news since then. Hope Akila is doing well.

KiwiHC16 commented 6 years ago

No worries, he is doing well but has a lot to do. We did some tests offline and things are progressing.

handfreezer commented 6 years ago

After 4 weeks without problems, I lost 4 Xiaomi devices. When I press the join button on the devices, they try to rejoin, but unless inclusion mode has been started, it fails (normal?).

Would you consider an 'auto-accept' mode for already-paired devices when a rejoin is initiated?

doudz commented 6 years ago

For information, I had no problem for months with 10 Xiaomi devices, but now I have:

max5962 commented 6 years ago

Same issues as @doudz. Thanks KiwiHC16 for the ETA :) If Akila needs help... we could help :)

doudz commented 6 years ago

A new sensor left today. The common points with the other sensors are:

Leaving seems to happen when a device has failed to send its battery level for 1 day (or more).

Edit: the device missing from the ZiGate device list is also far away from the ZiGate; its last RSSI is 69. But it is still sending values.

pipiche38 commented 6 years ago

Side comment: I received a Xiaomi Aqara cube 2 days ago and I see a strange behaviour. Every time I pair it with the ZiGate again after removing it from Domoticz, it comes back with a new short address (which is expected), but the IEEE address seems to change as well!

I'll investigate further next week, but it might be worth checking on your side whether you see the same behaviour :-(

handfreezer commented 6 years ago

Hello,

This week, I again lost 4 devices (2 Xiaomi square temperature sensors and 2 Xiaomi round temperature sensors). As I read that the next version of ZiGate will contain a "patch" for the leave of Xiaomi devices, is it possible to make a 3.0e-RC1 release candidate or a 3.0e-alpha01 version of the next firmware available?

Note: these are not the same sensors as 28 days ago.

Best regards.

tornlouses commented 6 years ago

Hello, I also have the same problem. Since firmware 3.0d, I see the problem once I have more than 16 Xiaomi devices (I only have Xiaomi devices: price!). For me the problem is with the devices:

I had roughly 1 month without problems with the first 16 devices paired (8 temperature, 4 motion, 4 flooding).

I hope V3.0e will change this :) Fred, will it be necessary to erase the EEPROM or PDM with the new firmware?

handfreezer commented 5 years ago

Lost 2 devices. Thanks to Abeille in Jeedom for resyncing the short address based on the IEEE address.

pipiche38 commented 5 years ago

@handfreezer what do you mean ,

to resync short address based on IEEE address.

ricky074 commented 5 years ago

Hello, same problem. I often lose my Xiaomi switch... A little bit annoying...

KiwiHC16 commented 5 years ago

@handfreezer what do you mean ,

to resync short address based on IEEE address.

He means that Jeedom/Abeille is now able to recognise the equipment and makes the right adaptation in Jeedom when a device changes its address. As a result, it's transparent to the user.
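The idea can be sketched as a registry keyed by the IEEE address, which stays fixed, with the short address stored as a value that may be overwritten. This is an illustrative sketch only, not the Abeille implementation; the class name, the return values and the IEEE address in the usage note are hypothetical.

```python
class DeviceRegistry:
    """Track devices by IEEE address and follow short-address changes."""

    def __init__(self):
        self._short_by_ieee = {}  # IEEE address (str) -> short address (int)

    def on_announce(self, ieee: str, short_addr: int) -> str:
        """Record an announcement and report what changed."""
        previous = self._short_by_ieee.get(ieee)
        self._short_by_ieee[ieee] = short_addr
        if previous is None:
            return "new"
        if previous != short_addr:
            # Same physical device, new short address: update silently.
            return "updated"
        return "unchanged"

    def short_address(self, ieee: str) -> int:
        return self._short_by_ieee[ieee]
```

For example, announcing a made-up IEEE address `"00158d0001234567"` first with short address `0x2498` and later with `0xCFB8` returns `"new"` and then `"updated"`, and `short_address` then gives `0xCFB8` without the user having to re-create the device.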

KiwiHC16 commented 5 years ago

Hello, same problem. I often lose my Xiaomi switch... A little bit annoying...

Are you using Jeedom/Abeille? If yes, could you run a Network Graph to get some technical info, so we can try to understand what's going on?

ricky074 commented 5 years ago

I use Domoticz...

pipiche38 commented 5 years ago

He means that Jeedom/Abeille is now able to recognise the equipment and makes the right adaptation in Jeedom when a device changes its address. As a result, it's transparent to the user.

@KiwiHC16 Do you mean that the device changes its short address (even when not in pairing mode)? I was under the impression that it occurs only when pairing.

pipiche38 commented 5 years ago

@ricky074 in the Domoticz Zigate plugin, on the 'dev' branch, we now do this adaptation as well.

KiwiHC16 commented 5 years ago

He means that Jeedom/Abeille is now able to recognise the equipment and makes the right adaptation in Jeedom when a device changes its address. As a result, it's transparent to the user.

@KiwiHC16 Do you mean that the device changes its short address (even when not in pairing mode)? I was under the impression that it occurs only when pairing.

I'm nice with competitors, just joking: the short address can change at any time. There are many reasons for it.

pipiche38 commented 5 years ago

fair enough!

pipiche38 commented 5 years ago

@max5962

By the way, some software (Jeedom, I think) already polls Zigbee devices every 50 minutes to get battery information.

Could you let us know how the polling is done? Is it based on sending a command to the device? Because in that case, don't we risk draining the battery?

pipiche38 commented 5 years ago

Any update on when 3.0e will be released?

ricky074 commented 5 years ago

I gathered statistics on communication with Xiaomi devices (door sensors). I confirm that I regularly have communication blackouts lasting many hours (I checked the battery-level transmissions from the device). I see this problem only on distant sensors with low RSSI (around 50).

Today, I can't be confident with those devices. It's a little bit annoying... I can run tests if you need... Hope 3.0e will solve that...

pipiche38 commented 5 years ago

@KiwiHC16 I'm nice with competitors, just joking: the short address can change at any time. There are many reasons for it.

I don't see competition here. Nothing to prove, nothing to win...

Gabvoir commented 5 years ago

Hello, after updating to 3.0e, I am still losing Xiaomi devices (2 Aqara temperature sensors). My ZiGate is plugged into an eedomus.

daoney29 commented 5 years ago

Hello,

My ZiGate is on version 3.0e and is plugged into an eedomus+. I have various Aqara Xiaomi devices (temperature v2, door/window and flood sensors, etc.). I am in the process of installing the Xiaomi sensors and, personally, I have had no inclusion problem with any sensor; they have all worked without problems in normal conditions for 10 days.

However, on 2 occasions, after a power cut on my eedomus box (and therefore on the ZiGate too), some Aqara devices (temperature, door/window) stopped communicating with the box, but not all of them:

Importantly, after both power cuts, EXACTLY the same devices were down and the same ones were OK.

Does this ring a bell for anyone?

ricky074 commented 5 years ago

Yes, I have exactly the same problem. I'm on a Raspberry Pi.

KiwiHC16 commented 5 years ago

@daoney29 & @ricky074, do you have mains-powered (220 V) devices acting as routers in your Zigbee network, for example bulbs, plugs, ...?

daoney29 commented 5 years ago

Hi KiwiHC16, I do have Hue bulbs on the ZiGate's Zigbee network. They are not affected by power cuts on the eedomus box and keep working normally afterwards. My Legrand with Netatmo Zigbee devices are not on the same network; they have stayed in the Legrand ecosystem.