webdeck / homebridge-indigo

Homebridge Plugin for Indigo
Apache License 2.0

All Accessories Lost Upon Homebridge Restart & Indigo Connection Error #22

Open KardelSniper opened 7 years ago

KardelSniper commented 7 years ago

My homebridge randomly restarted, and on restart its log showed an Indigo connection error: ECONNREFUSED. This doesn't usually happen, but when it does, homebridge logs "Created 0 accessories", so HomeKit deletes everything that came from homebridge, and in turn all room configurations etc. get lost.

I suggest that the function sending the "devices.json" request should, upon asyncError, not only log the error but also try to re-send the request, and/or (upon repeated failure) restart homebridge or homebridge-indigo.
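
A minimal sketch of that retry idea, assuming the request is wrapped in a Node-style callback function. `retryWithDelay` is a made-up helper name, not something from homebridge-indigo, and the attempt count and delay are placeholders to tune:

```javascript
// Hypothetical helper (not part of homebridge-indigo): retry a Node-style
// callback operation a bounded number of times, waiting between attempts.
function retryWithDelay(operation, maxAttempts, delayMs, done) {
    var attempt = 0;

    function tryOnce() {
        attempt += 1;
        operation(function (error, result) {
            if (!error) {
                return done(null, result);
            }
            if (attempt >= maxAttempts) {
                // Out of attempts: hand the last error to the caller,
                // which can then decide to restart or to keep the old state.
                return done(error);
            }
            setTimeout(tryOnce, delayMs);
        });
    }

    tryOnce();
}
```

Calling it as `retryWithDelay(fetchDevices, 5, 10000, callback)`, where `fetchDevices` stands in for whatever function issues the "devices.json" request, would try up to five times with ten seconds between attempts before reporting the error.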

KardelSniper commented 7 years ago

This is what I just tried putting in the async final callback function. Not tested yet. It simply executes an sh file to restart homebridge upon asyncError. I also moved the code that runs when there's no asyncError into an else branch, thinking that if this part doesn't run when there's an error, HomeKit won't go and delete the accessories.

function (asyncError) {
            if (asyncError) {
                this.log(asyncError);
                this.log("Restarting homebridge");
                var exec = require('child_process').exec;
                var cmd = 'sh THIS_FILE_USES_LAUNCHCTL_TO_RESTART_HOMEBRIDGE';
                // Bind the exec callback too, otherwise this.log is undefined inside it.
                exec(cmd, function(error, stdout, stderr){
                    if(error)
                    {
                        this.log("Restart homebridge failed: " + stderr);
                        process.exit(999);
                    }
                    else
                    {
                        this.log("Restart homebridge: " + stdout);
                    }
                }.bind(this));
            }
            else
            {
                if (this.foundAccessories.length > 99) {
                    this.log("*** WARNING *** you have %s accessories.",
                             this.foundAccessories.length);
                    this.log("*** Limiting to the first 99 discovered. ***");
                    this.log("*** See README.md for how to filter your list. ***");
                    this.foundAccessories = this.foundAccessories.slice(0, 99);
                }

                this.log("Created %s accessories", this.foundAccessories.length);
                callback(this.foundAccessories.sort(
                    function (a, b) {
                        return (a.name > b.name) - (a.name < b.name);
                    }
                ));
            }

        }.bind(this)

webdeck commented 7 years ago

The better solution is that I need to update to version 2.0 of the plugin API that supports cached devices that are not reachable.

KardelSniper commented 7 years ago

That'll be really awesome! Looking forward to seeing the update!

dscottbuch commented 7 years ago

I'm having a related, but different issue. I've got things configured so homebridge is running under launchctl, as is Indigo server. Everything works great: it restarts on reboot, keeps homebridge up and running, etc. Through any reboot the HomeKit database is maintained and scenes and room associations are fine, BUT if I do a system OS upgrade then all the associated data goes missing: room assignments, scenes, and the accessories are no longer associated with automations.

I've verified that the UUIDs assigned by Homebridge are properly maintained, they don't change, BUT the uniqueIdentifiers in HomeKit do change when this happens. Weirdly, not for every accessory but for most.

I can't find decent documentation as to how these are assigned and maintained by HomeKit, so I'm looking for help in that area. I know that the uniqueIdentifiers differ from iOS device to iOS device, so I'm caching them on each device for comparison, which is how I know they're changing.

Thanks in advance for any help. Scott

webdeck commented 7 years ago

When you say do a "system OS upgrade" do you mean on your iOS device, on the Indigo server, or on the server running homebridge?

You may have better luck asking nfarina directly on the homebridge repository: https://github.com/nfarina/homebridge

dscottbuch commented 7 years ago

macOS upgrade of the server running indigo and homebridge.

webdeck commented 7 years ago

Did the .homebridge/accessories and .homebridge/persist directories get deleted in the upgrade? That's the only reason I could think of a macOS upgrade causing a problem.

dscottbuch commented 7 years ago

Hmmm. I don't think so because the UUIDs of each device assigned by homebridge didn't change.

webdeck commented 7 years ago

The UUIDs are generated based on the device's name, so they wouldn't change regardless.

dscottbuch commented 7 years ago

Looking at the code I thought that was only true if it was a previously used device name (which seems to be the Indigo device ID). I thought those UUIDs for previously seen names were stored in the persist directory.

In any case the two files in the persist directory seem to change file dates often, so it's hard to track what they are doing during an OS update. Note that I have the .homebridge directory in /etc/Homebridge as I run this from launchctl as root. Other than the issue above it works great. If I have a problem I can restart homebridge from Indigo using a 'killall homebridge' script in an action group.

In the AccessoryInfo.CD223DE3CE30.json file there are a number of entries. Is it one of these that changes and causes HomeKit to think everything is new again? Note I don't lose the devices themselves, they are all there, just any associations I've created in the Home app such as room assignments, scenes, and associations with automations. The automations are still there; they just no longer have any device associated with them and are therefore disabled.

{"displayName":"Buckhaven","category":2,"pincode":"031-55-155","signSk":"e7501f358c3319c14d66cc4dfd71d6f004365287f724237ce31ca1af8ab3ece3fc00b1962e10535dd4adf6b249b6703cd4149cd912943d026d40b6d75fc2405b","signPk":"fc00b1962e10535dd4adf6b249b6703cd4149cd912943d026d40b6d75fc2405b","pairedClients":{"363BCE5D-1C57-424D-A69E-D3DC6C9B72A1":"cb69735ba18c014338e65f522dd1ddd351f0c6c2ce2b78603a88a693616ef895","0C261999-146A-459B-BF0E-DCDAA988F08D":"a50254f35a9df31984d6e0569d4101a6240520271609209bc3726dd6a23f65af"},"configVersion":26,"configHash":"f57f07b84641c194e9cf8f1f6814d59aebc7ef57","relayEnabled":false,"relayState":2,"relayAccessoryID":"","relayAdminID":"","relayPairedControllers":{},"accessoryBagURL":""}

webdeck commented 7 years ago

Sorry, I misspoke - the UUIDs are generated based on the Indigo Device ID, not the device name, and they are constant as long as the Indigo Device ID doesn't change.
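
To illustrate why an ID-derived UUID stays constant across restarts and upgrades, here is a rough sketch that hashes an ID string into a UUID-shaped value. `stableUuid` is a hypothetical helper written for this example; it is not necessarily how homebridge-indigo or HAP-NodeJS computes UUIDs, and the result is only UUID-shaped (the version and variant bits are not set):

```javascript
var crypto = require('crypto');

// Hypothetical helper: derive a stable, UUID-shaped string from an input
// such as an Indigo Device ID. The same input always gives the same output,
// independent of anything stored on disk.
function stableUuid(input) {
    var hex = crypto.createHash('sha1').update(String(input)).digest('hex');
    // Use the first 32 hex digits of the hash in the 8-4-4-4-12 layout.
    return [
        hex.slice(0, 8),
        hex.slice(8, 12),
        hex.slice(12, 16),
        hex.slice(16, 20),
        hex.slice(20, 32)
    ].join('-');
}
```

Because nothing here depends on persisted state, deleting or restoring the persist directory would not change a value derived this way.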

I can't speak to what's going on with the internals of homebridge and its persistence, which is why I pointed you to the main homebridge repository to ask your question.

Are you sure this wasn't another case of homebridge starting up before the Indigo RESTful API was available? The simplest solution is to delay your startup by either sleeping, or doing curls against the Indigo RESTful API until it's successful (which is what I do with my install).

dscottbuch commented 7 years ago

Thanks. Re homebridge startup: it does start on boot, but then Indigo kills it 2 minutes after Indigo starts, so it re-establishes the connection after the API is available. This way I don't have any scripts, etc., just Indigo. Both Indigo Server and homebridge are started and maintained by launchctl this way, and all seems to work well on reboot.

I'll go over to homebridge and see what I can find.

webdeck commented 7 years ago

Another option is to put the launchctl plist outside of LaunchDaemons and have an Indigo script invoke launchctl load to start it up - that way you don't need to do the kill...

MattTimmons commented 7 years ago

I added a sleep to the beginning of the homebridge startup script. It seems to help, but not always.

var sleep = require('sleep'); sleep.sleep(30)

Colorado4Wheeler commented 7 years ago

I don't think there is a lot you can do about this. I battled this issue for a while and lost my configuration many times, and I believe the problem happens if you try to access Homebridge while it's down. The only confirmation I have is that since coming to believe that, I go nowhere NEAR any HomeKit access while HB is restarting, and I have yet to lose my config since; before that I lost my config at least 15 times. That's over 6 months ago now.

dscottbuch commented 7 years ago

I may have found an underlying reason for this happening. On upgrading to iOS 11 on my iPad I found that one device was consistently being removed from the home. I would add it back and then a short while later it would be gone. I was adding it back on iOS 10.

dscottbuch commented 7 years ago

Sorry about that, I hit comment before finishing. Then I noticed that in iOS 11 that device appeared as an accessory, a home, and it was marked as not allowed. It seems that the name is a reserved name. I had named my front door "Entry". Since changing the name to "Front Entry" everything seems to be working. It hasn't been that long, so I'll let you know if it fails again.

dscottbuch commented 7 years ago

Well, I just updated to the latest Sierra Beta and, once again, all the room and automation connections vanished. Does anyone have a hint as to where in the code I can/should look? Very frustrating.

dscottbuch commented 7 years ago

While this is probably a homebridge problem, I'm posting here because I'm getting no response over there at the moment. I'm just looking for some hint as to where to look, as I'm still not entirely clear on the path through the code for accessory creation and identification. This concerns losing the HomeKit identities of accessories on updating OS X (or macOS now). Reboots work fine but system updates create all sorts of havoc. All accessories are still shown but their HomeKit IDs have changed, so that Automations and Rooms and Scene associations are lost. Below is some of the persistence data I recorded over the update, but I don't know where the changes in the results occurred.

The other clue is that if I put the old persist directory in, from a backup, after stopping homebridge and then restarted, the accessory names were there but they controlled other Indigo devices. The mapping from the name displayed in HomeKit to Indigo was scrambled.

Just went through an update to the latest 12.6 beta - 10.12.6 Beta (16G23a) and had the problem. I've compared the UUIDs assigned by homebridge-indigo before and after - they are identical. I've compared the Accessories and Identities persist files before and after and they are NOT QUITE identical.

In the Accessories there are two differences.

<     [configVersion] => 13
<     [configHash] => 4a142ddf95a2f0b329242e491d24f8a94e1f9089
---
>     [configVersion] => 11
>     [configHash] => 6a2cb3b957ed58976cd3e369eca5e785a4308b09

where left is after and right is before.

In the Identities, the value (not the key) of every 'Top Level' key-value pair for each accessory has changed.

15,16c15,16
<             [|nextAID] => 31
<             [148089d9-7298-478c-8149-bcf4ff55ef65] => 2
---
>             [|nextAID] => 33
>             [148089d9-7298-478c-8149-bcf4ff55ef65] => 30
31c31
<             [e5f476ab-ed57-49e0-9d03-a71a5611d764] => 3
---
>             [e5f476ab-ed57-49e0-9d03-a71a5611d764] => 29
47c47
<             [ca3ba660-f9b5-4f2d-b6e9-5169a039d2b1] => 4
---
>             [ca3ba660-f9b5-4f2d-b6e9-5169a039d2b1] => 2
62c62
<             [6ce53d5e-1388-42d0-a05d-99f1ec3012ab] => 5
---
>             [6ce53d5e-1388-42d0-a05d-99f1ec3012ab] => 3
77c77
<             [c1665f28-b87f-4283-9079-01fefc18b879] => 6
---
>             [c1665f28-b87f-4283-9079-01fefc18b879] => 4
92c92
<             [73662ca6-8974-4763-a93e-b6c727f8d9ba] => 7
---
>             [73662ca6-8974-4763-a93e-b6c727f8d9ba] => 5
107c107
<             [d907785c-3a00-4811-ab9a-093f1073bb33] => 8
---
>             [d907785c-3a00-4811-ab9a-093f1073bb33] => 6
122c122
<             [5ce3b746-a1a8-4893-9b50-c280f776a5bc] => 9
---
>             [5ce3b746-a1a8-4893-9b50-c280f776a5bc] => 31
137c137
<             [b3286ca0-a804-475c-b7d8-c8f7cf170d33] => 10
---
>             [b3286ca0-a804-475c-b7d8-c8f7cf170d33] => 8
152c152
<             [78cdde06-23de-4f1d-bba0-cfcc13acd35b] => 11
---
>             [78cdde06-23de-4f1d-bba0-cfcc13acd35b] => 28
168c168
<             [f0373057-c86b-4406-a85a-f25ce252468d] => 12
---
>             [f0373057-c86b-4406-a85a-f25ce252468d] => 9
183c183
<             [45c45507-e87f-4a9d-8d15-841ba2dc537d] => 13
---
>             [45c45507-e87f-4a9d-8d15-841ba2dc537d] => 10
199c199
<             [06529022-4b63-48af-a3e4-feee4d8a18d4] => 14
---
>             [06529022-4b63-48af-a3e4-feee4d8a18d4] => 11
214c214
<             [13d41697-9e0e-4b8a-9061-9074b69fb8ce] => 15
---
>             [13d41697-9e0e-4b8a-9061-9074b69fb8ce] => 12
230c230
<             [9e968f7b-ef26-491d-a5b9-09368244e8f9] => 16
---
>             [9e968f7b-ef26-491d-a5b9-09368244e8f9] => 13
246c246
<             [f0f66ff0-2a91-4f3f-9391-3fdc1b43e90e] => 17
---
>             [f0f66ff0-2a91-4f3f-9391-3fdc1b43e90e] => 14
262c262
<             [b96118cf-283e-43fd-8b98-48dbb2bddb22] => 18
---
>             [b96118cf-283e-43fd-8b98-48dbb2bddb22] => 15
278c278
<             [8f602650-9fdb-4b98-8eae-01a8f1f64834] => 19
---
>             [8f602650-9fdb-4b98-8eae-01a8f1f64834] => 16
293c293
<             [7a3e4ebd-55c0-4848-a3bd-78652b6c53ce] => 20
---
>             [7a3e4ebd-55c0-4848-a3bd-78652b6c53ce] => 17
308c308
<             [24bed4cb-a7d1-42e7-a179-4d2144c17804] => 21
---
>             [24bed4cb-a7d1-42e7-a179-4d2144c17804] => 18
323c323
<             [84a3cacd-6de6-44bb-9935-ad7483360e5e] => 22
---
>             [84a3cacd-6de6-44bb-9935-ad7483360e5e] => 19
338c338
<             [ca09fb5a-4663-4683-9188-08002fcffc49] => 23
---
>             [ca09fb5a-4663-4683-9188-08002fcffc49] => 20
353c353
<             [ca87e13e-2649-401a-ae0b-beb631c57b8e] => 24
---
>             [ca87e13e-2649-401a-ae0b-beb631c57b8e] => 21
368c368
<             [dd49ad44-76ba-452f-9171-6a4d91913069] => 25
---
>             [dd49ad44-76ba-452f-9171-6a4d91913069] => 22
383c383
<             [a615b603-122b-4c1e-b3c7-087113aa1663] => 26
---
>             [a615b603-122b-4c1e-b3c7-087113aa1663] => 23
398c398
<             [8a877522-f045-46f5-aecd-6800db395f03] => 27
---
>             [8a877522-f045-46f5-aecd-6800db395f03] => 24
413c413
<             [e27fc1cf-7eea-41f8-9cb1-b3f43d22114e] => 28
---
>             [e27fc1cf-7eea-41f8-9cb1-b3f43d22114e] => 25
428c428
<             [c9d17b45-91eb-4998-ac89-19a55009d060] => 29
---
>             [c9d17b45-91eb-4998-ac89-19a55009d060] => 26
443c443
<             [923a3ad3-62aa-4d09-b129-11dd695b9d3c] => 30
---
>             [923a3ad3-62aa-4d09-b129-11dd695b9d3c] => 27
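
A comparison like the one above could also be scripted. The sketch below assumes the two snapshots have already been loaded into plain objects mapping UUID to accessory ID; that shape and the name `changedAids` are assumptions made for illustration, not the actual layout of the homebridge persist files:

```javascript
// Hypothetical helper: list the UUIDs whose accessory ID (AID) differs
// between two snapshots. Both arguments are assumed to be plain objects
// mapping UUID -> AID.
function changedAids(before, after) {
    return Object.keys(before)
        .filter(function (uuid) {
            // Only compare UUIDs present in both snapshots.
            return Object.prototype.hasOwnProperty.call(after, uuid) &&
                before[uuid] !== after[uuid];
        })
        .map(function (uuid) {
            return { uuid: uuid, before: before[uuid], after: after[uuid] };
        });
}
```

For the first entry in the diff, this would report UUID 148089d9-7298-478c-8149-bcf4ff55ef65 with a before value of 30 and an after value of 2.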

I'm working my way through the code but again, any help would be greatly appreciated.

webdeck commented 7 years ago

I am hoping this PR on HAP-NodeJS addresses the underlying issue: https://github.com/KhaosT/HAP-NodeJS/pull/491

Also, for reference, the issue reported in homebridge project is: https://github.com/nfarina/homebridge/issues/1295